LLM custom ops tutorial should direct to general custom ops #10139

Merged
merged 1 commit on Apr 14, 2025

20 changes: 20 additions & 0 deletions docs/source/kernel-library-custom-aten-kernel.md
@@ -299,6 +299,26 @@ torch.ops.load_library("libcustom_linear.so/dylib")
op = torch.ops.myop.custom_linear.default
```

#### Using a Custom Operator in a Model

The custom operator can be used explicitly in the PyTorch model, or you can write a transformation to replace instances of a core operator with the custom variant. For this example, you could find all instances of `torch.nn.Linear` and replace them with `CustomLinear`.

```python
import torch.nn as nn

def replace_linear_with_custom_linear(module):
    # Recursively swap every nn.Linear in the module tree for CustomLinear,
    # the user-defined module that calls the custom operator.
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(
                module,
                name,
                CustomLinear(child.in_features, child.out_features, child.bias),
            )
        else:
            replace_linear_with_custom_linear(child)
```

The remaining steps are the same as the normal flow. You can now run this module in eager mode as well as export it to ExecuTorch, as sketched below.
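
For illustration, here is a minimal sketch of that export flow. The `TinyModel` module, tensor shapes, and output file name are assumptions made for this example, and `CustomLinear` is the user-defined wrapper module described above.

```python
import torch
import torch.nn as nn
from executorch.exir import to_edge

class TinyModel(nn.Module):  # hypothetical example model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 8)

    def forward(self, x):
        return self.fc(x)

model = TinyModel().eval()
replace_linear_with_custom_linear(model)  # swap nn.Linear for CustomLinear

# Standard ExecuTorch ahead-of-time flow: export, lower to Edge, serialize.
example_inputs = (torch.randn(1, 16),)
exported = torch.export.export(model, example_inputs)
et_program = to_edge(exported).to_executorch()

with open("custom_linear_model.pte", "wb") as f:
    f.write(et_program.buffer)
```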

### Custom Ops API Best Practices

Given that there are two kernel registration APIs for custom ops, which one should you use? Here are some pros and cons for each API:
92 changes: 3 additions & 89 deletions docs/source/llm/getting-started.md
@@ -855,99 +855,13 @@ With the ExecuTorch custom operator APIs, custom operator and kernel authors can

There are three steps to use custom kernels in ExecuTorch:

1. Write the custom kernel using ExecuTorch types.
2. Compile and link the custom kernel to both the AOT Python environment and the runtime binary.
3. Source-to-source transformation to swap an operator with a custom op.

### Writing a Custom Kernel

Define your custom operator schema for both the functional variant (used in AOT compilation) and the out variant (used in the ExecuTorch runtime). The schema needs to follow the PyTorch ATen convention (see [native_functions.yaml](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml)).

```
custom_linear(Tensor weight, Tensor input, Tensor? bias) -> Tensor

custom_linear.out(Tensor weight, Tensor input, Tensor? bias, *, Tensor(a!) out) -> Tensor(a!)
```

Write your custom kernel according to the schema defined above. Use the `EXECUTORCH_LIBRARY` macro to make the kernel available to the ExecuTorch runtime.

```cpp
// custom_linear.h / custom_linear.cpp
#include <executorch/runtime/kernel/kernel_includes.h>

Tensor& custom_linear_out(const Tensor& weight, const Tensor& input, optional<Tensor> bias, Tensor& out) {
  // calculation
  return out;
}

// Register as myop::custom_linear.out
EXECUTORCH_LIBRARY(myop, "custom_linear.out", custom_linear_out);
```

To make this operator available in PyTorch, you can define a wrapper around the ExecuTorch custom kernel. Note that the ExecuTorch
implementation uses ExecuTorch tensor types, while the PyTorch wrapper uses ATen tensors.

```cpp
// custom_linear_pytorch.cpp

#include "custom_linear.h"
#include <torch/library.h>

at::Tensor custom_linear(const at::Tensor& weight, const at::Tensor& input, std::optional<at::Tensor> bias) {
  // initialize out
  at::Tensor out = at::empty({weight.size(1), input.size(1)});

  // wrap kernel in custom_linear.cpp into ATen kernel
  WRAP_TO_ATEN(custom_linear_out, 3)(weight, input, bias, out);

  return out;
}

// Register the operator with PyTorch.
TORCH_LIBRARY(myop, m) {
  m.def("custom_linear(Tensor weight, Tensor input, Tensor? bias) -> Tensor", custom_linear);
  m.def("custom_linear.out(Tensor weight, Tensor input, Tensor? bias, *, Tensor(a!) out) -> Tensor(a!)", WRAP_TO_ATEN(custom_linear_out, 3));
}
```

### Compile and Link the Custom Kernel

To make it available to the ExecuTorch runtime, compile custom_linear.h/cpp into the binary target. You can also build the kernel as a dynamically loaded library (.so or .dylib) and link it as well.

To make it available to PyTorch, package custom_linear.h, custom_linear.cpp and custom_linear_pytorch.cpp into a dynamically loaded library (.so or .dylib) and load it into the python environment.
This is needed to make PyTorch aware of the custom operator at the time of export.

```python
import torch
torch.ops.load_library("libcustom_linear.so")
```
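
As an alternative to packaging the shared library with your own build system, one option is to JIT-compile and load the sources with `torch.utils.cpp_extension.load`. This is only a sketch; the ExecuTorch include and library paths below are placeholders, not actual install locations.

```python
import torch.utils.cpp_extension

# Compile the custom op sources into a shared library and load it into the
# current process (no Python module is created). Paths are assumptions.
torch.utils.cpp_extension.load(
    name="custom_linear",
    sources=["custom_linear.cpp", "custom_linear_pytorch.cpp"],
    extra_include_paths=["/path/to/executorch/include"],  # assumption
    extra_ldflags=["-L/path/to/executorch/lib"],          # assumption
    is_python_module=False,
    verbose=True,
)
```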

Once loaded, you can use the custom operator in PyTorch code.
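
For example, a quick eager-mode check of the registered op could look like the following; the tensor shapes are illustrative and the optional bias is passed as `None`.

```python
import torch

torch.ops.load_library("libcustom_linear.so")

weight = torch.randn(8, 16)  # illustrative shapes for the example
x = torch.randn(16, 4)
out = torch.ops.myop.custom_linear(weight, x, None)
print(out.shape)
```
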
1. [Write the custom kernel](../kernel-library-custom-aten-kernel.md#c-api-for-custom-ops) using ExecuTorch types.
2. [Compile and link the custom kernel](../kernel-library-custom-aten-kernel.md#compile-and-link-the-custom-kernel) to both the AOT Python environment and the runtime binary.
3. [Source-to-source transformation](../kernel-library-custom-aten-kernel.md#using-a-custom-operator-in-a-model) to swap an operator with a custom op.

For more information, see [PyTorch Custom Operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html) and [ExecuTorch Kernel Registration](../kernel-library-custom-aten-kernel.md).

### Using a Custom Operator in a Model

The custom operator can be used explicitly in the PyTorch model, or you can write a transformation to replace instances of a core operator with the custom variant. For this example, you could find all instances of `torch.nn.Linear` and replace them with `CustomLinear`.

```python
def replace_linear_with_custom_linear(module):
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(
                module,
                name,
                CustomLinear(child.in_features, child.out_features, child.bias),
            )
        else:
            replace_linear_with_custom_linear(child)
```

The remaining steps are the same as the normal flow. Now you can run this module in eager mode as well as export it to ExecuTorch.

## How to Build Mobile Apps
See the instructions for building and running LLMs using ExecuTorch on iOS and Android.
