
Commit d711ec1

LLM custom ops tutorial should direct to general custom ops (#10139)
Instead of having two pages that use the same custom-op example, consolidate into one. Also move the "Using a Custom Operator in a Model" (source-to-source transformation) section to kernel-library-custom-aten-kernel.md.
1 parent d28670b commit d711ec1

File tree

2 files changed: +23 −89 lines

docs/source/kernel-library-custom-aten-kernel.md (+20)

@@ -299,6 +299,26 @@ torch.ops.load_library("libcustom_linear.so/dylib")
 op = torch.ops.myop.custom_linear.default
 ```
+
+#### Using a Custom Operator in a Model
+
+The custom operator can be used explicitly in the PyTorch model, or you can write a transformation to replace instances of a core operator with the custom variant. For this example, you could find all instances of `torch.nn.Linear` and replace them with `CustomLinear`.
+
+```python
+def replace_linear_with_custom_linear(module):
+    for name, child in module.named_children():
+        if isinstance(child, nn.Linear):
+            setattr(
+                module,
+                name,
+                CustomLinear(child.in_features, child.out_features, child.bias),
+            )
+        else:
+            replace_linear_with_custom_linear(child)
+```
+
+The remaining steps are the same as the normal flow. Now you can run this module in eager mode as well as export it to ExecuTorch.
 
 ### Custom Ops API Best Practices
 
 Given that we have 2 kernel registration APIs for custom ops, which API should we use? Here are some pros and cons for each API:
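The recursive swap pattern added in the hunk above can be exercised without PyTorch installed. The following is a minimal sketch using hypothetical stand-in `Module`, `Linear`, and `CustomLinear` classes (not `torch.nn`) that mimic `named_children`; only the traversal-and-replace logic mirrors the `replace_linear_with_custom_linear` transformation in the diff.

```python
# Minimal sketch of the recursive operator-swap pattern, using hypothetical
# stand-in classes (not torch.nn) so it runs without PyTorch installed.

class Linear:
    """Stand-in for torch.nn.Linear: records its constructor arguments."""
    def __init__(self, in_features, out_features, bias=True):
        self.in_features = in_features
        self.out_features = out_features
        self.bias = bias
        self._children = {}

    def named_children(self):
        return self._children.items()


class CustomLinear(Linear):
    """Stand-in for the custom variant that replaces Linear."""


class Module:
    """Stand-in container exposing named_children like nn.Module."""
    def __init__(self, **children):
        self._children = dict(children)
        for name, child in children.items():
            setattr(self, name, child)

    def named_children(self):
        return self._children.items()


def replace_linear_with_custom_linear(module):
    # Walk the module tree and swap each Linear for a CustomLinear in place.
    for name, child in list(module.named_children()):
        if isinstance(child, Linear) and not isinstance(child, CustomLinear):
            replacement = CustomLinear(
                child.in_features, child.out_features, child.bias
            )
            module._children[name] = replacement
            setattr(module, name, replacement)
        else:
            replace_linear_with_custom_linear(child)


model = Module(fc1=Linear(4, 8), inner=Module(fc2=Linear(8, 2)))
replace_linear_with_custom_linear(model)
print(type(model.fc1).__name__, type(model.inner.fc2).__name__)
# prints: CustomLinear CustomLinear
```

With real `torch.nn` modules, the same traversal applies, except `nn.Module.__setattr__` keeps the internal child registry in sync, so the `setattr` call alone suffices, as in the documented snippet.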

docs/source/llm/getting-started.md (+3, −89)

@@ -855,99 +855,13 @@ With the ExecuTorch custom operator APIs, custom operator and kernel authors can
 
 There are three steps to use custom kernels in ExecuTorch:
 
-1. Write the custom kernel using ExecuTorch types.
-2. Compile and link the custom kernel to both the AOT Python environment and the runtime binary.
-3. Source-to-source transformation to swap an operator with a custom op.
-
-### Writing a Custom Kernel
-
-Define your custom operator schema for both the functional variant (used in AOT compilation) and the out variant (used in the ExecuTorch runtime). The schema needs to follow the PyTorch ATen convention (see [native_functions.yaml](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml)).
-
-```
-custom_linear(Tensor weight, Tensor input, Tensor(?) bias) -> Tensor
-custom_linear.out(Tensor weight, Tensor input, Tensor(?) bias, *, Tensor(a!) out) -> Tensor(a!)
-```
-
-Write your custom kernel according to the schema defined above. Use the `EXECUTORCH_LIBRARY` macro to make the kernel available to the ExecuTorch runtime.
-
-```cpp
-// custom_linear.h / custom_linear.cpp
-#include <executorch/runtime/kernel/kernel_includes.h>
-
-Tensor& custom_linear_out(const Tensor& weight, const Tensor& input, optional<Tensor> bias, Tensor& out) {
-  // calculation
-  return out;
-}
-
-// Register as myop::custom_linear.out
-EXECUTORCH_LIBRARY(myop, "custom_linear.out", custom_linear_out);
-```
-
-To make this operator available in PyTorch, you can define a wrapper around the ExecuTorch custom kernel. Note that the ExecuTorch implementation uses ExecuTorch tensor types, while the PyTorch wrapper uses ATen tensors.
-
-```cpp
-// custom_linear_pytorch.cpp
-#include "custom_linear.h"
-#include <torch/library.h>
-
-at::Tensor custom_linear(const at::Tensor& weight, const at::Tensor& input, std::optional<at::Tensor> bias) {
-  // initialize out
-  at::Tensor out = at::empty({weight.size(1), input.size(1)});
-
-  // wrap the kernel in custom_linear.cpp into an ATen kernel
-  WRAP_TO_ATEN(custom_linear_out, 3)(weight, input, bias, out);
-
-  return out;
-}
-
-// Register the operator with PyTorch.
-TORCH_LIBRARY(myop, m) {
-  m.def("custom_linear(Tensor weight, Tensor input, Tensor(?) bias) -> Tensor", custom_linear);
-  m.def("custom_linear.out(Tensor weight, Tensor input, Tensor(?) bias, *, Tensor(a!) out) -> Tensor(a!)", WRAP_TO_ATEN(custom_linear_out, 3));
-}
-```
-
-### Compile and Link the Custom Kernel
-
-To make it available to the ExecuTorch runtime, compile custom_linear.h/cpp into the binary target. You can also build the kernel as a dynamically loaded library (.so or .dylib) and link it in.
-
-To make it available to PyTorch, package custom_linear.h, custom_linear.cpp and custom_linear_pytorch.cpp into a dynamically loaded library (.so or .dylib) and load it into the Python environment. This is needed to make PyTorch aware of the custom operator at the time of export.
-
-```python
-import torch
-torch.ops.load_library("libcustom_linear.so")
-```
-
-Once loaded, you can use the custom operator in PyTorch code.
+1. [Write the custom kernel](../kernel-library-custom-aten-kernel.md#c-api-for-custom-ops) using ExecuTorch types.
+2. [Compile and link the custom kernel](../kernel-library-custom-aten-kernel.md#compile-and-link-the-custom-kernel) to both the AOT Python environment and the runtime binary.
+3. [Source-to-source transformation](../kernel-library-custom-aten-kernel.md#using-a-custom-operator-in-a-model) to swap an operator with a custom op.
 
 For more information, see [PyTorch Custom Operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html) and [ExecuTorch Kernel Registration](../kernel-library-custom-aten-kernel.md).
 
-### Using a Custom Operator in a Model
-
-The custom operator can be used explicitly in the PyTorch model, or you can write a transformation to replace instances of a core operator with the custom variant. For this example, you could find all instances of `torch.nn.Linear` and replace them with `CustomLinear`.
-
-```python
-def replace_linear_with_custom_linear(module):
-    for name, child in module.named_children():
-        if isinstance(child, nn.Linear):
-            setattr(
-                module,
-                name,
-                CustomLinear(child.in_features, child.out_features, child.bias),
-            )
-        else:
-            replace_linear_with_custom_linear(child)
-```
-
-The remaining steps are the same as the normal flow. Now you can run this module in eager mode as well as export it to ExecuTorch.
-
 ## How to Build Mobile Apps
 See the instructions for building and running LLMs using ExecuTorch on iOS and Android.