Add tutorial for user defined triton kernels #2783
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2783
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 8ee52f7 with merge base 5fbef68.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 3dae479 to e4ff64e
if not has_triton:
    print("Skipping because triton is not supported on this device.")
else:
    import triton
    from triton import language as tl
This is not very readable. Why not

if not has_triton:
    print("Skipping because triton is not supported on this device.")
    sys.exit(1)
Other tutorials are also doing this: https://pytorch.org/tutorials/recipes/torch_logs.html
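For illustration, here is a minimal, self-contained sketch of the early-exit pattern being discussed. The try/except probe that defines ``has_triton`` is my own stand-in for however the tutorial actually computes that flag, not the PR's code:

import sys
import torch

# Stand-in for the tutorial's ``has_triton`` flag (an assumption on my part):
# treat Triton as usable only if it imports and a CUDA device is present.
try:
    import triton  # noqa: F401
    has_triton = torch.cuda.is_available()
except ImportError:
    has_triton = False

if not has_triton:
    print("Skipping because triton is not supported on this device.")
    sys.exit(1)  # exit early, as the review suggestion proposes, instead of nesting everything under ``else``

import triton
from triton import language as tl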
This makes me sad, let me try to propose a bit more elegant solution
It would be nice to add a larger preamble that references back to the documentation, explains what Triton is, notes that it only works with GPUs, explains what auto-tuning is for, and, again, links back to some document about dynamic shapes, etc.
…utorial.py Merge a small fix to kick off the build
Some editorial suggestions. Let me know if you have questions! Please add a card and an entry to recipes_index.html
# -*- coding: utf-8 -*-

"""
Using User Defined Triton Kernels with ``torch.compile``
- Using User Defined Triton Kernels with ``torch.compile``
+ Using User-Defined Triton Kernels with ``torch.compile``
""" | ||
|
||
###################################################################### | ||
# This tutorial explains how to use user defined Triton kernels with ``torch.compile``. |
Need to add a better introduction here. Maybe something like this:
- # This tutorial explains how to use user defined Triton kernels with ``torch.compile``.
+ # User-defined Triton kernels can be used to optimize specific parts of your
+ # model's computation. These kernels are written in Triton's language, which is designed
+ # to make it easier to achieve peak hardware performance. By using user-defined Triton
+ # kernels with ``torch.compile``, you can integrate these optimized computations into
+ # your PyTorch model, potentially achieving significant performance improvements.
+ #
+ # This recipe demonstrates how you can use user-defined Triton kernels with ``torch.compile``.
#
# In this example, we will use a simple vector addition kernel from the Triton documentation
# with ``torch.compile``.
# Reference: https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html
- # Reference: https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html
+ # For reference, see `Triton documentation <https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html>`__.
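To make the basic example concrete, here is a rough sketch of a vector-add kernel wrapped in ``torch.compile``, following the Triton vector-add tutorial linked above. The function names and the ``BLOCK_SIZE`` value are illustrative choices, not necessarily what the final tutorial uses:

import torch
import triton
from triton import language as tl

@triton.jit
def add_kernel(in_ptr0, in_ptr1, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the inputs.
    pid = tl.program_id(axis=0)
    block_start = pid * BLOCK_SIZE
    offsets = block_start + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(in_ptr0 + offsets, mask=mask)
    y = tl.load(in_ptr1 + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

@torch.compile(fullgraph=True)
def add_fn(x, y):
    output = torch.zeros_like(x)
    n_elements = output.numel()
    # The launch grid is computed from the meta-parameters of the kernel launch.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, output, n_elements, BLOCK_SIZE=4)
    return output

x = torch.randn(4, device="cuda")
y = torch.randn(4, device="cuda")
print(add_fn(x, y))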

######################################################################
# Advanced Usage
# ------------
The heading underline needs to be as long as or longer than the title.
- # ------------
+ # ----------------------
# Advanced Usage
# ------------
#
# It is also possible to triton.autotune with ``torch.compile``.
Need to expand the intro a bit. Maybe something like this:
- # It is also possible to triton.autotune with ``torch.compile``.
+ # Triton's autotune feature is a powerful tool that automatically optimizes the configuration
+ # parameters of your Triton kernels. It explores a range of possible configurations and
+ # selects the one that delivers the best performance for your specific use case.
+ #
+ # When used with ``torch.compile``, ``triton.autotune`` can help ensure that your PyTorch
+ # model is running as efficiently as possible. Here is an example of using ``torch.compile``
+ # and ``triton.autotune``.
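As a hedged sketch of what the advanced example could look like (the specific configs and key below are placeholders I chose, not necessarily the ones in the tutorial), ``triton.autotune`` is stacked on top of ``@triton.jit`` and the block size is no longer passed at the call site:

import torch
import triton
from triton import language as tl

@triton.autotune(
    configs=[
        triton.Config({"BLOCK_SIZE": 4}, num_stages=3, num_warps=8),
        triton.Config({"BLOCK_SIZE": 128}, num_stages=4, num_warps=4),
    ],
    key=["n_elements"],  # re-tune whenever the input length changes
)
@triton.jit
def add_kernel_autotuned(in_ptr0, in_ptr1, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(in_ptr0 + offsets, mask=mask)
    y = tl.load(in_ptr1 + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

@torch.compile(fullgraph=True)
def add_fn(x, y):
    output = torch.zeros_like(x)
    n_elements = output.numel()
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    # BLOCK_SIZE is omitted here: the autotuner picks it from the configs above.
    add_kernel_autotuned[grid](x, y, output, n_elements)
    return output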
# As for PyTorch 2.3, the user defined triton kernel support in ``torch.compile``
# composes with dynamic shapes, ``torch.autograd.Function``, JIT inductor and
# AOT inductor.
#
# The support for tensor subclasses and other advanced features currently do
# not exist.
# Support for ``triton.heuristics`` exists when it is used by itself or before
# ``triton.autotune``; however, support for using ``triton.heuristic`` after
# ``triton.autotune`` is not yet supported.
- # As for PyTorch 2.3, the user defined triton kernel support in ``torch.compile``
- # composes with dynamic shapes, ``torch.autograd.Function``, JIT inductor and
- # AOT inductor.
- #
- # The support for tensor subclasses and other advanced features currently do
- # not exist.
- # Support for ``triton.heuristics`` exists when it is used by itself or before
- # ``triton.autotune``; however, support for using ``triton.heuristic`` after
- # ``triton.autotune`` is not yet supported.
+ # As of PyTorch 2.3, the support for user-defined Triton kernels in ``torch.compile``
+ # includes dynamic shapes, ``torch.autograd.Function``, JIT inductor, and AOT inductor.
+ # You can use these features together to build complex, high-performance models.
+ #
+ # However, there are certain limitations to be aware of:
+ #
+ # * **Tensor Subclasses:** Currently, there is no support for
+ #   tensor subclasses and other advanced features.
+ # * **Triton Features:** While ``triton.heuristics`` can be used either standalone or
+ #   before ``triton.autotune``, it cannot be used after ``triton.autotune``. This
+ #   implies that if ``triton.heuristics`` and ``triton.autotune`` are to be used
+ #   together, ``triton.heuristics`` must be used first.
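For the supported standalone case mentioned in the suggestion above, a minimal sketch of ``triton.heuristics`` could look like the following; the ``EVEN_SIZE`` meta-parameter and its predicate are hypothetical choices of mine, not from the tutorial:

import triton
from triton import language as tl

@triton.heuristics(
    # Derive a compile-time flag from the runtime arguments: true when the
    # tensor length divides evenly into blocks, so masking can be skipped.
    {"EVEN_SIZE": lambda args: args["n_elements"] % args["BLOCK_SIZE"] == 0}
)
@triton.jit
def add_kernel_heuristics(in_ptr0, in_ptr1, out_ptr, n_elements,
                          BLOCK_SIZE: tl.constexpr, EVEN_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    if EVEN_SIZE:
        x = tl.load(in_ptr0 + offsets)
        y = tl.load(in_ptr1 + offsets)
        tl.store(out_ptr + offsets, x + y)
    else:
        mask = offsets < n_elements
        x = tl.load(in_ptr0 + offsets, mask=mask)
        y = tl.load(in_ptr1 + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)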
# not exist.
# Support for ``triton.heuristics`` exists when it is used by itself or before
# ``triton.autotune``; however, support for using ``triton.heuristic`` after
# ``triton.autotune`` is not yet supported.
Also, need to add a Conclusion:
- # ``triton.autotune`` is not yet supported.
+ # ``triton.autotune`` is not yet supported.
+ #
+ # Conclusion
+ # -----------
+ # In this recipe, we explored how to utilize user-defined Triton kernels
+ # with ``torch.compile``. We delved into the basic usage of a simple
+ # vector addition kernel and advanced usage involving Triton's autotune
+ # feature. We also discussed the composability of user-defined Triton
+ # kernels with other PyTorch features and highlighted some current limitations.
Can you also add what else the user should read?
# .. note::
#    This tutorial requires PyTorch 2.3 or later and a GPU that supports Triton.
We have this in the Prerequisites section
- # .. note::
- #    This tutorial requires PyTorch 2.3 or later and a GPU that supports Triton.
Force-pushed from ecca06a to 5ca9fbf
LGTM from the publishing perspective; please get a technical reviewer to approve. We should not merge until 2.3 binaries are available on the main branch.
Minor editorial fixes.
Minor formatting fix
looks great!
Minor editorial fixes