Add tutorial for user defined triton kernels #2783

Merged

svekars merged 17 commits into pytorch:main from user_defined_kernel on Apr 20, 2024

Conversation

Contributor

@oulgen oulgen commented Mar 1, 2024

No description provided.


pytorch-bot bot commented Mar 1, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2783

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8ee52f7 with merge base 5fbef68 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@oulgen oulgen force-pushed the user_defined_kernel branch from 3dae479 to e4ff64e Compare March 1, 2024 20:54
Comment on lines 74 to 78
if not has_triton:
print("Skipping because triton is not supported on this device.")
else:
import triton
from triton import language as tl
Contributor

@malfet malfet Mar 1, 2024

This is not very readable. Why not

if not has_triton:
    print("Skipping because triton is not supported on this device.")
    sys.exit(1)

Contributor Author

Other tutorials are also doing this: https://pytorch.org/tutorials/recipes/torch_logs.html

Contributor

This makes me sad, let me try to propose a bit more elegant solution

Contributor

@malfet malfet left a comment

It would be nice to add a larger preamble that references back to the documentation, explains what Triton is and that it only works with GPUs, explains what auto-tuning is for, and, again, links back to some document about dynamic shapes, etc.


@svekars svekars added the 2.3 label Mar 11, 2024
@svekars svekars changed the base branch from main to 2.3-RC-TEST March 19, 2024 22:51
…utorial.py


Merge a small fix to kick off the build
@oulgen oulgen marked this pull request as ready for review March 22, 2024 00:58
Contributor

@svekars svekars left a comment

Some editorial suggestions. Let me know if you have questions! Please add a card and an entry to recipes_index.html

# -*- coding: utf-8 -*-

"""
Using User Defined Triton Kernels with ``torch.compile``
Contributor

Suggested change
Using User Defined Triton Kernels with ``torch.compile``
Using User-Defined Triton Kernels with ``torch.compile``

"""

######################################################################
# This tutorial explains how to use user defined Triton kernels with ``torch.compile``.
Contributor

Need to add a better introduction here. Maybe something like this:

Suggested change
# This tutorial explains how to use user defined Triton kernels with ``torch.compile``.
# User-defined Triton kernels can be used to optimize specific parts of your
# model's computation. These kernels are written in Triton's language, which is designed
# to make it easier to achieve peak hardware performance. By using user-defined Triton
# kernels with ``torch.compile``, you can integrate these optimized computations into
# your PyTorch model, potentially achieving significant performance improvements.
#
# This recipe demonstrates how you can use user-defined Triton kernels with ``torch.compile``.

#
# In this example, we will use a simple vector addition kernel from the Triton documentation
# with ``torch.compile``.
# Reference: https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html
Contributor

Suggested change
# Reference: https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html
# For reference, see `Triton documentation <https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html>`__.
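
For thread context, here is a minimal sketch of the kind of basic example being discussed: the vector-add kernel from the Triton tutorial launched from inside a ``torch.compile``-d function. The function names and the ``BLOCK_SIZE=4`` choice are illustrative, not necessarily the PR's exact code.

import torch
import triton
from triton import language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, output_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(output_ptr + offsets, x + y, mask=mask)

@torch.compile(fullgraph=True)
def add_fn(x, y):
    output = torch.zeros_like(x)
    n_elements = output.numel()
    # The grid is computed from the meta-parameters of the kernel launch.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, output, n_elements, BLOCK_SIZE=4)
    return output

x = torch.randn(4, device="cuda")
y = torch.randn(4, device="cuda")
print(add_fn(x, y))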


######################################################################
# Advanced Usage
# ------------
Contributor

The underline needs to be as long as or longer than the title

Suggested change
# ------------
# ----------------------

# Advanced Usage
# ------------
#
# It is also possible to triton.autotune with ``torch.compile``.
Contributor

Need to expand the intro a bit. Maybe something like this:

Suggested change
# It is also possible to triton.autotune with ``torch.compile``.
# Triton's autotune feature is a powerful tool that automatically optimizes the configuration
# parameters of your Triton kernels. It explores a range of possible configurations and
# selects the one that delivers the best performance for your specific use case.
#
# When used with ``torch.compile``, ``triton.autotune`` can help ensure that your PyTorch
# model is running as efficiently as possible. Here is an example of using ``torch.compile``
# and ``triton.autotune``.
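
As thread context, a hedged sketch of what that ``triton.autotune`` + ``torch.compile`` combination can look like, reusing the vector-add kernel body and imports from the sketch above; the particular configs and the empty ``key`` are illustrative only.

@triton.autotune(
    configs=[
        triton.Config({"BLOCK_SIZE": 4}, num_stages=3, num_warps=8),
        triton.Config({"BLOCK_SIZE": 4}, num_stages=4, num_warps=4),
        triton.Config({"BLOCK_SIZE": 2}, num_stages=3, num_warps=8),
    ],
    key=[],
)
@triton.jit
def add_kernel_autotuned(x_ptr, y_ptr, output_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(output_ptr + offsets, x + y, mask=mask)

@torch.compile(fullgraph=True)
def add_fn_autotuned(x, y):
    output = torch.zeros_like(x)
    n_elements = output.numel()
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    # BLOCK_SIZE is chosen by the autotuner, so it is not passed at the call site.
    add_kernel_autotuned[grid](x, y, output, n_elements)
    return output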

Comment on lines 123 to 131
# As for PyTorch 2.3, the user defined triton kernel support in ``torch.compile``
# composes with dynamic shapes, ``torch.autograd.Function``, JIT inductor and
# AOT inductor.
#
# The support for tensor subclasses and other advanced features currently do
# not exist.
# Support for ``triton.heuristics`` exists when it is used by itself or before
# ``triton.autotune``; however, support for using ``triton.heuristic`` after
# ``triton.autotune`` is not yet supported.
Contributor

Suggested change
# As for PyTorch 2.3, the user defined triton kernel support in ``torch.compile``
# composes with dynamic shapes, ``torch.autograd.Function``, JIT inductor and
# AOT inductor.
#
# The support for tensor subclasses and other advanced features currently do
# not exist.
# Support for ``triton.heuristics`` exists when it is used by itself or before
# ``triton.autotune``; however, support for using ``triton.heuristic`` after
# ``triton.autotune`` is not yet supported.
# As of PyTorch 2.3, the support for user-defined Triton kernels in ``torch.compile``
# includes dynamic shapes, ``torch.autograd.Function``, JIT inductor, and AOT inductor.
# You can use these features together to build complex, high-performance models.
#
# However, there are certain limitations to be aware of:
#
# * **Tensor Subclasses:** Currently, there is no support for
# tensor subclasses and other advanced features.
# * **Triton Features:** While ``triton.heuristics`` can be used either standalone or
# before ``triton.autotune``, it cannot be used after ``triton.autotune``. This
# implies that if ``triton.heuristics`` and ``triton.autotune`` are to be used
# together, ``triton.heuristics`` must be used first.
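
Since composability with ``torch.autograd.Function`` comes up here, a hypothetical sketch of what that composition can look like, assuming the ``add_kernel`` defined in the earlier sketch; this is illustrative, not the tutorial's exact code.

class Add(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, y):
        output = torch.empty_like(x)
        n_elements = output.numel()
        grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, output, n_elements, BLOCK_SIZE=4)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        # d(x + y)/dx = d(x + y)/dy = 1, so the gradient passes through to both inputs.
        return grad_output, grad_output

@torch.compile
def compiled_add(x, y):
    return Add.apply(x, y)

x = torch.randn(4, device="cuda", requires_grad=True)
y = torch.randn(4, device="cuda", requires_grad=True)
compiled_add(x, y).sum().backward()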

# not exist.
# Support for ``triton.heuristics`` exists when it is used by itself or before
# ``triton.autotune``; however, support for using ``triton.heuristic`` after
# ``triton.autotune`` is not yet supported.
Contributor

Also, need to add a Conclusion:

Suggested change
# ``triton.autotune`` is not yet supported.
# ``triton.autotune`` is not yet supported.
#
# Conclusion
# -----------
# In this recipe, we explored how to utilize user-defined Triton kernels
# with ``torch.compile``. We delved into the basic usage of a simple
# vector addition kernel and advanced usage involving Triton's autotune
# feature. We also discussed the composability of user-defined Triton
# kernels with other PyTorch features and highlighted some current limitations.

Can you also add what else the user should read?

Comment on lines 12 to 13
# .. note::
# This tutorial requires PyTorch 2.3 or later and a GPU that supports Triton.
Contributor

We have this in the Prerequisites section

Suggested change
# .. note::
# This tutorial requires PyTorch 2.3 or later and a GPU that supports Triton.

@oulgen oulgen requested a review from zou3519 March 22, 2024 17:41
@oulgen oulgen force-pushed the user_defined_kernel branch from ecca06a to 5ca9fbf Compare March 22, 2024 17:46
Contributor

svekars commented Mar 22, 2024

LGTM from the publishing perspective, please get a technical reviewer to approve. We should not merge until 2.3 binaries are available on the main branch.

Contributor

@zou3519 zou3519 left a comment

looks great!

@svekars svekars deleted the branch pytorch:main April 19, 2024 15:59
@svekars svekars closed this Apr 19, 2024
@svekars svekars reopened this Apr 19, 2024
@svekars svekars changed the base branch from 2.3-RC-TEST to main April 19, 2024 16:29
Contributor

@svekars svekars left a comment

Minor editorial fixes

@svekars svekars merged commit 6771cf5 into pytorch:main Apr 20, 2024