Move config out of experimental #1954
Conversation
🔗 Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1954.
Note: links to docs will display an error until the doc builds have completed.
❌ 3 new failures as of commit c8d7871 with merge base f3ff2e5.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Maybe move https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#int8_dynamic_activation_intx_weight-quantization to the supported section as well.
This PR doesn't move the config/quant API yet; it only moves the layouts. The config is still in torchao/experimental and will be moved in a follow-up PR.
Hello @metascroy, where is the documentation for using the new API?
That API is not intended for use by most people. The ATen KleidiAI kernels can be used from the quantize_ config as they could before (https://github.com/pytorch/ao/blob/main/torchao/experimental/tests/test_int8_dynamic_activation_intx_weight.py#L177-L186):
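The code block that followed did not survive the page export; below is a minimal sketch of the call the linked test exercises, assuming the experimental API of that era (the parameter names `weight_dtype`, `granularity`, `has_weight_zeros`, and the import paths are my reconstruction, not verbatim from the test):

```python
import torch
from torchao.quantization.granularity import PerGroup
from torchao.quantization.quant_api import quantize_
# Assumed post-move layout location; before this PR it lived under torchao/experimental.
from torchao.dtypes import PackedLinearInt8DynamicActivationIntxWeightLayout
from torchao.experimental.quant_api import int8_dynamic_activation_intx_weight

model = torch.nn.Sequential(torch.nn.Linear(256, 256))

# Select the ATen KleidiAI path via the "aten" target on the layout.
quantize_(
    model,
    int8_dynamic_activation_intx_weight(
        weight_dtype=torch.int4,
        granularity=PerGroup(32),
        has_weight_zeros=False,  # KleidiAI requires ZeroPointDomain.NONE
        layout=PackedLinearInt8DynamicActivationIntxWeightLayout(target="aten"),
    ),
)
```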
But note that we're also moving int8_dynamic_activation_intx_weight out of experimental and changing its API slightly to align with torchao's QAT routines (#1968). After that, the call site will be:
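The future call site was also dropped from the export; here is a hedged sketch of the config-style API introduced in #1968 (the config name `Int8DynamicActivationIntxWeightConfig` and its parameter names are assumptions based on where that PR was headed, not quoted from it):

```python
import torch
from torchao.quantization import quantize_
from torchao.quantization.granularity import PerGroup
from torchao.quantization.quant_api import Int8DynamicActivationIntxWeightConfig

model = torch.nn.Sequential(torch.nn.Linear(256, 256))

# Config-object call site replacing the old factory-function style.
quantize_(
    model,
    Int8DynamicActivationIntxWeightConfig(
        weight_dtype=torch.int4,
        weight_granularity=PerGroup(32),
    ),
)
```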
Also note that users can specify target as "auto" (the default), "kleidiai", or "universal". When "kleidiai" is chosen, it dispatches to the KleidiAI kernels in torchao (https://github.com/pytorch/ao/blob/main/torchao/experimental/ops/linear_8bit_act_xbit_weight/kernel_selector.h#L187). The nice thing about "auto" is that it chooses KleidiAI when supported (torch.int4, ZeroPointDomain.NONE), but falls back to neondot GEMV kernels when a quantization option that KleidiAI does not support is chosen.
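Concretely, the choice is just the layout's `target` field; a small sketch (import path assumed as above):

```python
from torchao.dtypes import PackedLinearInt8DynamicActivationIntxWeightLayout

# "auto" (default): use KleidiAI when the scheme supports it (torch.int4,
# ZeroPointDomain.NONE), otherwise fall back to universal neondot GEMV kernels.
# "kleidiai" / "universal": force one dispatch path explicitly.
layout = PackedLinearInt8DynamicActivationIntxWeightLayout(target="auto")
```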
Moves q_dq_layout.py and packed_linear_int8_dynamic_activation_intx_weight_layout.py out of experimental.
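A sketch of the post-move imports for the two layouts; the destination module is my assumption, since the PR description only states that they leave torchao/experimental:

```python
# Assumed new home for the relocated layouts.
from torchao.dtypes import (
    PackedLinearInt8DynamicActivationIntxWeightLayout,
    QDQLayout,
)
```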