[Quant] Add quantization 2.0 document #2354
Conversation
@jgong5 @mingfeima @Xia-Weiwen Here is the draft of the document for Quantization 2.0.
Please make sure to add a
@svekars Thanks for the suggestion.
We fixed the issue and I re-kicked the build.
A few editorial suggestions.
Hi @jerryzh168, the comments have been addressed. Please take another look. Thanks.
prototype_source/quantization_in_pytorch_2_0_export_tutorial.rst
LGTM, thanks!
@kimishpatel please take a look again as well
I left several comments. I feel overall this is not painting the picture of why we are doing the new API, and thus the motivation. However, since I am reviewing this too late, I don't want to block this on my behalf.
Hi @jerryzh168 @kimishpatel, thanks for the suggestions. The comments have been addressed. Please take another look.
Force-pushed from f9ee24b to 5a43584
- Currently, in ``QConfig`` we are exposing observer/fake_quant classes as an object for the user to configure quantization.
  This increases what the user needs to care about: not only the ``dtype`` but also how the observation should
  happen. These could potentially be hidden from the user to make the user interface simpler.
A bit confused by this. I think the new API also has the observer setting in the `QuantizationSpec`, so it is also part of the new API, right?
Thanks for the comment. In the creation of a `QuantizationSpec`, we do still need to specify the class of observer, but I think it's simpler for users compared with FX Quantization Mode.

- In FX Quantization Mode, `QConfig` is created with an observer. The user needs to learn the constructor and specify the quantization parameters for different observers in order to create a `QConfig` for their use case.
- Now the general quantization parameters such as `dtype` and `qscheme` for different observers are decoupled from the observer into the `QuantizationSpec` at the user interface. Although the user still needs to specify the observer type, it decreases what the user needs to know about different observers.
@jerryzh168 I think you may help to comment more about this bullet.
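The decoupling described in the bullets above can be sketched in plain Python. This is a hypothetical, simplified model: `MinMaxObserver` and the `QuantizationSpec` fields here are illustrative stand-ins, not the actual PyTorch classes.

```python
# Hypothetical, simplified sketch of the decoupling described above --
# NOT the real torch.ao.quantization API, just an illustration.
from dataclasses import dataclass


class MinMaxObserver:
    """Toy stand-in for an observer class: tracks the min/max of values seen."""

    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, x):
        self.min_val = min(self.min_val, x)
        self.max_val = max(self.max_val, x)


@dataclass
class QuantizationSpec:
    # General quantization parameters live on the spec, not on the
    # observer constructor, so the user no longer has to learn each
    # observer's constructor signature.
    dtype: str                            # e.g. "int8"
    qscheme: str                          # e.g. "per_tensor_affine"
    observer_cls: type = MinMaxObserver   # user names a class, nothing more


spec = QuantizationSpec(dtype="int8", qscheme="per_tensor_affine")
obs = spec.observer_cls()
for v in [-1.5, 0.0, 3.2]:
    obs.observe(v)
```

The point of the sketch is only where the knobs live: `dtype` and `qscheme` sit on the spec, while the observer class is named but not configured by the user.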
So in the new quantizer API, the idea is that configurability is controlled by each backend. The quantizer/annotation API does not face common modeling users directly; it is an API for backend developers or advanced modeling users only. Common users will interact with each backend-specific quantizer only. For example, I can have a quantizer that only exposes a "quantize / don't quantize" option to users:
backend_quantizer = BackendQuantizer()
# BackendQuantizer is interacting with QuantizationSpec, not modeling users;
# modeling users interact only with configurations exposed by the BackendQuantizer
backend_quantizer.enable_quantization()
model = prepare_pt2e(model, backend_quantizer)
...
see the graph in the end of motivation section: https://docs.google.com/document/d/1_jjXrdaPbkmy7Fzmo35-r1GnNKL7anYoAnqozjyY-XI/edit#heading=h.jtqauapwj95c for details
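The facade pattern described in the comment above can be sketched as follows. All names here (`BackendQuantizer`, `enable_quantization`, the toy `prepare_pt2e`, and the dict-based model) are illustrative stand-ins for the idea, not the real PyTorch 2.0 export APIs.

```python
# Hypothetical sketch of the "backend quantizer as a facade" idea:
# the backend decides what is configurable; modeling users only see
# the simple knobs the backend chooses to expose.


class BackendQuantizer:
    """Toy backend-specific quantizer exposing a single on/off knob."""

    def __init__(self):
        self._enabled = False

    def enable_quantization(self):
        # The only option this particular backend exposes to modeling users.
        self._enabled = True

    def annotate(self, model):
        # Backend-internal: attach quantization annotations to the model.
        # A real backend would build QuantizationSpec objects here.
        if self._enabled:
            model["annotations"] = {"linear": "int8"}
        return model


def prepare_pt2e(model, quantizer):
    # Toy stand-in for the prepare step: delegate annotation to the backend.
    return quantizer.annotate(model)


model = {"ops": ["linear"]}
backend_quantizer = BackendQuantizer()
backend_quantizer.enable_quantization()
model = prepare_pt2e(model, backend_quantizer)
```

The design choice being illustrated: `QuantizationSpec` stays behind the backend boundary, so different backends can expose very different (and much smaller) configuration surfaces to modeling users.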
LGTM.
way by annotating the appropriate nodes. A backend-specific quantizer inherits from the base quantizer,
with some methods that need to be implemented:

- `annotate method <https://github.com/pytorch/pytorch/blob/3e988316b5976df560c51c998303f56a234a6a1f/torch/ao/quantization/_pt2e/quantizer/qnnpack_quantizer.py#L269>`__
is this the only method that needs to be implemented (above description says some methods)? Why create a separate bullet point for it in that case?
Actually, there are some other methods that need to be implemented. However, the detailed design for those methods is not fully settled per the discussion with @jerryzh168, so we only mention the `annotate` method here, which is also the most important one. Sure, I will merge this bullet into the previous paragraph for now.
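The relationship between the base quantizer and a backend's `annotate` implementation discussed above might be sketched like this. The class names and the dict-based model are hypothetical, for illustration only; the real base class and node representation live in PyTorch's prototype export code.

```python
# Hedged sketch of "backend quantizer inherits from base quantizer and
# implements annotate" -- illustrative names, not the real PyTorch classes.
from abc import ABC, abstractmethod


class Quantizer(ABC):
    """Toy base quantizer: backends must say how nodes get annotated."""

    @abstractmethod
    def annotate(self, model):
        """Mark which nodes in the model should be quantized."""


class ConvOnlyQuantizer(Quantizer):
    # Hypothetical backend: annotates every node whose op is "conv2d".
    def annotate(self, model):
        for node in model["nodes"]:
            if node["op"] == "conv2d":
                node["quantize"] = True
        return model


model = {"nodes": [{"op": "conv2d"}, {"op": "relu"}]}
annotated = ConvOnlyQuantizer().annotate(model)
```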
Fixes #issue_2336
Description
Add the new document for Quantization 2.0 flow.
Checklist
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @ZailiWang @ZhaoqiongZ @Xia-Weiwen @sekahler2 @CaoE @zhuhaozhe @Valentine233