[ET-VK][ez] Make squeeze insertion requirements more strict #9950

SS-JIA · 2025-04-07T22:51:38Z

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #9917 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/207/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/207/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/206/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/207/orig
@diff-train-skip-merge

pytorch-bot · 2025-04-07T22:51:42Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9950

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Pull Request resolved: #9917 ## Context Refactor the `SqueezeUnsqueezeInputs` pass to be more clear about its intention. For Llama models, input shapes to 4 bit linear will oftentimes have the shape `[1, seq_len, dim]`; under the current implementation of the pass, the input would be squeezed to `[seq_len, dim]` even though the squeeze is not necessary. The original intention of thispass was to squeeze inputs with shape `[batch_size, 1, dim]` to `[batch_size, dim]` before calling the 4-bit linear operator. ## Changes To avoid inserting unnecessary squeeze/unsqueezes, be more specific about when squeeze/unsqueeze should be added. I would like to consider refactoring this pass in the future, since the logic is currently a bit uninttuitive. Squeeze/unsqueeze is also inserted for gelu and relu, but this is to create a chain of unsqueeze/squeeze that will be eliminated by a later pass (see #8601 / D69673068). I think eventually it will be good to rewrite the pass to make shape management more explicit and self contained within the pass rather than inserting ops which are expected to be removed later on. ghstack-source-id: 276566115 @exported-using-ghexport Differential Revision: [D72480178](https://our.internmc.facebook.com/intern/diff/D72480178/)

## Context Refactor the `SqueezeUnsqueezeInputs` pass to be more clear about its intention. For Llama models, input shapes to 4 bit linear will oftentimes have the shape `[1, seq_len, dim]`; under the current implementation of the pass, the input would be squeezed to `[seq_len, dim]` even though the squeeze is not necessary. The original intention of thispass was to squeeze inputs with shape `[batch_size, 1, dim]` to `[batch_size, dim]` before calling the 4-bit linear operator. ## Changes To avoid inserting unnecessary squeeze/unsqueezes, be more specific about when squeeze/unsqueeze should be added. I would like to consider refactoring this pass in the future, since the logic is currently a bit uninttuitive. Squeeze/unsqueeze is also inserted for gelu and relu, but this is to create a chain of unsqueeze/squeeze that will be eliminated by a later pass (see #8601 / D69673068). I think eventually it will be good to rewrite the pass to make shape management more explicit and self contained within the pass rather than inserting ops which are expected to be removed later on. Differential Revision: [D72480178](https://our.internmc.facebook.com/intern/diff/D72480178/)

…9950) ## Context Refactor the `SqueezeUnsqueezeInputs` pass to be more clear about its intention. For Llama models, input shapes to 4 bit linear will oftentimes have the shape `[1, seq_len, dim]`; under the current implementation of the pass, the input would be squeezed to `[seq_len, dim]` even though the squeeze is not necessary. The original intention of thispass was to squeeze inputs with shape `[batch_size, 1, dim]` to `[batch_size, dim]` before calling the 4-bit linear operator. ## Changes To avoid inserting unnecessary squeeze/unsqueezes, be more specific about when squeeze/unsqueeze should be added. I would like to consider refactoring this pass in the future, since the logic is currently a bit uninttuitive. Squeeze/unsqueeze is also inserted for gelu and relu, but this is to create a chain of unsqueeze/squeeze that will be eliminated by a later pass (see pytorch#8601 / D69673068). I think eventually it will be good to rewrite the pass to make shape management more explicit and self contained within the pass rather than inserting ops which are expected to be removed later on. Differential Revision: [D72480178](https://our.internmc.facebook.com/intern/diff/D72480178/)

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 7, 2025

kirklandsign approved these changes Apr 7, 2025

View reviewed changes

Base automatically changed from gh/SS-JIA/206/orig to main April 7, 2025 22:54

kirklandsign force-pushed the gh/SS-JIA/207/orig branch from 75f61e4 to e6e6880 Compare April 7, 2025 22:54

kirklandsign added the release notes: vulkan Changes to the Vulkan backend delegate label Apr 7, 2025

kirklandsign merged commit 1bef520 into main Apr 7, 2025
3 checks passed

kirklandsign deleted the gh/SS-JIA/207/orig branch April 7, 2025 22:54

This was referenced Apr 14, 2025

Weekly pr metrics report - 2025-04-01..2025-04-07 wdvr/pytorch#28

Open

Weekly pr metrics report - 2025-04-01..2025-04-07 wdvr/pytorch#30

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ET-VK][ez] Make squeeze insertion requirements more strict #9950

[ET-VK][ez] Make squeeze insertion requirements more strict #9950

SS-JIA commented Apr 7, 2025

pytorch-bot bot commented Apr 7, 2025

[ET-VK][ez] Make squeeze insertion requirements more strict #9950

[ET-VK][ez] Make squeeze insertion requirements more strict #9950

Conversation

SS-JIA commented Apr 7, 2025

pytorch-bot bot commented Apr 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9950