Make partitioning utils QDQ aware so it does not break up QDQ node units #19723

Merged (16 commits) on Mar 12, 2024

skottmckay
Contributor

Description

If the EP handles QDQ node units, we need to make sure we do not split those into different partitions.

Update the partitioning utils to be QDQ aware. If there are node units, we process the logical nodes they represent instead of individual nodes. This ensures we process all nodes in a QDQ node unit at the same time so that they are always in the same partition.
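
As a rough illustration of the idea (not the actual onnxruntime implementation), the sketch below partitions by node unit rather than by individual node. `NodeUnitSketch` and `PartitionByUnit` are hypothetical names, nodes are plain integer indices, and the grouping is deliberately simplified to consecutive supported units in topological order; the real utilities also take graph connectivity into account.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a QDQ node unit: the DQ nodes, the target node and
// the Q nodes that together represent one logical quantized operation.
struct NodeUnitSketch {
  std::vector<int> dq_nodes;
  int target_node;
  std::vector<int> q_nodes;
};

// Groups consecutive supported units (given in topological order) into partitions.
// Each partition is a flat list of node indices. Whole units are appended at once,
// so a unit can never be split across two partitions.
std::vector<std::vector<int>> PartitionByUnit(
    const std::vector<NodeUnitSketch>& units,
    const std::function<bool(const NodeUnitSketch&)>& is_supported) {
  std::vector<std::vector<int>> partitions;
  bool in_partition = false;
  for (const auto& unit : units) {
    if (!is_supported(unit)) {
      in_partition = false;  // an unsupported unit ends the current partition
      continue;
    }
    if (!in_partition) {
      partitions.emplace_back();
      in_partition = true;
    }
    // Add the unit's nodes in DQ -> target -> Q order.
    auto& current = partitions.back();
    current.insert(current.end(), unit.dq_nodes.begin(), unit.dq_nodes.end());
    current.push_back(unit.target_node);
    current.insert(current.end(), unit.q_nodes.begin(), unit.q_nodes.end());
  }
  return partitions;
}
```

Because the support check is made once per unit, the DQ and Q nodes always follow the decision made for their target node.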

Motivation and Context

Fix one of the issues in #19590

@skottmckay skottmckay requested a review from edgchen1 February 29, 2024 12:36
@jywu-msft jywu-msft requested a review from HectorSVC March 1, 2024 04:33
Fix order of nodes in partition. DQ -> target -> Q must be added sequentially.
Fix x86 build error.
Fix Android issue with gtest flags not being processed (couldn't easily debug the failing test when --gtest_filter didn't work)
@HectorSVC
Contributor

CreateSupportedPartitions(const GraphViewer& graph_viewer,

The QNN EP uses this one.


Refers to: onnxruntime/core/providers/partitioning_utils.cc:409 in 956412b.

@HectorSVC
Contributor

  if (!Contains(node_outputs, input)) {

Is it guaranteed that the nodes are topologically sorted? If not, I think it would be safer to build up node_outputs first, before processing any inputs.


Refers to: onnxruntime/core/providers/partitioning_utils.cc:328 in 956412b.

@skottmckay
Contributor Author

  if (!Contains(node_outputs, input)) {

It is currently, as we process nodes in topological order when doing partitioning.

We could process all outputs first before inputs. Is there a benefit apart from not requiring things to be topologically sorted if we do that?
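
For reference, a minimal sketch of the "outputs first" approach being discussed, under simplifying assumptions: `NodeSketch` and `ExternalInputs` are hypothetical names, and values are identified by plain strings rather than the real graph types. It is not the code in partitioning_utils.cc.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified node: just the names of the values it consumes and produces.
struct NodeSketch {
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

// Returns the values a group of nodes consumes but does not produce itself.
// All outputs are collected before any input is checked, so the set of values
// returned does not depend on the nodes being topologically sorted.
std::vector<std::string> ExternalInputs(const std::vector<NodeSketch>& nodes) {
  std::unordered_set<std::string> node_outputs;
  for (const auto& node : nodes) {
    node_outputs.insert(node.outputs.begin(), node.outputs.end());
  }

  std::vector<std::string> external;
  std::unordered_set<std::string> seen;
  for (const auto& node : nodes) {
    for (const auto& input : node.inputs) {
      // insert().second is true only the first time a name is seen,
      // so each external input is reported once.
      if (node_outputs.count(input) == 0 && seen.insert(input).second) {
        external.push_back(input);
      }
    }
  }
  return external;
}
```

A single-pass version that checks each input against only the outputs seen so far would misreport an internal value as an external input whenever a consumer is visited before its producer. The two-pass version costs one extra loop over the nodes.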


In reply to: 1980232432


Refers to: onnxruntime/core/providers/partitioning_utils.cc:328 in 956412b.

…ns variant.

Fix issue with QDQ node group that has no Q nodes.

TODO: Fix QnnHTPBackendTests.TopK_LargestFloats_U8_LastAxis
@HectorSVC
Contributor

  if (!Contains(node_outputs, input)) {

It happened when I was trying to fix this in my PR, which left the nodes no longer in topological order. That's why I asked. I'm concerned that someone else could do the same again without being aware that the code here requires the nodes to be in topological order.


In reply to: 1980424111


Refers to: onnxruntime/core/providers/partitioning_utils.cc:328 in 956412b.

HectorSVC previously approved these changes Mar 6, 2024
Contributor

@HectorSVC HectorSVC left a comment

:shipit:

…et node (int64_t indices output that is not quantized) as well as edge through Q node (values output)
  - The whole QDQ setup needs a rethink at some point as it's currently spread across too many places (framework, optimizer, base providers lib, EP specific providers lib)
- move NodeGroup to framework/node_unit.h and ValidateNodeGroupQDQNodes to NodeGroup::CanCreateNodeGroup so it's in the framework lib as it's used by NodeUnit
- move GetAllNodeUnits to optimizer
  - doesn't quite belong there but this works with all the current EPs that use it.
@skottmckay
Contributor Author

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@skottmckay
Contributor Author

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

@skottmckay
Contributor Author

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

Azure Pipelines successfully started running 2 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

Contributor

@edgchen1 edgchen1 left a comment

haven't finished reviewing but it looks good so far.

edgchen1 previously approved these changes Mar 9, 2024
@skottmckay skottmckay merged commit 978c40d into main Mar 12, 2024
91 of 94 checks passed
@skottmckay skottmckay deleted the skottmckay/MakePartitioningUtilsQDQAware branch March 12, 2024 00:55
HectorSVC added a commit that referenced this pull request Mar 16, 2024
### Description
Enable code in the QNN unit tests to verify the fix for the partitioning issue related to
QDQ models.
#19723