forked from microsoft/onnxruntime
-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backmerging with Msft Commits #654
Open
jatinwadhwa921
wants to merge
5
commits into
ovep-develop
Choose a base branch
from
sync_msft_11_4_25
base: ovep-develop
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The Azure DevOps pipeline template [/nuget/templates/dml-vs-2022.yml](https://github.com/microsoft/onnxruntime/blob/main/tools/ci_build/github/azure-pipelines/nuget/templates/dml-vs-2022.yml) is used to build the ONNX Runtime DirectML (DML) components. It historically contained two potential mechanisms for creating NuGet packages: 1. Invoking `python tools/ci_build/build.py` with the `--build_nuget` flag. 2. Executing a specific `NuPackScript` (usually calling `msbuild /t:CreatePackage`). This redundancy created a significant problem during release builds (when the pipeline parameter IsReleaseBuild is set to true). Here's why: - Duplicate Package Creation: Both packaging methods would execute. - build.py --build_nuget created a package with a development/pre-release version suffix (e.g., Microsoft.ML.OnnxRuntime.DirectML.1.21.1-dev-20250408-0849-84808eb710.nupkg). - The NuPackScript's msbuild call, influenced by IsReleaseBuild=true, created the clean release version package (e.g., Microsoft.ML.OnnxRuntime.DirectML.1.21.1.nupkg). - ren Command Failure: For the x86 and arm64 builds, the NuPackScript contains a command like: ```Bash ren Microsoft.ML.OnnxRuntime.DirectML.* win-dml-x86.zip ``` This command fails when two files match the pattern Microsoft.ML.OnnxRuntime.DirectML.* (the dev package and the release package), as ren requires a single source file when using wildcards for renaming. - Result: This caused build failures specifically when attempting to create release candidates or final release builds for x86 and arm64 DML components. This issue did not typically occur in regular nightly builds (IsReleaseBuild: false) because only one package (the dev version) was likely produced, allowing the ren command to succeed. Therefore we only found the problem when doing a patch release for ONNX Runtime 1.21. (@amarin16, the release manager of ONNX Runtime 1.21, found the issue and explained it to us why the pipeline was not working) The change is relatively simple. This PR removes the `--build_nuget` flag from the `python tools/ci_build/build.py` command within the dml-vs-2022.yml template. By removing the redundant packaging step from build.py, only the NuPackScript's msbuild command generates a package file. This ensures only one file matches the Microsoft.ML.OnnxRuntime.DirectML.* pattern, allowing the subsequent ren command in the x86 and arm64 scripts to execute successfully during release builds. # Background (how the DML packaging pipeline works) The build has two stages: 1. Individual Architecture Builds (Using dml-vs-2022.yml): Each stage (x64, x86, arm64) runs, now reliably using only its specific NuPackScript to generate its artifact without the risk of the ren command failing during release. x64 produces: Microsoft.ML.OnnxRuntime.DirectML.[version].nupkg x86 produces: win-dml-x86.zip arm64 produces: win-dml-arm64.zip (arm32 is not built/included). 2. Final Packaging Stage (e.g., stages/nuget_dml_packaging_stage.yml): Downloads these artifacts and combines them by unpacking the base x64 .nupkg, injecting the contents of the .zip files into the appropriate runtimes/ directories (e.g., runtimes/win-x86/native/, runtimes/win-arm64/native/), and re-packing the final, multi-architecture Microsoft.ML.OnnxRuntime.DirectML.nupkg. In stage 1 only x64 produces a nuget package, therefore specific MSBuild parameters: `/p:IsReleaseBuild=${{ parameters.IsReleaseBuild }}` is passed to all architectures' MSBuild calls, while `/p:CurrentData=$(BuildDate) /p:CurrentTime=$(BuildTime)` are passed only in the x64 script. BTW, the property "CurrentData" apparently is a typo. It should be `CurrentDate`.
…#24348) ### Description <!-- Describe your changes. --> Make test `CApiTest.RequestLoadCancellation` deterministic by removing the `terminator` thread. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> The test contributes to CI failures
### Description This change allows NPM tests to run the nodejs binding for webgpu. This helps to debug test failures much easier because WebAssembly is generally very difficult to debug. Steps to debug: 0. build - {ORT_ROOT}> build --config Debug --use_webgpu --build_nodejs - {ORT_ROOT}\js\web> npm ci - {ORT_ROOT}\js\web> npm run pull:wasm 2. run `npm test -- <args> -b=webgpu -e=node` once. ( this command generates necessary .js files and `testdata-config.json`.) 3. use native debugger to debug: ``` C:\Program Files\nodejs\node.exe {ORT_ROOT}\js\node_modules\mocha\bin\_mocha --timeout 999999 --colors -r {ORT_ROOT}\js/web/dist/ort.node.min.js {ORT_ROOT}\js/web/test/test-main ```
### Description MlasTranspose was running single-thread and was not performing well enough on a multi-threaded CPU. Therefore, I modified it to run with multi-thread to improve performance. The `MlasTranspose` was previously running in a single-threaded, which resulted in suboptimal performance on multi-threaded CPUs. To address this, I have modified it to utilize multi-threading. ### Motivation and Context We encountered this issue while running the [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large), which was converted to ONNX format and executed on a multi-core CPU (Xeon 6338). Below are the performance metrics before and after the modification: | | INTER_NUM_THREADS | INTRA_NUM_THREADS | INPUT_LENGTH | BATCH_SIZE | Duration time[sec] | | ------ | ----------------- | ----------------- | ------------ | ---------- | ------------------ | | BEFORE | 1 | 16 | 512 | 4 | 1.24 | | AFTER | 1 | 16 | 512 | 4 | 1.09 | Condition - FP32 - CPUExecutionProvider This change resulted in a performance improvement of approximately 14%. MlasTranspose stand-alone performance improvements are as follows | | INTRA_NUM_THREADS | BEFORE | AFTER | | --------------------------------- | ---- | -------------- | ------------- | | MlasTranspose [msec] | 16 | 182.55 [ms] | 11.60 [ms] | `MlasTranspose` is x15~16 faster.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backmerging with Msft Commits