[L0] Support updating kernel commands in command buffers #1353
LGTM - Ideally we could test this with the DPC++ PR in intel/llvm#12840 on a setup with this driver feature and see if the new tests pass (apart from range update) when enabled for L0. But with the CTS tests I wouldn't say this is required and we could fix up any issues in a follow-up PR.
LGTM for Level Zero
Please pull in the main branch so that testing is up to date, and also update the tag in the intel/llvm PR.
Codecov Report

Coverage diff, main vs. #1353:

| Metric | main | #1353 | +/- |
| --- | --- | --- | --- |
| Coverage | 14.82% | 12.48% | -2.34% |
| Files | 250 | 239 | -11 |
| Lines | 36220 | 36022 | -198 |
| Branches | 4094 | 4092 | -2 |
| Hits | 5369 | 4498 | -871 |
| Misses | 30800 | 31520 | +720 |
| Partials | 51 | 4 | -47 |
* Execution range update is not currently supported by the L0 driver; it supports only kernel argument updates.
* There is a synchronization issue with immediate submission when it is used for command buffers. It is reproducible without the changes in this PR, so it should be fixed separately. For now, use batched submission for the command buffer update tests.
* Remove unnecessary env variable setting
* Check that the device supports the feature
* Check sub-feature support before use in urCommandBufferUpdateKernelLaunchExp
* Remove redundant code
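The sub-feature check mentioned in the commit message above can be pictured as a bitmask test performed before any update is applied. The following is only a minimal sketch: the capability bits, the request struct, and `isUpdateSupported` are hypothetical names invented for illustration and are not the adapter's or the driver's actual identifiers.

```cpp
#include <cstdint>

// Hypothetical capability bits. The real driver and adapter define their own
// names and values; these exist only for this illustration.
enum UpdateCapSketch : uint32_t {
  CapKernelArguments = 1u << 0,
  CapExecutionRange = 1u << 1,
};

// Hypothetical summary of what a single update request wants to change.
struct UpdateRequestSketch {
  bool updatesArguments;
  bool updatesExecutionRange;
};

// Returns true only if every sub-feature the request relies on is reported
// as supported in DeviceCaps.
inline bool isUpdateSupported(uint32_t DeviceCaps,
                              const UpdateRequestSketch &Req) {
  if (Req.updatesArguments && !(DeviceCaps & CapKernelArguments))
    return false;
  if (Req.updatesExecutionRange && !(DeviceCaps & CapExecutionRange))
    return false;
  return true;
}
```

Under these assumptions, a device that reports only `CapKernelArguments` would accept an argument-only update and reject one that also changes the execution range, which matches the driver limitation described in this PR.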
Done.
Implement the API for updating the kernel commands in a command-buffer defined by oneapi-src#1089 for the OpenCL adapter.

However, the following changes to the UR kernel update API have been made based on implementation experience:

1. Forbid updating the work-dim of the kernel, see KhronosGroup/OpenCL-Docs#1057
2. Remove struct fields to update exec info, after [DPC++ implementation prototype](intel/llvm#12840) shows this isn't needed.
3. Forbid changing the local work size from user to impl defined and vice-versa. See discussion in [L0 implementation PR](oneapi-src#1353 (comment)).

This adapter implementation depends on support for the [cl_khr_command_buffer_mutable_dispatch](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch) extension. Tested on Intel GPU/CPUs OpenCL implementations with the [command-buffer emulation layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu).

```bash
$ OPENCL_LAYERS=<path/to/SimpleOpenCLSamples/build/layers/10_cmdbufemu/libCmdBufEmu.so> ./bin/test-exp_command_buffer --platform="Intel(R) OpenCL Graphics"
```

DPC++ PR intel/llvm#12724
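Restrictions 1 and 3 in the commit message above amount to comparing an update against what was originally recorded. The following is a minimal sketch of that idea, not the UR API: the struct names, fields, and `validateUpdate` are hypothetical, and the sketch assumes an update carries a complete new launch configuration in which a null local size means "implementation defined".

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record of how a kernel command was originally enqueued.
struct RecordedKernelSketch {
  uint32_t workDim;
  bool userLocalSize; // true if the user supplied a local work size
};

// Hypothetical update request. Assumption: nullptr for localWorkSize means
// "let the implementation choose the local work size".
struct KernelUpdateSketch {
  uint32_t workDim;
  const size_t *localWorkSize;
};

enum class UpdateCheck { Ok, WorkDimChanged, LocalSizeModeChanged };

inline UpdateCheck validateUpdate(const RecordedKernelSketch &Recorded,
                                  const KernelUpdateSketch &Update) {
  // Restriction 1: the work dimension must stay as recorded.
  if (Update.workDim != Recorded.workDim)
    return UpdateCheck::WorkDimChanged;
  // Restriction 3: no switching between user-defined and
  // implementation-defined local work size, in either direction.
  const bool UpdateHasUserLocal = Update.localWorkSize != nullptr;
  if (UpdateHasUserLocal != Recorded.userLocalSize)
    return UpdateCheck::LocalSizeModeChanged;
  return UpdateCheck::Ok;
}
```

Returning a distinct value for each violation keeps the two rules separately reportable; a real adapter would presumably map them to its own error codes.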
Fixes Coverity defect report from L0 command-buffer update code merged in oneapi-src#1353:

```
This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "CommandDesc->newWorkDim >= 0U".
```
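To illustrate why the flagged comparison is redundant, here is a small sketch. It assumes the original intent was a range check on the work dimension; the struct, the upper bound of 3, and the function name are hypothetical and not taken from the adapter source.

```cpp
#include <cstdint>

// Hypothetical stand-in for the update descriptor; only the field named in
// the Coverity report is modelled.
struct CommandDescSketch {
  uint32_t newWorkDim;
};

// The flagged pattern would read:
//   Desc.newWorkDim >= 0U && Desc.newWorkDim <= 3U
// An unsigned value can never be negative, so the first comparison is always
// true and only the upper bound does any work. The equivalent check is:
inline bool workDimInRange(const CommandDescSketch &Desc) {
  return Desc.newWorkDim <= 3U; // upper bound of 3 is an assumption
}
```

Dropping the lower-bound comparison does not change behavior; it only removes the tautology that the analyzer reports.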
Initial support for updating command buffers.
Temporary fixes were also applied for the conformance tests:

* Execution range update is not currently supported by the L0 driver, so those tests are skipped for the L0 backend.
* There is a synchronization issue with immediate submission when it is used for command buffers. It is reproducible without the changes in this PR, so it should be fixed separately. For now, the command buffer update tests use batched submission.
intel/llvm PR: intel/llvm#12897