[Doc] Add Mar'24 Release Notes #13879
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,118 @@ | ||
# Mar'24 release notes | ||
Release notes for commit range [f4ed132f243a](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2)..[d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) | ||
|
||
## New Features | ||
### SYCL Compiler | ||
|
||
- Added more CPU targets to the `-march` option in the OpenCL AOT compiler. [7911773c] | ||
- Added support for additional AMD GPU targets. [c1ce15944] | ||
- Added C++11 ABI=0 support. [459e122a] | ||
|
||
- Created an additional version-agnostic copy of the SYCL import library during compiler build. [2d2e418c] | ||
|
||
- Supported detecting out-of-bound errors on CPU device, static local memory, and device globals via AddressSanitizer. [f331ba2063] [a14cfdd7999] | ||
- Provided a preprocessor macro to locate the CUPTI library when XPTI tracing is enabled during compiler build. [e15ebd08] [acf89a6c90] | ||
|
||
### SYCL Library | ||
|
||
- Implemented [ext_oneapi_kernel_compiler](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) SYCL extension. [096676e8] [e5826540] [67086100] | ||
|
||
- Implemented [ext_intel_fp_control](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_intel_fp_control.asciidoc) SYCL extension. [bf8ea96f] | ||
- Implemented [ext_oneapi_kernel_compiler_opencl](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler_opencl.asciidoc) SYCL extension. [6344ead19] | ||
- Enabled kernel fusion with heterogeneous ND ranges for HIP targets. [e44888873] | ||
- Enabled [ext_oneapi_graph](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc) SYCL extension for OpenCL and HIP backend. [5d7524543] [897b27076] | ||
- Supported graph partitioning for host task dependencies in [ext_oneapi_graph](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc) SYCL extension. [d53f123a] | ||
- Added ESIMD APIs for stochastic rounding, property-based gather, masked gather, and reading the timestamp counter. [aa4e87801] [3eca2d473] [1261e0518] | ||
- Added out-of-bounds `load`, `store`, `fill`, and overloads accepting annotated pointers in [ext_oneapi_matrix](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc) SYCL extension. [4c17a7f39] [f3137e99] | ||
- Supported `queue::mem_advise` on the HIP backend. [a669374b7] [ab86d0db] | ||
|
||
- Supported `fill` and `memset` nodes in [ext_oneapi_graph](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc) SYCL extension. [8ea022954] | ||
- Implemented [ext_oneapi_in_order_queue_events](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_in_order_queue_events.asciidoc) SYCL extension. [19072756e] | ||
- Implemented [ext_oneapi_address_cast](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_address_cast.asciidoc) SYCL extension. [123705190] | ||
- Implemented [ext_oneapi_kernel_compiler_spirv](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler_spirv.asciidoc) SYCL extension. [36e123d3e1] | ||
- Implemented [ext_oneapi_composite_device](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_composite_device.asciidoc) SYCL extension. [2db1a4f6a5] | ||
- Implemented joint matrix query from [ext_oneapi_matrix](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc) SYCL extension on CUDA and HIP backends. [00eebe1e4] | ||
- Added support for unsampled image arrays in [ext_oneapi_bindless_images](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) SYCL extension. [76ec3f0f7] | ||
- Added `__imf_rcp64h` - equivalent to CUDA's `__nv_rcp64h` - and `sqrt` function with selectable rounding modes to Intel math libdevice. [ce70cb521] [6c1dde4243b5] | ||
- Integrated the oneAPI Construction Kit's vectorizer into the Native CPU backend. [330ac57d6] | ||
- Added ability to compare device architecture and support for PVC-VG to [ext_oneapi_device_architecture](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) SYCL extension. [68445467] [ac0e142e12] | ||
- Added helper functions in SYCLCompat library for occupancy calculation in Intel GPUs. [b209b321] | ||
- Added support for SYCL barriers on Native CPU. [3c39d132a] | ||
- Added support for `bfloat16` to `sycl::vec`. [bbbe8839] | ||
|
||
### Documentation | ||
- Proposed [ext_intel_fp_control](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_intel_fp_control.asciidoc) extension to allow specifying the rounding and denorm mode for floating-point operations in SYCL kernels. [bf8ea96f4] | ||
- Proposed [ext_oneapi_raw_kernel_arg](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_raw_kernel_arg.asciidoc) SYCL extension to allow opaque types to be passed to SYCL kernels. [4168793978] | ||
- Proposed [ext_oneapi_composite_device](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_composite_device.asciidoc) SYCL extension to allow card-level device access on PVC GPUs. [9a1b9084] | ||
- Proposed [ext_oneapi_in_order_queue_events](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_in_order_queue_events.asciidoc) SYCL extension to allow getting the event from the last submitted command and setting an external event as an implicit dependence on the next command submitted to the queue. [19072756e] | ||
- Proposed [ext_oneapi_profiling_tag](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_profiling_tag.asciidoc) SYCL extension to time commands submitted to the queue. [b4ade420] | ||
- Proposed [ext_oneapi_private_alloca](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc) SYCL extension to have specialization constant-length private memory allocations. [aaf7a58863] | ||
|
||
|
||
## Improvements | ||
### SYCL Compiler | ||
- Enabled default selection of general register file (GRF) size on Linux for PVC GPUs. [8083f8a8] | ||
- Made the SYCL compiler pass the `-cl-fp32-correctly-rounded-divide-sqrt` option to the device compiler when `-ffp-model=precise` is used. [bfbf8ab8698] | ||
|
||
|
||
### SYCL Library | ||
- Improved error messages for invalid properties specified on non-pointer types. [728b132a5] | ||
- Adopted a unified and scalable way to pass alignment and cache flags to all ESIMD functions. [a2208484ab] [960d898c] [5ef8df837d] [a57a96c77] [19cd6144a] [646ab086e5] [0bf2e666c] | ||
- Added default constructor to bindless sampler and image handler in [ext_oneapi_bindless_images](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) SYCL extension. [d65f3aa560] [7bfdcfd4cabf] | ||
- Added `SYCL_CACHE_IN_MEM` environment variable to disable in-memory caching of programs and facilitated automatic program cache cleaning when running out of memory. [9322d14ce] [6cf1ae081ac] | ||
- Changed the return type of `abs_diff` to be the same as that of the input. [2a3e1ab82] | ||
|
||
- Improved templated and convertible builtins after clarification in SYCL 2020 revision 8. [92861835] | ||
- Allowed generic_space `multi_ptr` in math builtins. [eda8a587f1] | ||
- Improved error message when writing beyond the bounds of `simd_view` object. [197c33a2b] | ||
- Optimized `ext_oneapi_submit_barrier` from [ext_oneapi_enqueue_barrier](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_enqueue_barrier.asciidoc) into `NOP` for in-order queues with empty waitlist. [7e08c15dd] | ||
- Supported prefetch, memory advise, and automatic management of dependencies for multiple command-buffer submissions in [ext_oneapi_graph](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc) SYCL extension. [c6fbac59] [56f8d38c] | ||
- Added support for profiling command buffers. [b04f894dbd06b] | ||
- Implemented ESIMD APIs that accept compile-time properties. [655ab100] [5582ce4db] [d286f4ab1c] [961793913] [0cfe7e35] [656b8be7] | ||
- Removed deprecated esimd_emulators from device filters and deprecated `SYCL_DEVICE_FILTER` in favor of `ONEAPI_DEVICE_SELECTOR`. [9d0888ca3] [8d0fa9875] | ||
|
||
- Improved error message when trying to fuse kernels with incompatible ND-Ranges in [ext_codeplay_kernel_fusion](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_codeplay_kernel_fusion.asciidoc). [7d492f87ec97] | ||
- Made user functions always inline in SYCL kernels to reduce overhead in the SYCLCompat library. [e121c8811] | ||
- Made the runtime choose the device image with inlined specialization constants when the `-fsycl-add-default-spec-consts-image` option is used. [73d34739b] | ||
- Made `nd_item` stateless to reduce initialization overhead. [7999e27b] | ||
- Made backend return a sorted list of platforms when `platform::get_platforms()` is called. [feb7722076] | ||
|
||
- Improved warning messages and added `-ignore-device-selector` flag to `sycl-ls` to ignore device selection environment variables. [6e3aa218] | ||
- Improved error handling when calling `matrix_combinations` query on platforms unsupported by [ext_oneapi_device_architecture](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) SYCL extension. [c00305b73] | ||
- Made default `sycl::queue` context reusable on Windows. [491e6e4ea] | ||
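To illustrate the `abs_diff` entry in the list above: the result now has the same type as the arguments. The sketch below models only that contract in standard C++. `abs_diff_sketch` is a hypothetical stand-in, not the `sycl::abs_diff` implementation, and it assumes the true difference is representable in the argument type.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for sycl::abs_diff, written in standard C++ only.
// It is not the library implementation; it models "return type == input type".
template <typename T>
constexpr T abs_diff_sketch(T x, T y) {
  static_assert(std::is_integral_v<T> && !std::is_same_v<T, bool>,
                "sketch covers non-bool integer types only");
  using U = std::make_unsigned_t<T>;
  // Subtract in the unsigned domain so the intermediate step cannot overflow.
  const U ux = static_cast<U>(x);
  const U uy = static_cast<U>(y);
  const U diff = (x > y) ? static_cast<U>(ux - uy) : static_cast<U>(uy - ux);
  // Assumption: the true difference fits in T.
  return static_cast<T>(diff);
}

// The result type follows the argument type, signed or unsigned.
static_assert(std::is_same_v<
              decltype(abs_diff_sketch(std::int8_t{-3}, std::int8_t{5})),
              std::int8_t>);
static_assert(abs_diff_sketch(std::int8_t{-3}, std::int8_t{5}) == 8);
static_assert(abs_diff_sketch(7u, 10u) == 3u);
```

Code that stored the result in a variable of a different, explicitly spelled type may want to switch to `auto` or to the argument type after this change.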
|
||
|
||
### Documentation | ||
- Updated [ext_oneapi_kernel_compiler_opencl](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler_opencl.asciidoc) SYCL extension to allow querying OpenCL version. [6344ead19e] | ||
- Updated [ext_intel_data_flow_pipes_properties](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_intel_data_flow_pipes_properties.asciidoc) to include AXI streaming as a protocol choice on FPGAs. [2a0911892] | ||
- Updated [KernelFusionJIT](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/KernelFusionJIT.md) to include details on local/private memory allocation size, different promotion hints, etc. [b9854a12] | ||
- Updated [ext_oneapi_in_order_queue_events](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_in_order_queue_events.asciidoc) to make external events wait when queue is waited on. [b0f584c675f9] | ||
- Improved [ext_oneapi_address_cast](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_address_cast.asciidoc) SYCL extension to allow casting raw pointers to multi_ptr. [84a92e03] | ||
|
||
## Bug Fixes | ||
### SYCL Compiler | ||
- Fixed compiler build error due to unused private field when built without assertions. [ff61613e2] | ||
|
||
- Made the device binary generated by `-fsycl-link=image` linkable by adding more information into the binary. [219d4ef54] | ||
|
||
### SYCL Library | ||
- Fixed computation of submit time based on host timestamps. [254756369c] | ||
- Fixed SYCL CTS failures for Unified Runtime's OpenCL adapter. [4c0780e76] | ||
I will tag @aarongreig here, but it seems to me that all those fixes were a part of Unified Runtime, and not a part of SYCL RT, which these release notes are for. Considering that Unified Runtime is an external dependency, I wonder if we need to include a separate sub-section here to specify which version/hash/tag of Unified Runtime should be used together with that release. Users will then be able to go and look for release notes of that version in the UR repo, or browse the history of changes. Tagging @againull, @steffenlarsen and @nrspruit here for their feedback.
I am in favor of having a sub-section to specify the UR tag. |
||
- Fixed strict aliasing violations in `sycl::vec` routines. [a9d0e1b8] | ||
- Fixed logical operations and integer conversions among `sycl::vec` types. [3d5e41fddf] [ff48612f] [7868596d] | ||
- Fixed compound operators on `annotated_ptr` when the user-defined type only defines a compound operator. [c43a90f2] | ||
- Fixed exponential slowdown in multiple calls to `queue::ext_oneapi_submit_barrier`. [079fc97b] | ||
- Fixed input handling for `ONEAPI_DEVICE_SELECTOR` environment variable. [90b6aee46] | ||
- Fixed in-order dependency filtering for isolated kernels. [8e7995df] | ||
- Fixed double-free bug in kernel-program cache. [04ff5b81] | ||
- Fixed resource leak in `SYCL_FALLBACK_ASSERT`. [b478d2fa] | ||
- Fixed deadlock in in-order queue when submitting a host task and simultaneously accessing stream service events. [3031733] | ||
- Made `sycl::vec` interface consistent with `sycl::marray` and `sycl::buffer` by defining `value_type` alias. [33e5b10] | ||
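The `value_type` entry in the list above matters mostly for generic code. The sketch below uses a hypothetical `vec_like` type as a stand-in for `sycl::vec` (it is not the SYCL class) to show how a `value_type` alias lets one helper work across container-like types in standard C++. The helper name `sum_elements` is also illustrative.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical, minimal stand-in for sycl::vec<T, N>; only the alias matters here.
template <typename T, std::size_t N>
struct vec_like {
  using value_type = T;  // the kind of alias the fix adds to sycl::vec
  T data[N];
  constexpr const T& operator[](std::size_t i) const { return data[i]; }
  static constexpr std::size_t size() { return N; }
};

// Generic helper: names the element type through value_type only, so it works
// for any type that exposes the alias, operator[] and size().
template <typename Container>
constexpr typename Container::value_type sum_elements(const Container& c) {
  typename Container::value_type total{};
  for (std::size_t i = 0; i < c.size(); ++i) total += c[i];
  return total;
}

static_assert(std::is_same_v<vec_like<float, 4>::value_type, float>);
static_assert(sum_elements(vec_like<int, 3>{{1, 2, 3}}) == 6);
static_assert(sum_elements(std::array<int, 2>{{4, 5}}) == 9);
```

Without the alias, a helper like this would need a separate overload or a trait specialization just to recover the element type.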
|
||
### Documentation | ||
- Clarified [ext_oneapi_graph](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc) SYCL extension to make it illegal for graph nodes to depend on events from outside the graph. [2581123a1] | ||
|
||
## Known Issues | ||
- On Windows, the Unified Runtime's Level Zero leak check does not work correctly with | ||
the default contexts. This is because the release | ||
of the plugin DLLs races against the release of static global variables | ||
(like the default context). | ||
- Intel Graphics Compiler's Vector Compute backend does not support O0 code: such code is often miscompiled, produces wrong answers, or crashes. This issue directly affects ESIMD code at O0. As a temporary workaround, ESIMD code is optimized even in O0 mode. [00749b1e8](https://github.com/intel/llvm/commit/00749b1e8e3085acfdc63108f073a255842533e2) | ||
- `multi_ptr` relational operators assume that `nullptr` has the lowest possible value, which might cause issues with the CUDA and AMDGPU backends. | ||
|
||
|
||
|
||
# Nov'23 release notes | ||
Release notes for commit range f4e0d3177338..f4ed132f243a | ||
|
||
|