|
| 1 | +# November'20 release notes |
| 2 | + |
| 3 | +Release notes for commit range c9d50752..5d7e0925 |
| 4 | + |
| 5 | +## New features |
| 6 | + - Implemented support for new loop attribute (intel::nofusion) for FPGA |
| 7 | + [68ab67ad] |
| 8 | + - Implemented support for new FPGA function attribute stall_enable [8fbf4bbe] |
| 9 | + - Implemented accessor-based gather/scatter and scalar mem access for ESIMD |
| 10 | + feature [0aac708a] |
| 11 | + - Implemented support for dot_product API [6cc97d2a] |
| 12 | + - Implemented ONEAPI::filter_selector that accepts one or more filters for |
| 13 | + device selection [174fd168] |
| 14 | + - Introduced SYCL_DEVICE_FILTER environment variable that allows filtering |
| 15 | + available devices [14e227c4], [ccdf8475] |
| 16 | + - Implemented accessor_properties extension [f7d073d1] |
| 17 | + - Implemented SYCL_INTEL_device_specific_kernel_queries [24ae95b3] |
| 18 | + - Implemented support for group algorithms on CUDA backend [909459ba] |
| 19 | + - Implemented support for sub_group extension in CUDA backend [baed6a5b], |
| 20 | + [551d7067], [f189e413] |
| 21 | + - Implemented support for USM extension in CUDA plugin [da8929e0] |
| 22 | + - Implemented support for cl_intel_create_buffer_with_properties extension [b8a7b012] |
| 23 | + - Implemented support for sycl::info::device::host_unified_memory [08066b24] |
| 24 | + - Added clang support for FPGA kernel attribute scheduler_target_fmax_mhz |
| 25 | + [20013e23] |
| 26 | + |
| 27 | +## Improvements |
| 28 | +### SYCL Compiler |
| 29 | + - Enabled USM address space by default for the FPGA hardware [7896819d] |
| 30 | + - Added a warning emitted when the size of kernel arguments exceeds 2kB for all |
| 31 | + devices [e00ab746], [4960fc90] |
| 32 | + - Changed default SYCL standard version to 2020 [67acf814] |
| 33 | + - Added diagnostics when the translator encounters an unknown or unsupported |
| 34 | + LLVM instruction [5a28d4e5] |
| 35 | + - Added a diagnostic for attempts to pass a pointer to a variable length array as |
| 36 | + a kernel argument [538c4c9c] |
| 37 | + - Improved FPGA AOT help output with -fsycl-help [dc8a0593] |
| 38 | + - Made /MD the default compiler option on Windows and made the driver |
| 39 | + emit an error if /MT is passed to it. The SYCL library is designed in such a way |
| 40 | + that STL objects must cross the sycl.dll boundary, which is guaranteed to |
| 41 | + work safely on Windows only if the application using sycl.dll and sycl.dll |
| 42 | + itself use the same dynamic runtime [d31184e1], [8735bb81], [0092d4da] |
| 43 | + - Enforced C++ for C source files when compiling in SYCL mode [adc2ac72] |
| 44 | + - Added use of template parameter in [[intelfpga::num_simd_work_items()]] |
| 45 | + attribute [678911a8] |
| 46 | + - Added new spellings for SYCL FPGA attributes [5949228d], [b1cf776e9] |
| 47 | + - Marked all definitions used for compiler needs with an underscore prefix |
| 48 | + [51d3c205] |
| 49 | + - Disabled diagnostic about use of functions with raw pointer in kernels |
| 50 | + [b4a3f03f] |
| 51 | + - Improved diagnostics for invalid SYCL kernel names [cb5ddb49], [89fd4284] |
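The 2kB kernel-argument warning listed above is easiest to picture with a by-value capture. The following is a rough host-side sketch in plain C++, not the compiler's actual check: it assumes that values captured by a kernel lambda end up as kernel arguments, and it uses `sizeof` on the closure as an approximation of their total size. The names `kKernelArgLimitBytes`, `exceeds_kernel_arg_limit`, `large_capture_exceeds_limit` and `small_capture_exceeds_limit` are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical constant: the 2kB figure quoted in the release note.
constexpr std::size_t kKernelArgLimitBytes = 2048;

// Approximates the kernel-argument size by the size of the closure object.
// Assumption: captured values are what gets passed to the device as arguments;
// the compiler's real accounting may differ.
template <typename Callable>
constexpr bool exceeds_kernel_arg_limit(const Callable &) {
  return sizeof(Callable) > kKernelArgLimitBytes;
}

// Capturing a 1024-element int array by value puts at least 4096 bytes
// into the closure, which is above the 2kB limit.
inline bool large_capture_exceeds_limit() {
  std::array<int, 1024> table{};
  auto kernel_like = [table](std::size_t i) { return table[i % table.size()]; };
  return exceeds_kernel_arg_limit(kernel_like);
}

// Capturing a single int stays far below the limit.
inline bool small_capture_exceeds_limit() {
  int scale = 3;
  auto kernel_like = [scale](std::size_t i) { return static_cast<int>(i) * scale; };
  return exceeds_kernel_arg_limit(kernel_like);
}
```

Under these assumptions, moving large data into a buffer or USM allocation and capturing only a handle to it is the usual way to keep the argument size small.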
| 52 | + |
| 53 | +### SYCL Library |
| 54 | + - Removed deprecated spelling ([[cl::intel_reqd_sub_group_size()]]) of |
| 55 | + IntelReqdSubGroupSize attribute [9dda36fe] |
| 56 | + - Added support for USM shared memory allocator for Level Zero backend |
| 57 | + [db5037ca] |
| 58 | + - Added support for a context with multiple devices in Level Zero backend |
| 59 | + [129ee442] |
| 60 | + - Added diagnostics for deprecated environment variables: SYCL_BE and |
| 61 | + SYCL_DEVICE_TYPE [6242160b] |
| 62 | + - Made spec_constant default constructor public and available on host |
| 63 | + [53d909e2] |
| 64 | + - Added constraints instead of static asserts and is_native_function_object() |
| 65 | + for group algorithms [97bec247] |
| 66 | + - Added support for L0 loader validation layer [4c6cda3f] |
| 67 | + - Added multi-device and multi-platform support for SYCL_DEVICE_ALLOWLIST |
| 68 | + [dbf31c3c] |
| 69 | + - Removed two-input sub-group shuffles [ef969c14] |
| 70 | + - Enabled inspecting values wrapped into private_memory<T> by evaluating |
| 71 | + `operator()` from GDB [31c23ddc] |
| 72 | + - Changed buffer allocation in the Level Zero plugin to use host shared memory for integrated GPUs [2ae1bc9e] |
| 73 | + - Implemented `queue::parallel_for()` accepting reduction [ffdadc2e] |
| 74 | + - Improved performance of float atomic_ref [0b7dacf1] |
| 75 | + - Made CUDA backend try to find a better block size using |
| 76 | + cuOccupancyMaxPotentialBlockSize function from the CUDA driver [4fabfd16a] |
| 77 | + - Supported GroupBroadcast with 32-bit id to cover broadcast algorithm with |
| 78 | + the sub_group class [6e3f2440] |
| 79 | + |
| 80 | +### Documentation |
| 81 | + - Added specification for SPV_INTEL_variable_length_array extension [9e4c51c4] |
| 82 | + - Added specification for accessor_properties and buffer_location extensions |
| 83 | + [f90614c5] |
| 84 | + - Moved specification for Unified Shared Memory to Khronos specification |
| 85 | + [a7ffe039] |
| 86 | + - Added documentation for filter_selector [c3f5cfba] |
| 87 | + - Updated C-CXX-StandardLibrary extension specification [0b6f8cd8] |
| 88 | + - Added ESIMD extension introduction document [c36a1411] |
| 89 | + - Added specialization constants extension introduction document [d88ef3b6] |
| 90 | + - Added specialization constant feature design doc [15cac431] |
| 91 | + - Documented kernel-program caching mechanism [5947cde81] |
| 92 | + - Added the SYCL_INTEL_mem_channel_property extension specification [5cf8088c] |
| 93 | + - Provided detailed description for guaranteed sub-group sizes [542c32ae] |
| 94 | + - Documented test-related processes [ff90e06d] |
| 95 | + - Added code examples for all SYCL FPGA loop attributes [6b958205] |
| 96 | + |
| 97 | +## Bug fixes |
| 98 | +### SYCL Compiler |
| 99 | + - Fixed compiler crash on invalid kernel type argument [0c220ca5e] |
| 100 | + - Made clang search for the full executable name as opposed to just the case |
| 101 | + name for the AOT tools (aoc, ocloc, opencl-aot) to avoid directory calls |
| 102 | + [78a86da3], [244e874b] |
| 103 | + - Linked SYCL device libraries by default as not all backends support SPIRV |
| 104 | + online linkage [9dd18ca8] |
| 105 | + - Fixed assertion when /P option is used on Windows [a21d7ef4] |
| 106 | + - Fixed crash when array of pointers is passed to kernel [1fc0e4f84] |
| 107 | + - Fixed issues with use of type from std namespace in kernel type names |
| 108 | + [dd7fec83] |
| 109 | + - Fixed missing debug information for work-item built-in translation [9c06d429] |
| 110 | + - Enabled emission of warnings that had been suppressed for SYCL headers [e6eed1a7] |
| 111 | + - Fixed optimization disabling option for gen to use -cl-opt-disable |
| 112 | + [ba4e567fe] |
| 113 | + - Emulated "funnel shift left" which was not supported in the OpenCL |
| 114 | + ExtInst set on SPIRV translator side [97d7eec5] |
| 115 | + - Fixed build issue when TBB and libstdc++ 10.X were used caused by including |
| 116 | + std C++ headers in integration header file [63369132] |
| 117 | + - Fixed processing for partial link step with static archives by passing linker |
| 118 | + specific arguments there [3ab8cc82] |
| 119 | + - Enabled `-reuse-exe` support for Windows [43f2d4ba] |
| 120 | + - Fixed missing dependency file when compiling for `-fintelfpga` and using a named |
| 121 | + dependency output file [df5f1ab67] |
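The funnel shift left mentioned in the list above can be expressed with ordinary shifts. The sketch below shows the operation's semantics as defined for LLVM's `llvm.fshl` intrinsic on 32-bit values; it is not the translator's actual emulation code, and the function name `fshl32` is hypothetical.

```cpp
#include <cstdint>

// Funnel shift left on 32-bit operands: conceptually concatenate hi:lo into a
// 64-bit value, shift it left by (shift mod 32), and keep the upper 32 bits.
constexpr std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo,
                               std::uint32_t shift) {
  const std::uint32_t s = shift % 32u;
  // Shifting a 32-bit value by 32 is undefined behavior in C++, so the
  // s == 0 case is handled separately instead of computing lo >> 32.
  return s == 0 ? hi : (hi << s) | (lo >> (32u - s));
}

// Compile-time check: the top byte of lo is shifted into the low byte of hi.
static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) == 0x3456789Au,
              "funnel shift left by 8");
```

Passing the same value for both operands, as in `fshl32(x, x, n)`, gives a rotate left by `n`.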
| 122 | + |
| 123 | +### SYCL Library |
| 124 | + - Fixed build log preserving for L0 plugin [638b71b1] |
| 125 | + - Added missing math APIs for devicelib [e438bc814] |
| 126 | + - Enabled async_work_group_copy for scalar and vector bool types [bb78d2cb] |
| 127 | + - Aligned image class constructors with the SYCL specification [049ae996] |
| 128 | + - Removed half type alias causing name conflicts with CUDA headers [c00c1fa3] |
| 129 | + - Fixed explicit copy operation for host device [f20fd4de] |
| 130 | + - Made stream flush operation non-blocking [e7492fb2] |
| 131 | + - Fixed image arguments order when passing to PI routines [70d6f87b] |
| 132 | + - Fixed circular dependency between the device_impl and the platform_impl |
| 133 | + causing handler leak [255f304f] |
| 134 | + - Fixed work-group size selection in reductions [2ae49f57e] |
| 135 | + - Fixed compilation errors when built with --std=c++20 [ecd0adbb] |
| 136 | + - Fixed treating internal allocations of host memory as read only for memory objects created with const pointer, causing double free issue [8b5506255] |
| 137 | + - Fixed a problem in the Level Zero plugin where kernels and programs were |
| 138 | + destroyed while they could still be in use [b9bf9f5f] |
| 139 | + - Fixed wrong exception raised by ALLOWLIST mechanism [d81081f7] |
| 140 | + - Fixed reporting supported device partitioning in Level Zero [766367be] |
| 141 | + - Aligned get_info<info::device::version>() with the SYCL spec [4644e639] |
| 142 | + - Set default work group size to {1, 1, 1} to fix out-of-memory crashes on |
| 143 | + some configurations [4d76de43] |
| 144 | + |
| 145 | +### Documentation |
| 146 | + - Fixed path to FPGA device selector [ca33f7f7] |
| 147 | + - Renamed LEVEL0 environment variable to LEVEL_ZERO in documents and code |
| 148 | + comments following source code change [2c3908b4] |
| 149 | + - Clarified --system-ocl key in GetStartedGuide.md [e31b94e5] |
| 150 | + |
| 151 | +## API/ABI breakages |
| 152 | + - Implemented accessor_properties extension for accessor class [f7d073d1] |
| 153 | + |
| 154 | +## Known issues |
| 155 | + - GlobalWorkOffset is not supported by Level Zero backend [6f9e9a76] |
| 156 | + - Code that uses function pointers hangs on Level Zero [d384295e] |
| 157 | + - If an application uses std::* math functions in kernel code, the |
| 158 | + -fsycl-device-lib=libm-fp64 option should be passed to the compiler. |
| 159 | + - User-defined functions with the same name and signature (exact match of |
| 160 | + arguments, return type doesn't matter) as an OpenCL C built-in |
| 161 | + function can lead to undefined behavior. |
| 162 | + - A DPC++ system that has FPGAs installed does not support multi-process |
| 163 | + execution. Creating a context opens the device associated with the context |
| 164 | + and places a lock on it for that process. No other process may use that |
| 165 | + device. Some queries about the device through device.get_info<>() also |
| 166 | + open up the device and lock it to that process since the runtime needs |
| 167 | + to query the actual device to obtain that information. |
| 168 | + - On Windows, the DPC++ compiler enforces the use of the dynamic C++ runtime for |
| 169 | + applications linked with the SYCL library by: |
| 170 | + - linking with msvcrt[d].dll when -fsycl switch is used; |
| 171 | + - emitting an error on attempts to compile a program with static C++ RT |
| 172 | + using -fsycl and /MT or /MTd. |
| 173 | + This protects against complicated runtime errors caused by C++ objects |
| 174 | + crossing the sycl[d].dll boundary, which are not always handled properly when |
| 175 | + the application and sycl[d].dll use different versions of the C++ RT. |
| 176 | + - The format of the object files produced by the compiler can change between |
| 177 | + versions. The workaround is to rebuild the application. |
| 178 | + - The SYCL library doesn't guarantee stable API/ABI, so applications compiled |
| 179 | + with an older version of the SYCL library may not work with a newer one. |
| 180 | + The workaround is to rebuild the application. |
| 181 | + [ABI policy guide](doc/ABIPolicyGuide.md) |
| 182 | + - Using `cl::sycl::program` API to refer to a kernel defined in another |
| 183 | + translation unit leads to undefined behavior |
| 184 | + - Linkage errors with the following message: |
| 185 | + `error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined` |
| 186 | + can happen when a SYCL application is built using MS Visual Studio 2019 |
| 187 | + version below 16.3.0 and user specifies `-std=c++14` or `/std:c++14`. |
| 188 | + - Employing read sampler for image accessor may result in sporadic issues with |
| 189 | + Level Zero plugin/backend [2c50c03] |
| 190 | + - Printing internal defines isn't supported on Windows [50628db] |
| 191 | + - Group algorithms for MUL/AND/OR/XOR cannot be enabled for group scope due to |
| 192 | + SPIR-V limitations, and are not enabled for sub-group scope yet as the |
| 193 | + SPIR-V version isn't automatically raised from 1.1 to 1.3 [96da39e] |
| 194 | + - We cannot run Dead Argument Elimination for ESIMD since the pointers to SPIR |
| 195 | + kernel functions are saved in `!genx.kernels` metadata [cf10351] |
| 196 | + |
1 | 197 | # September'20 release notes
|
2 | 198 |
|
3 | 199 | Release notes for commit range 5976ff0..1fc0e4f
|
|