Commit 54f31ad

author
iclsrc
committed
Merge from 'sycl' to 'sycl-web'
2 parents 96644b7 + 8ef8798 commit 54f31ad

2 files changed

+198
-1
lines changed

buildbot/dependency.conf

Lines changed: 2 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -12,7 +12,8 @@ ocl_gpu_rt_ver_win=27.20.100.8935
1212
intel_sycl_ver=build
1313
# https://github.com/oneapi-src/oneTBB/releases/download/v2021.1-beta10/oneapi-tbb-2021.1-beta10-lin.tgz
1414
tbb_ver=2021.1.053
15-
# https://github.com/oneapi-src/oneTBB/releases/download/v2021.1-beta10/oneapi-tbb-2021.1-beta10-win.zip
15+
# Binaries can be built from sources following instructions under:
16+
# https://github.com/oneapi-src/oneTBB/blob/master/cmake/README.md
1617
tbb_ver_win=2021.1.049
1718
# https://github.com/intel/llvm/releases/download/2020-WW45/fpgaemu-2020.11.11.0.04_rel.tar.gz
1819
ocl_fpga_emu_ver=2020.11.11.0.04

sycl/ReleaseNotes.md

Lines changed: 196 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,199 @@
1+
# November'20 release notes
2+
3+
Release notes for commit range c9d50752..5d7e0925
4+
5+
## New features
6+
- Implemented support for new loop attribute (intel::nofusion) for FPGA
7+
[68ab67ad]
8+
- Implemented support for new FPGA function attribute stall_enable [8fbf4bbe]
9+
- Implemented accessor-based gather/scatter and scalar mem access for ESIMD
10+
feature [0aac708a]
11+
- Implemented support for dot_product API [6cc97d2a]
12+
- Implemented ONEAPI::filter_selector that accepts one or more filters for
13+
device selection [174fd168]
14+
- Introduced SYCL_DEVICE_FILTER environment variable for filtering
15+
available devices [14e227c4], [ccdf8475]
16+
- Implemented accessor_properties extension [f7d073d1]
17+
- Implemented SYCL_INTEL_device_specific_kernel_queries [24ae95b3]
18+
- Implemented support for group algorithms on CUDA backend [909459ba]
19+
- Implemented support for sub_group extension in CUDA backend [baed6a5b],
20+
[551d7067], [f189e413]
21+
- Implemented support for USM extension in CUDA plugin [da8929e0]
22+
- Implemented support for cl_intel_create_buffer_with_properties extension [b8a7b012]
23+
- Implemented support for sycl::info::device::host_unified_memory [08066b24]
24+
- Added clang support for FPGA kernel attribute scheduler_target_fmax_mhz
25+
[20013e23]
26+
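The SYCL_DEVICE_FILTER and ONEAPI::filter_selector entries above both describe selecting devices through filter strings. The sketch below is plain C++ and not the runtime's own parser. It assumes a filter is a comma-separated list of `backend:device_type:device_num` triples in which trailing fields may be omitted; that syntax, the `DeviceFilter` struct, and the `parse_device_filter` helper are assumptions made for illustration and are not taken from these release notes.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation of one filter triple; not a DPC++ runtime type.
struct DeviceFilter {
  std::string backend = "*";      // assumed examples: "opencl", "level_zero", "cuda"
  std::string device_type = "*";  // assumed examples: "cpu", "gpu", "acc", "host"
  int device_num = -1;            // -1 stands for "any device index" in this sketch
};

// Splits text on sep; a trailing empty field is dropped by std::getline.
inline std::vector<std::string> split(const std::string &text, char sep) {
  std::vector<std::string> parts;
  std::stringstream stream(text);
  std::string item;
  while (std::getline(stream, item, sep))
    parts.push_back(item);
  return parts;
}

// Parses an assumed "backend:device_type:device_num[,...]" filter string.
// Missing or empty fields keep their wildcard defaults.
inline std::vector<DeviceFilter> parse_device_filter(const std::string &value) {
  std::vector<DeviceFilter> filters;
  for (const std::string &triple : split(value, ',')) {
    const std::vector<std::string> fields = split(triple, ':');
    DeviceFilter filter;
    if (fields.size() > 0 && !fields[0].empty())
      filter.backend = fields[0];
    if (fields.size() > 1 && !fields[1].empty())
      filter.device_type = fields[1];
    if (fields.size() > 2 && !fields[2].empty())
      filter.device_num = std::stoi(fields[2]);  // throws on non-numeric input
    filters.push_back(filter);
  }
  return filters;
}
```

Under these assumptions, `parse_device_filter("opencl:gpu:0,level_zero")` yields two entries: one fully specified triple and one that matches any device type and index on the second backend. The authoritative syntax is whatever the commits referenced above implement.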
27+
## Improvements
28+
### SYCL Compiler
29+
- Enabled USM address space by default for the FPGA hardware [7896819d]
30+
- Added a warning when the size of kernel arguments exceeds 2kB for all
31+
devices [e00ab746], [4960fc90]
32+
- Changed default SYCL standard version to 2020 [67acf814]
33+
- Added diagnostics when the translator encounters an unknown or unsupported
34+
LLVM instruction [5a28d4e5]
35+
- Added diagnostic for attempt to pass a pointer to variable length array as
36+
kernel argument [538c4c9c]
37+
- Improved FPGA AOT help output with -fsycl-help [dc8a0593]
38+
- Made /MD the default compiler option on Windows and made the driver
39+
generate an error if /MT is passed to it. The SYCL library is designed in such a way
40+
that STL objects must cross the sycl.dll boundary, which is guaranteed to
41+
work safely on Windows only if the runtime in the app using sycl.dll and in
42+
sycl.dll is the same and is dynamic [d31184e1], [8735bb81], [0092d4da]
43+
- Enforced C++ for C source files when compiling in SYCL mode [adc2ac72]
44+
- Added use of template parameter in [[intelfpga::num_simd_work_items()]]
45+
attribute [678911a8]
46+
- Added new spellings for SYCL FPGA attributes [5949228d], [b1cf776e9]
47+
- Marked all definitions used for compiler needs with an underscore prefix
48+
[51d3c205]
49+
- Disabled diagnostic about use of functions with raw pointer in kernels
50+
[b4a3f03f]
51+
- Improved diagnostics for invalid SYCL kernel names [cb5ddb49], [89fd4284]
52+
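The kernel-argument size warning listed above is easiest to picture through lambda captures, since variables captured by a SYCL kernel lambda are passed to the device as kernel arguments. The sketch below is host-only standard C++ and only approximates the idea with `sizeof` on a closure. The 2048-byte threshold is an assumed reading of "2kB", the helper names are hypothetical, and the compiler's actual accounting of argument size may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed interpretation of the "2kB" figure from the release note.
constexpr std::size_t kKernelArgWarnBytes = 2048;

// Returns true when the callable object itself is larger than the threshold.
// sizeof(Kernel) is used here as a rough stand-in for total argument size.
template <typename Kernel>
constexpr bool exceeds_arg_limit(const Kernel &) {
  return sizeof(Kernel) > kKernelArgWarnBytes;
}

// Capturing a 4096-byte array by value makes the closure at least that large.
inline bool big_capture_exceeds() {
  std::array<char, 4096> table{};
  auto kernel = [table]() { return table[0]; };
  return exceeds_arg_limit(kernel);
}

// Capturing only a pointer keeps the closure at pointer size.
inline bool pointer_capture_exceeds() {
  std::array<char, 4096> table{};
  const char *data = table.data();
  auto kernel = [data]() { return data[0]; };
  return exceeds_arg_limit(kernel);
}
```

In this sketch the by-value capture crosses the threshold while the pointer capture does not, which is the usual reason to move large data into buffers or USM allocations and pass only a handle or pointer to the kernel.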
53+
### SYCL Library
54+
- Removed deprecated spelling ([[cl::intel_reqd_sub_group_size()]]) of
55+
IntelReqdSubGroupSize attribute [9dda36fe]
56+
- Added support for USM shared memory allocator for Level Zero backend
57+
[db5037ca]
58+
- Added support for a context with multiple devices in Level Zero backend
59+
[129ee442]
60+
- Added diagnostics for deprecated environment variables: SYCL_BE and
61+
SYCL_DEVICE_TYPE [6242160b]
62+
- Made spec_constant default constructor public and available on host
63+
[53d909e2]
64+
- Added constraints instead of static asserts and is_native_function_object()
65+
for group algorithms [97bec247]
66+
- Added support for L0 loader validation layer [4c6cda3f]
67+
- Added multi-device and multi-platform support for SYCL_DEVICE_ALLOWLIST
68+
[dbf31c3c]
69+
- Removed two-input sub-group shuffles [ef969c14]
70+
- Enabled inspecting values wrapped into private_memory<T> by evaluating
71+
`operator()` from GDB [31c23ddc]
72+
- Changed buffer allocation in the Level Zero plugin to use host shared memory for integrated GPUs [2ae1bc9e]
73+
- Implemented `queue::parallel_for()` accepting reduction [ffdadc2e]
74+
- Improved performance of float atomic_ref [0b7dacf1]
75+
- Made CUDA backend try to find a better block size using
76+
cuOccupancyMaxPotentialBlockSize function from the CUDA driver [4fabfd16a]
77+
- Supported GroupBroadcast with 32-bit id to cover broadcast algorithm with
78+
the sub_group class [6e3f2440]
79+
80+
### Documentation
81+
- Added specification for SPV_INTEL_variable_length_array extension [9e4c51c4]
82+
- Added specification for accessor_properties and buffer_location extensions
83+
[f90614c5]
84+
- Moved specification for Unified Shared Memory to Khronos specification
85+
[a7ffe039]
86+
- Added documentation for filter_selector [c3f5cfba]
87+
- Updated C-CXX-StandardLibrary extension specification [0b6f8cd8]
88+
- Added ESIMD extension introduction document [c36a1411]
89+
- Added specialization constants extension introduction document [d88ef3b6]
90+
- Added specialization constant feature design doc [15cac431]
91+
- Documented kernel-program caching mechanism [5947cde81]
92+
- Added the SYCL_INTEL_mem_channel_property extension specification [5cf8088c]
93+
- Provided detailed description for guaranteed sub-group sizes [542c32ae]
94+
- Documented test-related processes [ff90e06d]
95+
- Added code examples for all SYCL FPGA loop attributes [6b958205]
96+
97+
## Bug fixes
98+
### SYCL Compiler
99+
- Fixed compiler crash on invalid kernel type argument [0c220ca5e]
100+
- Made clang search for the full executable name as opposed to just the case
101+
name for the AOT tools (aoc, ocloc, opencl-aot) to avoid directory calls
102+
[78a86da3], [244e874b]
103+
- Linked SYCL device libraries by default as not all backends support SPIRV
104+
online linkage [9dd18ca8]
105+
- Fixed assertion when /P option is used on Windows [a21d7ef4]
106+
- Fixed crash when array of pointers is passed to kernel [1fc0e4f84]
107+
- Fixed issues with use of type from std namespace in kernel type names
108+
[dd7fec83]
109+
- Fixed debug information missed for work-item built-in translation [9c06d429]
110+
- Enabled emission of warnings that had been suppressed for SYCL headers [e6eed1a7]
111+
- Fixed optimization disabling option for gen to use -cl-opt-disable
112+
[ba4e567fe]
113+
- Emulated "funnel shift left" which was not supported in the OpenCL
114+
ExtInst set on SPIRV translator side [97d7eec5]
115+
- Fixed build issue when TBB and libstdc++ 10.X were used caused by including
116+
std C++ headers in integration header file [63369132]
117+
- Fixed processing for partial link step with static archives by passing linker
118+
specific arguments there [3ab8cc82]
119+
- Enabled `-reuse-exe` support for Windows [43f2d4ba]
120+
- Fixed missing dependency file when compiling for `-fintelfpga` and using a named
121+
dependency output file [df5f1ab67]
122+
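The "funnel shift left" entry above says the operation had to be emulated because the OpenCL extended instruction set lacks it. The sketch below shows one common way to express a 32-bit funnel shift left with ordinary shifts and an OR, following the usual definition (concatenate two words, shift left, keep the upper word). It is an illustration in standard C++, not the lowering the SPIR-V translator actually emits; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Funnel shift left on 32-bit values: conceptually concatenate hi:lo into a
// 64-bit value, shift left by (amount mod 32), and return the upper 32 bits.
constexpr std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo,
                               std::uint32_t amount) {
  const std::uint32_t s = amount % 32u;
  // s == 0 is handled separately because shifting a 32-bit value by 32 bits
  // is undefined behavior in C++.
  return s == 0 ? hi : (hi << s) | (lo >> (32u - s));
}

// Rotate left is the special case where both inputs are the same value.
constexpr std::uint32_t rotl32(std::uint32_t value, std::uint32_t amount) {
  return fshl32(value, value, amount);
}

// Compile-time check: the top byte of lo is shifted into the low byte of hi.
static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) == 0x3456789Au,
              "funnel shift left by 8 bits");
```

Reducing the shift amount modulo the bit width and guarding the zero case keeps every individual shift in range, which is the main detail any such emulation needs to get right.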
123+
### SYCL Library
124+
- Fixed build log preserving for L0 plugin [638b71b1]
125+
- Added missing math APIs for devicelib [e438bc814]
126+
- Enabled async_work_group_copy for scalar and vector bool types [bb78d2cb]
127+
- Aligned image class constructors with the SYCL specification [049ae996]
128+
- Removed half type alias causing name conflicts with CUDA headers [c00c1fa3]
129+
- Fixed explicit copy operation for host device [f20fd4de]
130+
- Made stream flush operation non-blocking [e7492fb2]
131+
- Fixed image arguments order when passing to PI routines [70d6f87b]
132+
- Fixed circular dependency between the device_impl and the platform_impl
133+
causing handler leak [255f304f]
134+
- Fixed work-group size selection in reductions [2ae49f57e]
135+
- Fixed compilation errors when built with --std=c++20 [ecd0adbb]
136+
- Fixed treating internal allocations of host memory as read-only for memory objects created with a const pointer, which caused a double free [8b5506255]
137+
- Fixed a problem in the Level Zero plugin where kernels and programs were destroyed
138+
while they can be used [b9bf9f5f]
139+
- Fixed wrong exception raised by ALLOWLIST mechanism [d81081f7]
140+
- Fixed reporting supported device partitioning in Level Zero [766367be]
141+
- Aligned get_info<info::device::version>() with the SYCL spec [4644e639]
142+
- Set default work group size to {1, 1, 1} to fix out-of-memory crashes on
143+
some configurations [4d76de43]
144+
145+
### Documentation
146+
- Fixed path to FPGA device selector [ca33f7f7]
147+
- Renamed LEVEL0 environment variable to LEVEL_ZERO in documents and code
148+
comments following source code change [2c3908b4]
149+
- Clarified --system-ocl key in GetStartedGuide.md [e31b94e5]
150+
151+
## API/ABI breakages
152+
- Implemented accessor_properties extension for accessor class [f7d073d1]
153+
154+
## Known issues
155+
- GlobalWorkOffset is not supported by Level Zero backend [6f9e9a76]
156+
- Code that uses function pointers hangs on Level Zero [d384295e]
157+
- If an application uses std::* math functions in kernel code, the
158+
-fsycl-device-lib=libm-fp64 option should be passed to the compiler.
159+
- User-defined functions with the same name and signature (exact match of
160+
arguments, return type doesn't matter) as an OpenCL C built-in
161+
function can lead to undefined behavior.
162+
- A DPC++ system that has FPGAs installed does not support multi-process
163+
execution. Creating a context opens the device associated with the context
164+
and places a lock on it for that process. No other process may use that
165+
device. Some queries about the device through device.get_info<>() also
166+
open up the device and lock it to that process since the runtime needs
167+
to query the actual device to obtain that information.
168+
- On Windows, the DPC++ compiler enforces use of the dynamic C++ runtime for
169+
applications linked with the SYCL library by:
170+
- linking with msvcrt[d].dll when -fsycl switch is used;
171+
- emitting an error on attempts to compile a program with static C++ RT
172+
using -fsycl and /MT or /MTd.
173+
That protects you from complicated runtime errors caused by C++ objects
174+
crossing sycl[d].dll boundary and not always handled properly by different
175+
versions of C++ RT used on app and sycl[d].dll sides.
176+
- The format of the object files produced by the compiler can change between
177+
versions. The workaround is to rebuild the application.
178+
- The SYCL library doesn't guarantee stable API/ABI, so applications compiled
179+
with an older version of the SYCL library may not work with a new one.
180+
The workaround is to rebuild the application.
181+
[ABI policy guide](doc/ABIPolicyGuide.md)
182+
- Using `cl::sycl::program` API to refer to a kernel defined in another
183+
translation unit leads to undefined behavior
184+
- Linkage errors with the following message:
185+
`error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined`
186+
can happen when a SYCL application is built using MS Visual Studio 2019
187+
version below 16.3.0 and user specifies `-std=c++14` or `/std:c++14`.
188+
- Employing read sampler for image accessor may result in sporadic issues with
189+
Level Zero plugin/backend [2c50c03]
190+
- Printing internal defines isn't supported on Windows [50628db]
191+
- Group algorithms for MUL/AND/OR/XOR cannot be enabled for group scope due to
192+
SPIR-V limitations, and are not enabled for sub-group scope yet as the
193+
SPIR-V version isn't automatically raised from 1.1 to 1.3 [96da39e]
194+
- We cannot run Dead Argument Elimination for ESIMD since the pointers to SPIR
195+
kernel functions are saved in `!genx.kernels` metadata [cf10351]
196+
1197
# September'20 release notes
2198

3199
Release notes for commit range 5976ff0..1fc0e4f
