@@ -90,8 +90,8 @@ git clone --config core.autocrlf=false https://github.com/intel/llvm -b sycl
## Build DPC++ toolchain

The easiest way to get started is to use the buildbot
- [configure](../../buildbot/configure.py) and
- [compile](../../buildbot/compile.py) scripts.
+ [configure](https://github.com/intel/llvm/blob/sycl/buildbot/configure.py) and
+ [compile](https://github.com/intel/llvm/blob/sycl/buildbot/compile.py) scripts.

In case you want to configure CMake manually, the up-to-date reference for
variables is in these files.
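As a concrete sketch, a typical build with the buildbot scripts looks like this. The `$DPCPP_HOME` workspace layout is the one used earlier in this guide; adjust the path to your own checkout.

```shell
# Illustrative only: configure and build DPC++ using the buildbot scripts.
export DPCPP_HOME=~/sycl_workspace
python $DPCPP_HOME/llvm/buildbot/configure.py
python $DPCPP_HOME/llvm/buildbot/compile.py
```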
@@ -233,21 +233,21 @@ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$DPCPP_HOME/llvm/build/lib ./a.out

### Build DPC++ toolchain with support for HIP AMD

- There is beta support for oneAPI DPC++ for HIP on AMD devices. It is not feature
- complete and it still contains known and unknown bugs. Currently it has only
- been tried on Linux, with ROCm 4.2.0, 4.3.0, 4.5.2, 5.3.0, and 5.4.3, using the
- AMD Radeon Pro W6800 (gtx1030), MI50 (gfx906), MI100 (gfx908) and MI250x
- (gfx90a) devices. The backend is tested by a relevant device/toolkit prior to a
- oneAPI plugin release. Go to the plugin release
- [pages](https://developer.codeplay.com/products/oneapi/amd) for further details.
-
To enable support for HIP devices, follow the instructions for the Linux DPC++
toolchain, but add the `--hip` flag to `configure.py`.

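For example, a HIP AMD build can be configured like this (a sketch, assuming the `$DPCPP_HOME` workspace layout from the Linux instructions):

```shell
# Illustrative only: configure with the HIP backend enabled, then build.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip
python $DPCPP_HOME/llvm/buildbot/compile.py
```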
Enabling this flag requires an installation of ROCm on the system; for
instructions on how to install it, refer to the
[AMD ROCm Installation Guide for Linux](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html).

+ ROCm versions above 5.7 are recommended, as earlier versions lack graph
+ support. DPC++ aims to support new ROCm versions as they come out, so there may
+ be a delay, but generally the latest ROCm version should work. The ROCm support
+ is mostly tested on the AMD Radeon Pro W6800 (gfx1030) and MI250x (gfx90a);
+ however, other architectures supported by LLVM may work just fine. The full
+ list of ROCm versions tested prior to oneAPI releases is available on the
+ plugin release [pages](https://developer.codeplay.com/products/oneapi/amd).
+
The DPC++ build assumes that ROCm is installed in `/opt/rocm`; if it is
installed somewhere else, the directory must be provided through the CMake
variable `UR_HIP_ROCM_DIR`, which can be passed through to cmake using the
@@ -276,7 +276,10 @@ by default when configuring for HIP. For more details on building LLD refer to

### Build DPC++ toolchain with support for HIP NVIDIA

- There is experimental support for oneAPI DPC++ for HIP on Nvidia devices.
+ HIP applications can be built to target Nvidia GPUs, so in principle it is
+ possible to build the DPC++ HIP support for Nvidia; however, this is not
+ supported, so it may not work.
+
There is no continuous integration for this and there are no guarantees for
supported platforms or configurations.

@@ -288,13 +291,12 @@ To enable support for HIP NVIDIA devices, follow the instructions for the Linux
DPC++ toolchain, but add the `--hip` and `--hip-platform NVIDIA` flags to
`configure.py`.

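Put together, an unsupported HIP-on-Nvidia configuration might look like this (a sketch, assuming the `$DPCPP_HOME` workspace layout from the Linux instructions):

```shell
# Illustrative only: configure the HIP backend for the Nvidia platform.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip --hip-platform NVIDIA
python $DPCPP_HOME/llvm/buildbot/compile.py
```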
- Enabling this flag requires HIP to be installed, more specifically
- [HIP NVCC](https://rocmdocs.amd.com/en/latest/Installation_Guide/HIP-Installation.html#nvidia-platform),
- as well as the CUDA Runtime API to be installed, see
- [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
-
- Currently, this has only been tried on Linux, with ROCm 4.2.0 or 4.3.0, with
- CUDA 11, and using a GeForce 1060 device.
+ Enabling this flag requires HIP to be installed, specifically for Nvidia; see
+ the Nvidia tab of the HIP installation docs
+ [here](https://rocm.docs.amd.com/projects/HIP/en/latest/install/install.html).
+ The CUDA Runtime API must also be installed; see the
+ [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).

### Build DPC++ toolchain with support for ARM processors
@@ -705,14 +707,6 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
The results are correct!
```

- **NOTE**: Currently, when the application has been built with the CUDA target,
- the CUDA backend must be selected at runtime using the `ONEAPI_DEVICE_SELECTOR`
- environment variable.
-
- ```bash
- ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe
- ```
-
**NOTE**: oneAPI DPC++/SYCL developers can specify SYCL device for execution
using device selectors (e.g. `sycl::cpu_selector_v`, `sycl::gpu_selector_v`,
[Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.asciidoc))
@@ -746,6 +740,14 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
-Xsycl-target-backend --cuda-gpu-arch=sm_80
```

+ Additionally, the AMD and Nvidia targets support aliases for the target to
+ simplify passing specific architectures. For example,
+ `-fsycl-targets=nvidia_gpu_sm_80` is equivalent to
+ `-fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend
+ --cuda-gpu-arch=sm_80`. The full list of available aliases is documented in the
+ [Users Manual](UsersManual.md#generic-options) under the `-fsycl-targets`
+ option.
+
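To illustrate the alias form, the two compiles below target the same architecture (a sketch; assumes a CUDA-enabled DPC++ build and a `simple-sycl-app.cpp` source file as in the earlier examples):

```shell
# Illustrative only: these two invocations are equivalent ways to target sm_80.
clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_80 \
        simple-sycl-app.cpp -o simple-sycl-app
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
        -Xsycl-target-backend --cuda-gpu-arch=sm_80 \
        simple-sycl-app.cpp -o simple-sycl-app
```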
To build simple-sycl-app ahead of time for GPU, CPU or Accelerator devices,
specify the target architecture. The examples provided use a supported
alias for the target, representing a full triple. Additional details can
@@ -914,11 +916,14 @@ int CUDASelector(const sycl::device &Device) {

### HIP back-end limitations

- * Requires a ROCm compatible operating system, for full details of supported
- Operating System for ROCm, please refer to the
- [ROCm Supported Operating Systems](https://github.com/RadeonOpenCompute/ROCm#supported-operating-systems).
- * Support is still in a beta state, but the backend is being actively developed.
- * Global offsets are currently not supported.
+ * Requires a ROCm compatible system and GPU; see the supported SKUs for
+ [Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-skus)
+ and
+ [Windows](https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html#supported-skus).
+ * Windows for HIP is not supported by DPC++ at the moment, so it may not work.
+ * `printf` within kernels is not supported.
+ * C++ standard library functions using complex types are not supported;
+ `sycl::complex` should be used instead.

## Find More