When built with DPC++ with CUDA support, the example finishes without a result mismatch. The HIP build fails:
cp ../sssp-cuda/data.tar.gz .
tar -zxf data.tar.gz
make HIP=yes run
./main -g 120 -t 1 -w 10 -r 1
Number of nodes = 264346
Number of edges = 733846
Initialization Time (ms): 349.456024
Allocation Time (ms): 0.116000
Copy To Device Time (ms): 9.258000
Kernel Time (ms): 2.901000
Copy Back and Merge Time (ms): 1.888000
FAIL: Computed node 2 cost (-2147483647 != 50999) does not match the expected value
I had a quick look, and it looks like this sample uses local memory accessors. Unfortunately, I believe these aren't supported by the HIP backend at the moment.
The NVPTX backend has a special IR pass to handle these, which doesn't exist yet for the AMDGCN backend, so any local accessor argument will not be set up correctly. I suspect using global buffers for the two arguments may work as a workaround; however, we do need to add local argument support for HIP.
Now that the pass lives in `SYCLLowerIR`, it can easily be shared between the AMDGCN and NVPTX backends.
This requires the same alignment fix as for CUDA; see #5113. Fixes #5013.
To reproduce the issue, use the example at https://github.com/zjin-lcf/HeCBench/tree/master/sssp-sycl