Hello. I have built clang with the CUDA backend; however, blur (which I've reduced to a simple data copy) doesn't compute. The same code works fine with the Windows version of oneAPI on a CPU device, and with a custom-built version of AdaptiveCpp on a CUDA device.
The output of the program with all error messages:
Running on device: NVIDIA GeForce RTX 3080 Laptop GPU
UR CUDA ERROR:
Value: 700
Name: CUDA_ERROR_ILLEGAL_ADDRESS
Description: an illegal memory access was encountered
Function: operator ()
Source Location: D:\source\dpcpp\llvm\build\_deps\unified-runtime-src\source\adapters\cuda\queue.cpp:218
UR CUDA ERROR:
Value: 700
Name: CUDA_ERROR_ILLEGAL_ADDRESS
Description: an illegal memory access was encountered
Function: getNextTransferStream
Source Location: D:\source\dpcpp\llvm\build\_deps\unified-runtime-src\source\adapters\cuda\queue.cpp:107
UR CUDA ERROR:
Value: 700
Name: CUDA_ERROR_ILLEGAL_ADDRESS
Description: an illegal memory access was encountered
Function: wait
Source Location: D:\source\dpcpp\llvm\build\_deps\unified-runtime-src\source\adapters\cuda\event.cpp:142
An exception is caught for blur:
Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)
Commands I used to compile and run the program:
set PATH=D:\source\dpcpp\llvm\build\bin;%PATH%
set LIB=D:\source\dpcpp\llvm\build\lib;%LIB%
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda simple_blur1.cpp -o simple_blur1.exe
set ONEAPI_DEVICE_SELECTOR=cuda:*
simple_blur1.exe
OS: Windows 11 Home version : 22H2
CPU: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz 2.30 GHz
Target device and vendor: NVIDIA GeForce RTX 3080 Laptop GPU Driver Version: 31.0.15.3713
CUDA version: 12.2
clang version 18.0.0 (https://github.com/intel/llvm 8fe166f)
Code which doesn't work:
#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

using namespace sycl;

const size_t resolution = 2048;
size_t vector_size = resolution * resolution;

void Blur(queue& q, std::vector<float>& a_data, std::vector<float>& b_data)
{
    range<2> num_items{ resolution, resolution };
    buffer a_buf(a_data);
    buffer b_buf(b_data);
    q.submit([&](handler& h) {
        auto a = a_buf.get_access<access::mode::read>(h);
        auto b = b_buf.get_access<access::mode::write>(h);
        h.parallel_for(num_items, [=](auto it)
        {
            int x = it[1];
            int y = it[0];
            float value = a[y * resolution + x];
            b[y * resolution + x] = value;
        });
    });
    q.wait();
}

int main(int argc, char* argv[]) {
    auto d_selector{ sycl::gpu_selector_v };
    std::vector<float> a, b;
    a.resize(vector_size);
    b.resize(vector_size);
    a[0] = 1.0f;
    try
    {
        queue q(d_selector);
        std::cout << "Running on device: "
                  << q.get_device().get_info<info::device::name>() << "\n";
        Blur(q, a, b);
    }
    catch (exception const& e)
    {
        std::cout << "An exception is caught for blur:\n";
        std::cout << e.what();
        std::terminate();
    }
    std::cout << "Blur successfully completed on device.\n";
    system("pause");
    return 0;
}
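The program above only reports whether an exception was thrown; it never inspects the contents of b. A minimal host-side sketch of a result check follows, assuming b holds the kernel's output once Blur returns. The helper names linear_index, reference_copy and first_mismatch are hypothetical and not part of the original report; only the row-major mapping y * resolution + x is taken from the kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Same row-major mapping the kernel uses for element (x, y) of a
// width-wide image: y * width + x.
constexpr std::size_t linear_index(std::size_t x, std::size_t y, std::size_t width)
{
    return y * width + x;
}

// Hypothetical helper: host-side reference for the reduced "blur",
// which is a plain element-by-element copy.
std::vector<float> reference_copy(const std::vector<float>& src,
                                  std::size_t width, std::size_t height)
{
    assert(src.size() == width * height);
    std::vector<float> dst(src.size(), 0.0f);
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            dst[linear_index(x, y, width)] = src[linear_index(x, y, width)];
    return dst;
}

// Hypothetical helper: index of the first differing element, or the
// length of the shorter vector if the common prefix is identical.
// Exact float comparison is intentional: a copy must be bit-for-bit equal.
std::size_t first_mismatch(const std::vector<float>& expected,
                           const std::vector<float>& actual)
{
    auto its = std::mismatch(expected.begin(), expected.end(),
                             actual.begin(), actual.end());
    return static_cast<std::size_t>(std::distance(expected.begin(), its.first));
}
```

In main, after Blur(q, a, b) returns, comparing first_mismatch(reference_copy(a, resolution, resolution), b) against b.size() would tell a silently wrong result apart from a correct one.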
Hi @blinkfrog, this should be fixed by the patch oneapi-src/unified-runtime#1326, which has just been merged. If you build DPC++ from the latest sycl branch, does the above example work?
Thank you very much, the problem is solved: this program now works when compiled with the new DPC++!