
CUDA_ERROR_ILLEGAL_ADDRESS when executing a simple program using CUDA backend (Windows) #11740


Closed
blinkfrog opened this issue Nov 1, 2023 · 3 comments
Labels
bug Something isn't working cuda CUDA back-end Windows

Comments

@blinkfrog

Hello. I have built clang with the CUDA backend; however, a blur kernel (which I've reduced to a simple data copy) fails to run. The same code works fine with the Windows version of oneAPI on a CPU device, and with a custom-built version of AdaptiveCpp on a CUDA device.

The output of the program with all error messages:

Running on device: NVIDIA GeForce RTX 3080 Laptop GPU

UR CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        operator ()
        Source Location: D:\source\dpcpp\llvm\build\_deps\unified-runtime-src\source\adapters\cuda\queue.cpp:218


UR CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        getNextTransferStream
        Source Location: D:\source\dpcpp\llvm\build\_deps\unified-runtime-src\source\adapters\cuda\queue.cpp:107


UR CUDA ERROR:
        Value:           700
        Name:            CUDA_ERROR_ILLEGAL_ADDRESS
        Description:     an illegal memory access was encountered
        Function:        wait
        Source Location: D:\source\dpcpp\llvm\build\_deps\unified-runtime-src\source\adapters\cuda\event.cpp:142

An exception is caught for blur:
Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)

Commands I used to compile and run the program:

set PATH=D:\source\dpcpp\llvm\build\bin;%PATH%
set LIB=D:\source\dpcpp\llvm\build\lib;%LIB%
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda simple_blur1.cpp -o simple_blur1.exe
set ONEAPI_DEVICE_SELECTOR=cuda:*
simple_blur1.exe

OS: Windows 11 Home version : 22H2
CPU: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz 2.30 GHz
Target device and vendor: NVIDIA GeForce RTX 3080 Laptop GPU Driver Version: 31.0.15.3713
CUDA version: 12.2
clang version 18.0.0 (https://github.com/intel/llvm 8fe166f)

Code which doesn't work:

#include <sycl/sycl.hpp>
#include <cstdlib>    // system
#include <exception>  // std::terminate
#include <iostream>
#include <vector>
using namespace sycl;

const size_t resolution = 2048;
size_t vector_size = resolution * resolution;

// The "blur" reduced to a plain element-wise copy from a_data to b_data.
void Blur(queue& q, std::vector<float>& a_data, std::vector<float>& b_data)
{
    range<2> num_items{ resolution, resolution };
    buffer a_buf(a_data);
    buffer b_buf(b_data);
    q.submit([&](handler& h) {
        auto a = a_buf.get_access<access::mode::read>(h);
        auto b = b_buf.get_access<access::mode::write>(h);

        h.parallel_for(num_items, [=](auto it)
        {
            int x = it[1];
            int y = it[0];
            // Row-major linear index into the flattened image.
            float value = a[y * resolution + x];
            b[y * resolution + x] = value;
        });
    });
    q.wait();
}

int main(int argc, char* argv[]) {
    auto d_selector{ sycl::gpu_selector_v };
    std::vector<float> a, b;
    a.resize(vector_size);
    b.resize(vector_size);
    a[0] = 1.0f;
    try
    {
        queue q(d_selector);
        std::cout << "Running on device: "
            << q.get_device().get_info<info::device::name>() << "\n";
        Blur(q, a, b);
    }
    catch (exception const& e)
    {
        std::cout << "An exception is caught for blur:\n";
        std::cout << e.what();
        std::terminate();
    }
    std::cout << "Blur successfully completed on device.\n";
    system("pause");
    return 0;
}
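
The program above reports success without ever inspecting b, so a host-only reference may be useful when checking whether a given build really performs the copy. The following is a minimal sketch, not part of the original report: the names ReferenceCopy and CountMismatches are hypothetical, and it uses only the standard library, applying the same row-major index (y * resolution + x) as the kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only reference for the reduced "blur": copies a into b
// using the same row-major index (y * resolution + x) as the kernel.
void ReferenceCopy(const std::vector<float>& a, std::vector<float>& b,
                   std::size_t resolution)
{
    assert(a.size() == resolution * resolution);
    assert(b.size() == resolution * resolution);
    for (std::size_t y = 0; y < resolution; ++y)
        for (std::size_t x = 0; x < resolution; ++x)
            b[y * resolution + x] = a[y * resolution + x];
}

// Hypothetical helper: number of positions at which the two vectors differ.
// Any difference in length is counted as additional mismatches.
std::size_t CountMismatches(const std::vector<float>& expected,
                            const std::vector<float>& actual)
{
    const std::size_t common = std::min(expected.size(), actual.size());
    std::size_t mismatches = std::max(expected.size(), actual.size()) - common;
    for (std::size_t i = 0; i < common; ++i)
        if (expected[i] != actual[i])
            ++mismatches;
    return mismatches;
}
```

One possible use: after Blur(q, a, b) returns, fill a third vector with ReferenceCopy(a, expected, resolution) and treat a non-zero CountMismatches(expected, b) as a failure.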
@blinkfrog blinkfrog added the bug Something isn't working label Nov 1, 2023
@bader bader added the cuda CUDA back-end label Nov 1, 2023
@hdelan
Contributor

hdelan commented Mar 22, 2024

Hi @blinkfrog, this should have been fixed by the patch oneapi-src/unified-runtime#1326, which has just merged. If you build DPC++ from the latest sycl branch, does the above example work?

@blinkfrog
Author

Thank you very much, the problem is solved: this program now works when compiled with the new DPC++!

@blinkfrog
Author

Closing as the issue is solved.
