[CUDA] floating-point exception in cuda_piextUSMSharedAlloc #1467
Looks like the crash happens because alignment is 0 on the following line. The OpenCL extension for USM (https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc) says:
I believe the largest type is double16. The SYCL extension for USM (https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/USM.adoc) says:
So, @fwyzard
@jbrodman What should an implementation do if the alignment passed by the user is not a valid alignment supported by the implementation?
hi @romanovvlad
Neither can I.
Let me try some empirical tests and/or asking NVIDIA about it.
Do you mean it should raise an exception rather than aborting?
I would take assert((alignment == 0) or (reinterpret_cast<std::uintptr_t>(*result_ptr) % alignment == 0));
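For illustration, the guard in the assert above can be factored into a small helper. This is only a sketch: `is_aligned` and `assert_aligned` are hypothetical names, not functions from the PI CUDA plugin, and it assumes that an alignment of 0 means "no particular requirement". The point is that the modulo is never evaluated with a zero divisor, which is what raises SIGFPE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libpi_cuda): returns true when `ptr`
// satisfies the requested alignment. An alignment of 0 is treated as
// "no requirement" (an assumption), so `% alignment` never divides by zero.
inline bool is_aligned(const void *ptr, std::size_t alignment) {
  if (alignment == 0)
    return true;
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Debug-only check mirroring the assert proposed above.
inline void assert_aligned(const void *ptr, std::size_t alignment) {
  assert(is_aligned(ptr, alignment));
}
```

With this shape, a call such as `assert_aligned(*result_ptr, alignment)` after the allocation would pass for `alignment == 0` instead of trapping.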
From a quick test, it looks like the memory returned by
Can you check if #1577 fixes the exception?
I agree with your interpretation of alignment == 0.
It would probably be good if we could throw runtime errors instead of crashing. Hmm... CUDA doesn't seem to have user-aligned allocations.
@jbrodman Should we clarify what happens if the required alignment is not supported by the implementation?
Rather, it seems that all allocations are aligned to 512 bytes.
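A rough sketch of what "throw instead of crash" could look like, under stated assumptions: the names `is_supported_alignment` and `validate_alignment` are hypothetical, the 512-byte figure is only the empirical observation from this thread (not a documented guarantee), and rejecting non-power-of-two values is an assumption about what counts as a valid alignment.

```cpp
#include <cstddef>
#include <stdexcept>

// Assumed natural alignment of the backend's allocations, taken from the
// observation in this thread. Not a documented guarantee.
constexpr std::size_t kAssumedBackendAlignment = 512;

// Hypothetical check: 0 means "no requirement"; otherwise the request must
// be a power of two no larger than what the backend is assumed to provide.
inline bool is_supported_alignment(std::size_t alignment) {
  if (alignment == 0)
    return true;
  const bool power_of_two = (alignment & (alignment - 1)) == 0;
  return power_of_two && alignment <= kAssumedBackendAlignment;
}

// Report an unsupported request as an exception rather than letting a
// later arithmetic operation trap.
inline void validate_alignment(std::size_t alignment) {
  if (!is_supported_alignment(alignment))
    throw std::invalid_argument("unsupported USM alignment");
}
```

Whether an unsupported alignment should be an error or a null result is exactly the open question above; the sketch only shows that the check can happen before any allocation or modulo.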
Hi, I am also facing this problem. I compiled and ran DPCPP code successfully on the CPU. When I compiled it for the GPU, the build worked fine. However, when I offload the code to the GPU, it fails with a floating point exception (core dumped). I was wondering whether the problem has been resolved for this case.
I too see this issue and wonder how I'm supposed to use USM on NVIDIA while this bug exists...
Can't reproduce with the tip of the branch. Most likely it's addressed by #2557.
It looks like python on windows is always python.exe and on some systems we have python3.exe alias created manually.
https://github.com/jeffhammond/dpcpp-tutorial/blob/master/saxpy-usm.cc
Debugging the above program shows the following:
Thread 1 "a.out" received signal SIGFPE, Arithmetic exception.
0x00007ffff62eae0b in cuda_piextUSMSharedAlloc () from /home/cc/sycl_workspace/build/install/lib/libpi_cuda.so
(gdb) bt
#0 0x00007ffff62eae0b in cuda_piextUSMSharedAlloc () from /home/cc/sycl_workspace/build/install/lib/libpi_cuda.so
#1 0x00007ffff72002cb in cl::sycl::detail::usm::alignedAlloc(unsigned long, unsigned long, cl::sycl::context const&, cl::sycl::device const&, cl::sycl::usm::alloc) () from /home/cc/sycl_workspace/build/install/lib/libsycl.so
#2 0x00000000004027ee in main ()