Dev cloud back end JIT failure #1642

Closed
paboyle opened this issue May 5, 2020 · 15 comments
Labels
bug Something isn't working

Comments

@paboyle

paboyle commented May 5, 2020

Much more painless than CUDA support was so far. C++ friendliness is good,
though hitting some not trivially copy constructible complaints for Lambda captures.

First cut compile of entire application goes (almost) through, which is pretty awesome.
(Compile is rather slow, but mustn't grumble)

git clone https://www.github.com/paboyle/Grid
cd Grid
git checkout sycl
./bootstrap.sh
mkdir build
cd build
../configure --enable-simd=GEN --enable-gen-simd-width=64 --enable-precision=single --enable-sycl CXX=dpcpp
make -j 12 -C Grid
make -j 12 -C benchmarks
cd benchmarks
./Benchmark_su3_gpu

But running gives

Grid : Message : ================================================ 
Grid : Message : MPI is initialised and logging filters activated 
Grid : Message : ================================================ 
Grid : Message : Requested 1073741824 byte stencil comms buffers 
Grid : Message : 0.523132 s : Grid is setup to use 1 threads
Grid : Message : 0.523143 s : ============================================================
Grid : Message : 0.523156 s : = Benchmarking SU3xSU3  ext/ins z = x*y
============================================================
Grid : Message : 0.523179 s :   L  		bytes			GB/s		 GFlop/s
Grid : Message : 0.523190 s : -----------------------------------------------------
terminate called after throwing an instance of 'cl::sycl::compile_program_error'
  what():  The program was built for 1 devices
Build program log for 'Intel(R) Gen9 HD Graphics NEO':
error: undefined reference to `__mulsc3()'
error: backend compiler failed build.
 -17 (CL_LINK_PROGRAM_FAILURE)
Aborted

Advice welcome?
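
For anyone hitting the same link error: `__mulsc3` is, as far as I know, the compiler runtime helper (shipped in compiler-rt and libgcc) that implements single-precision complex multiplication with the C99 Annex G inf/NaN recovery, and `__muldc3` is its double-precision counterpart. `std::complex<float>` multiplication typically lowers to a call to it, which is presumably why the device link step goes looking for it. A minimal sketch of a stopgap, assuming you can tolerate losing the inf/NaN recovery: spell the product out in real arithmetic. `mul_no_rt` and `check_mul_no_rt` are hypothetical names, not part of Grid or DPC++, and this is host-only C++ that has not been tried on the devcloud.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper (not part of Grid or DPC++): multiply two complex
// floats using only real multiplies and adds, so this function itself does
// not need the runtime helper __mulsc3.
// Assumption: losing the inf/NaN recovery that the runtime helper performs
// is acceptable for the caller.
inline std::complex<float> mul_no_rt(std::complex<float> a,
                                     std::complex<float> b) {
  const float re = a.real() * b.real() - a.imag() * b.imag();
  const float im = a.real() * b.imag() + a.imag() * b.real();
  return std::complex<float>(re, im);
}

// Small host-side check: (1 + 2i)(3 - i) = 5 + 5i.
inline void check_mul_no_rt() {
  const std::complex<float> z = mul_no_rt({1.0f, 2.0f}, {3.0f, -1.0f});
  assert(z.real() == 5.0f && z.imag() == 5.0f);
}
```

Whether this sidesteps the JIT failure inside a SYCL kernel is untested here; it only shows that the product itself needs nothing beyond float multiplies and adds.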

@AlexeySachkov AlexeySachkov added the bug Something isn't working label May 6, 2020
@bader
Contributor

bader commented May 7, 2020

@paboyle, did you try to report this to https://software.intel.com/en-us/forums/intel-oneapi-forums? We have more engineers there helping with questions related to DevCloud and the oneAPI toolchain.

What toolchain/compiler version do you use?

@paboyle
Author

paboyle commented May 7, 2020

I was using beta06 on the devcloud; I'm registering with the forum and will report there

@bader
Contributor

bader commented May 7, 2020

I just realized that we have a separate forum for DevCloud issues: https://software.intel.com/en-us/forums/intel-devcloud, but yours seems to be related to the toolchain.
It looks like the app is using complex numbers in device code, and I'm surprised it doesn't work, as that should be supported in Beta06.
One reason it might not work is that the app is using complex numbers with double precision FP, but fp64 is not supported by the GPU HW.

@paboyle
Author

paboyle commented May 25, 2020

../configure --enable-simd=GEN --enable-gen-simd-width=64 --enable-precision=single --enable-accelerator=sycl CXX=dpcpp

Updated the GitHub branch and configure option now as above.
This builds CUDA, HIP and SYCL as options, but warts continue to exist for both HIP and SYCL.

The complex arithmetic JIT bug persists. Beta06 on devcloud. Wasn't able to register for the forum - the confirmation email never came.

@bd4

bd4 commented Jun 8, 2020

@paboyle I was able to get a simple std::complex<double> example working by linking to libsycl-cmath-fp64.o and libsycl-complex-fp64.o in /opt/intel/inteloneapi/compiler/latest/linux/lib, see https://github.com/intel/llvm/blob/sycl/sycl/test/devicelib/std_complex_math_fp64_test.cpp

@paboyle
Author

paboyle commented Jun 8, 2020

Wow Bryce - thanks a million. That is incredibly helpful. Will have a go.

@bd4

bd4 commented Jun 8, 2020

Let me know if it works for you. This is not a good "standard" SYCL solution, but FWIW hipSYCL seems to work out of the box with std::complex. ComputeCpp 2.0.0 warns that __muldc3 is not found at compile time, then fails in a similar way to dpcpp at runtime. I started a conversation on their forum here:
https://support.codeplay.com/t/std-complex-multiplication/355

Hacks are fine short term, but I'd also like to understand the direction of the SYCL standard on this. Seems to me like it's a gap in the standard - if the device compiler / runtime happens to support it, everything works great, but AFAIK it's not required or even clearly defined as an extension.

@paboyle
Author

paboyle commented Jun 8, 2020

Yeah - complex and GPU is a mess right now.
HIP and CUDA are sticking with the partial STL replacement namespace "thrust", but I had hoped the more standard C++ environment of SYCL could do better than such an ugly hack.

@bd4

bd4 commented Jun 8, 2020

HIP seems to work out of the box with std::complex, at least for basic arithmetic, I haven't tried using transcendental functions yet.

@paboyle
Author

paboyle commented Jun 9, 2020

Hi Bryce

Very interesting: single still fails, but with these files linked, double precision now passes simple complex arithmetic tests.

Haven't checked all the transcendental complex maths yet, but double complex exponential works.

Of course all the thrust discussion is about getting that functionality. +,-,*,/ is easy - it's the library functionality that is the real benefit from reuse...

Assuming they eventually support all the STL-mandated transcendentals, it puts SYCL in a much more civilised state than CUDA.
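
To make the "library functionality" point concrete, here is a minimal sketch of what a hand-rolled complex exponential looks like when you cannot rely on the library one: Euler's formula, exp(a + ib) = exp(a)(cos b + i sin b). `exp_euler` and `check_exp_euler` are hypothetical names, this is host-only C++, and it still leans on the real-valued `exp`, `cos` and `sin` being available, so it is an illustration rather than a fix.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper: complex exponential via Euler's formula,
// exp(a + ib) = exp(a) * (cos(b) + i*sin(b)), using only real-valued
// math calls. No special handling of inf/NaN inputs is attempted.
inline std::complex<double> exp_euler(std::complex<double> z) {
  const double scale = std::exp(z.real());
  return std::complex<double>(scale * std::cos(z.imag()),
                              scale * std::sin(z.imag()));
}

// Host-side check against the standard library implementation.
inline void check_exp_euler() {
  const std::complex<double> z(0.5, 1.25);
  assert(std::abs(exp_euler(z) - std::exp(z)) < 1e-12);
}
```

Every other transcendental (log, pow, sqrt, the trig and hyperbolic families) would need the same treatment plus careful branch-cut handling, which is exactly the reuse that a working std::complex on the device would provide.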

@paboyle
Author

paboyle commented Jun 9, 2020

I assume I can get single with the non-fp64 versions :

libsycl-cmath-fp64.o
libsycl-cmath.o
libsycl-complex-fp64.o
libsycl-complex.o

@bd4

bd4 commented Jun 9, 2020

Yes, I think so; I haven't tested single precision yet. Including both in the link line also seems to work, which is potentially convenient for a library install that might require both. For cmake, I added the .o files to target_sources on the gtensor INTERFACE library, to get the object files linked into executables that link gtensor.

@jeremyong

I am trying to resolve this right now when compiling against the devcloud as well (targeting Intel Gen 9 integrated GPU)

Unfortunately, I get a number of math-related undefined references even when I link against the compiled objects mentioned in this thread.

terminate called after throwing an instance of 'cl::sycl::compile_program_error'
  what():  The program was built for 1 devices
Build program log for 'Intel(R) Gen9':

error: undefined reference to `__devicelib_scalbnf()'
undefined reference to `__devicelib_logf()'
undefined reference to `__devicelib_expf()'
undefined reference to `__devicelib_frexpf()'
undefined reference to `__devicelib_ldexpf()'
undefined reference to `__devicelib_log10f()'
undefined reference to `__devicelib_modff()'
undefined reference to `__devicelib_exp2f()'
undefined reference to `__devicelib_expm1f()'
undefined reference to `__devicelib_ilogbf()'
undefined reference to `__devicelib_log1pf()'
undefined reference to `__devicelib_log2f()'
undefined reference to `__devicelib_logbf()'
undefined reference to `__devicelib_sqrtf()'
undefined reference to `__devicelib_cbrtf()'
undefined reference to `__devicelib_hypotf()'
undefined reference to `__devicelib_erff()'
undefined reference to `__devicelib_erfcf()'
undefined reference to `__devicelib_tgammaf()'
undefined reference to `__devicelib_lgammaf()'
undefined reference to `__devicelib_fmodf()'
undefined reference to `__devicelib_remainderf()'
undefined reference to `__devicelib_remquof()'
undefined reference to `__devicelib_nextafterf()'
undefined reference to `__devicelib_fdimf()'
undefined reference to `__devicelib_fmaf()'
undefined reference to `__devicelib_sinf()'
undefined reference to `__devicelib_cosf()'
undefined reference to `__devicelib_tanf()'
undefined reference to `__devicelib_powf()'
undefined reference to `__devicelib_acosf()'
undefined reference to `__devicelib_asinf()'
undefined reference to `__devicelib_atanf()'
undefined reference to `__devicelib_atan2f()'
undefined reference to `__devicelib_coshf()'
undefined reference to `__devicelib_sinhf()'
undefined reference to `__devicelib_tanhf()'
undefined reference to `__devicelib_acoshf()'
undefined reference to `__devicelib_asinhf()'
undefined reference to `__devicelib_atanhf()'
undefined reference to `__devicelib_log()'
undefined reference to `__devicelib_exp()'
undefined reference to `__devicelib_frexp()'
undefined reference to `__devicelib_ldexp()'
undefined reference to `__devicelib_log10()'
undefined reference to `__devicelib_modf()'
undefined reference to `__devicelib_exp2()'
undefined reference to `__devicelib_expm1()'
undefined reference to `__devicelib_ilogb()'
undefined reference to `__devicelib_log1p()'
undefined reference to `__devicelib_log2()'
undefined reference to `__devicelib_logb()'
undefined reference to `__devicelib_sqrt()'
undefined reference to `__devicelib_cbrt()'
undefined reference to `__devicelib_hypot()'
undefined reference to `__devicelib_erf()'
undefined reference to `__devicelib_erfc()'
undefined reference to `__devicelib_tgamma()'
undefined reference to `__devicelib_lgamma()'
undefined reference to `__devicelib_fmod()'
undefined reference to `__devicelib_remainder()'
undefined reference to `__devicelib_remquo()'
undefined reference to `__devicelib_nextafter()'
undefined reference to `__devicelib_fdim()'
undefined reference to `__devicelib_fma()'
undefined reference to `__devicelib_sin()'
undefined reference to `__devicelib_cos()'
undefined reference to `__devicelib_tan()'
undefined reference to `__devicelib_pow()'
undefined reference to `__devicelib_acos()'
undefined reference to `__devicelib_asin()'
undefined reference to `__devicelib_atan()'
undefined reference to `__devicelib_atan2()'
undefined reference to `__devicelib_cosh()'
undefined reference to `__devicelib_sinh()'
undefined reference to `__devicelib_tanh()'
undefined reference to `__devicelib_acosh()'
undefined reference to `__devicelib_asinh()'
undefined reference to `__devicelib_atanh()'

error: backend compiler failed build.
 -11 (CL_BUILD_PROGRAM_FAILURE)

@bader
Contributor

bader commented Oct 11, 2020

@paboyle, this issue should be addressed by #2400. Could you confirm, please?

@paboyle
Author

paboyle commented Oct 31, 2020

Closing as this works on devcloud under beta10 with -fsycl-device-lib=all

@paboyle paboyle closed this as completed Oct 31, 2020
preethi-intel pushed a commit to preethi-intel/llvm that referenced this issue Oct 17, 2022
preethi-intel pushed a commit to preethi-intel/llvm that referenced this issue Oct 17, 2022
preethi-intel pushed a commit to preethi-intel/llvm that referenced this issue Oct 26, 2022
5 participants