[OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library #126143


Merged
merged 1 commit into from
Apr 18, 2025

Conversation

jhuber6
Contributor

@jhuber6 jhuber6 commented Feb 6, 2025

Summary:
Currently, we build a single libomptarget.devicertl.a which is a
fatbinary. It is a host object file that contains the embedded archive
files for both the NVIDIA and AMDGPU targets. This was done primarily as
a convenience due to naming conflicts. Now that the clang driver for the
GPU targets can appropriately link via the per-target runtime-dir, we can
just make two separate static libraries and remove the indirection.

This patch creates two new static libraries that get installed into

/lib/amdgcn-amd-amdhsa/libomp.a
/lib/nvptx64-nvidia-cuda/libomp.a

for AMDGPU and NVPTX respectively. The link job created by the linker
wrapper now simply needs to do -lomp and it will search those
directories and link those static libraries. This requires far less
special handling.
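
As a rough illustration of the lookup described above, the following Python sketch models how `-lomp` could resolve to a per-target archive. The helper `find_device_library` and its `install_root` argument are hypothetical names chosen for this example; the real search is performed by the clang driver and the device linker, not by code like this.

```python
from pathlib import Path
from typing import Optional


def find_device_library(install_root: Path, triple: str, name: str = "omp") -> Optional[Path]:
    """Return <install_root>/lib/<triple>/lib<name>.a if it exists, else None.

    Hypothetical helper: it only mimics the per-target directory layout
    from the description and does not call into clang.
    """
    candidate = install_root / "lib" / triple / f"lib{name}.a"
    return candidate if candidate.is_file() else None
```

Because the triple is part of the path, both GPU targets can ship a library with the same file name without colliding.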

This patch is a precursor to changing the build system entirely to be a
runtimes based one. Soon this target will be a standard add_library
and done through the GPU runtime targets.

NOTE that this actually does remove an additional optimization step.
Previously we merged all of the files into a single bitcode object and
forcibly internalized some definitions. This, instead, just treats them
like a normal static library. This may possibly affect performance for
some files, but I think it's better overall to use static library
semantics because it allows us to have an 'include-what-you-use'
relationship with the library.
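
To make the static library semantics concrete, here is a minimal Python sketch of how a classic archive link pulls in only the members that resolve an undefined symbol. The member and symbol names (`parallel.o`, `rt_parallel`, and so on) are invented for illustration and are not taken from the DeviceRTL sources; real linkers add details this sketch ignores.

```python
from typing import Dict, List, Set


def select_members(undefined: Set[str], archive: Dict[str, Dict[str, Set[str]]]) -> List[str]:
    """Pick archive members the way a simple static link does.

    `archive` maps member name -> {"defines": {...}, "needs": {...}}.
    A member is pulled in only if it defines a currently undefined symbol;
    its own undefined references are then added to the worklist.
    """
    selected: List[str] = []
    defined: Set[str] = set()
    pending = set(undefined)
    progress = True
    while progress:
        progress = False
        for member, info in archive.items():
            if member in selected:
                continue
            if info["defines"] & pending:
                selected.append(member)
                defined |= info["defines"]
                pending |= info["needs"]
                pending -= defined
                progress = True
    return selected


# Hypothetical archive contents, for illustration only.
archive = {
    "parallel.o": {"defines": {"rt_parallel"}, "needs": {"rt_barrier"}},
    "sync.o": {"defines": {"rt_barrier"}, "needs": set()},
    "tasking.o": {"defines": {"rt_task"}, "needs": set()},
}
print(select_members({"rt_parallel"}, archive))  # ['parallel.o', 'sync.o']
```

In this model `tasking.o` never enters the link because nothing references it, whereas a merged bitcode blob would always be linked in whole.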

Performance testing will be required. If we really need the merged blob
then we can simply pack that into a new static library.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' offload labels Feb 6, 2025
@llvmbot
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-offload

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/126143.diff

4 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+4)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (-3)
  • (modified) offload/DeviceRTL/CMakeLists.txt (+30-103)
  • (modified) offload/test/lit.cfg (+4-4)
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index c0891d46b0a62cd..fd690ab11c1c2c3 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -9209,6 +9209,10 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA,
           A->render(Args, LinkerArgs);
       }
 
+      // If this is OpenMP the device linker will need `-lomp`.
+      if (Kind == Action::OFK_OpenMP && !Args.hasArg(OPT_nogpulib))
+        LinkerArgs.emplace_back("-lomp");
+
       // Forward all of these to the appropriate toolchain.
       for (StringRef Arg : CompilerArgs)
         CmdArgs.push_back(Args.MakeArgString(
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 699aadec86dcba9..93031d2f5302386 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1289,9 +1289,6 @@ bool tools::addOpenMPRuntime(const Compilation &C, ArgStringList &CmdArgs,
   if (IsOffloadingHost)
     CmdArgs.push_back("-lomptarget");
 
-  if (IsOffloadingHost && !Args.hasArg(options::OPT_nogpulib))
-    CmdArgs.push_back("-lomptarget.devicertl");
-
   addArchSpecificRPath(TC, Args, CmdArgs);
 
   addOpenMPRuntimeLibraryPath(TC, Args, CmdArgs);
diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index 8f2a1fd01fabcc8..b3dd4a1997d80d0 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -107,15 +107,15 @@ set(bc_flags -c -flto -std=c++17 -fvisibility=hidden
 )
 
 # first create an object target
-add_library(omptarget.devicertl.all_objs OBJECT IMPORTED)
 function(compileDeviceRTLLibrary target_name target_triple)
   set(target_bc_flags ${ARGN})
 
   set(bc_files "")
+  add_library(omp.${target_name}.all_objs OBJECT IMPORTED)
   foreach(src ${src_files})
     get_filename_component(infile ${src} ABSOLUTE)
     get_filename_component(outfile ${src} NAME)
-    set(outfile "${outfile}-${target_name}.bc")
+    set(outfile "${outfile}-${target_name}.o")
     set(depfile "${outfile}.d")
 
     # Passing an empty CPU to -march= suppressed target specific metadata.
@@ -142,99 +142,36 @@ function(compileDeviceRTLLibrary target_name target_triple)
     endif()
     set_property(DIRECTORY APPEND PROPERTY ADDITIONAL_MAKE_CLEAN_FILES ${outfile})
 
-    list(APPEND bc_files ${outfile})
+    list(APPEND obj_files ${CMAKE_CURRENT_BINARY_DIR}/${outfile})
   endforeach()
-
-  set(bclib_name "libomptarget-${target_name}.bc")
-
-  # Link to a bitcode library.
-  add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/linked_${bclib_name}
-      COMMAND ${LINK_TOOL}
-        -o ${CMAKE_CURRENT_BINARY_DIR}/linked_${bclib_name} ${bc_files}
-      DEPENDS ${bc_files}
-      COMMENT "Linking LLVM bitcode ${bclib_name}"
-  )
-
-  if(TARGET llvm-link)
-    add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/linked_${bclib_name}
-      DEPENDS llvm-link
-      APPEND)
-  endif()
-
-  add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/internalized_${bclib_name}
-      COMMAND ${OPT_TOOL} ${link_export_flag} ${CMAKE_CURRENT_BINARY_DIR}/linked_${bclib_name}
-                      -o ${CMAKE_CURRENT_BINARY_DIR}/internalized_${bclib_name}
-      DEPENDS ${source_directory}/exports ${CMAKE_CURRENT_BINARY_DIR}/linked_${bclib_name}
-      COMMENT "Internalizing LLVM bitcode ${bclib_name}"
-  )
-  if(TARGET opt)
-    add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/internalized_${bclib_name}
-      DEPENDS opt
-      APPEND)
-  endif()
-
-  add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name}
-      COMMAND ${OPT_TOOL} ${link_opt_flags} ${CMAKE_CURRENT_BINARY_DIR}/internalized_${bclib_name}
-                      -o ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name}
-      DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/internalized_${bclib_name}
-      COMMENT "Optimizing LLVM bitcode ${bclib_name}"
-  )
-  if(TARGET opt)
-    add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name}
-      DEPENDS opt
-      APPEND)
-  endif()
-
-  set(bclib_target_name "omptarget-${target_name}-bc")
-  add_custom_target(${bclib_target_name} DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name})
-
-  # Copy library to destination.
-  add_custom_command(TARGET ${bclib_target_name} POST_BUILD
-                    COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name}
-                    ${LIBOMPTARGET_LIBRARY_DIR})
-  add_dependencies(omptarget.devicertl.${target_name} ${bclib_target_name})
-
-  set_property(DIRECTORY APPEND PROPERTY ADDITIONAL_MAKE_CLEAN_FILES ${bclib_name} ${LIBOMPTARGET_LIBRARY_DIR}/${bclib_name})
-
-  # Install bitcode library under the lib destination folder.
-  install(FILES ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name} DESTINATION "${OFFLOAD_INSTALL_LIBDIR}")
-
-  set(target_feature "")
-  if("${target_triple}" STREQUAL "nvptx64-nvidia-cuda")
-    set(target_feature "feature=+ptx63")
-  endif()
-
-  # Package the bitcode in the bitcode and embed it in an ELF for the static library
-  add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/packaged_${bclib_name}
-      COMMAND ${PACKAGER_TOOL} -o ${CMAKE_CURRENT_BINARY_DIR}/packaged_${bclib_name}
-        "--image=file=${CMAKE_CURRENT_BINARY_DIR}/${bclib_name},${target_feature},triple=${target_triple},arch=generic,kind=openmp"
-      DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/${bclib_name}
-      COMMENT "Packaging LLVM offloading binary ${bclib_name}.out"
+  set_property(TARGET omp.${target_name}.all_objs
+               APPEND PROPERTY IMPORTED_OBJECTS ${obj_files})
+
+  # Archive all the object files generated above into a static library
+  add_library(omp.${target_name} STATIC)
+  set_target_properties(omp.${target_name} PROPERTIES
+    ARCHIVE_OUTPUT_DIRECTORY "${LIBOMPTARGET_LLVM_LIBRARY_INTDIR}/${target_triple}"
+    ARCHIVE_OUTPUT_NAME omp
+    LINKER_LANGUAGE CXX
   )
-  if(TARGET clang-offload-packager)
-    add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/packaged_${bclib_name}
-      DEPENDS clang-offload-packager
-      APPEND)
-  endif()
-
-  set(output_name "${CMAKE_CURRENT_BINARY_DIR}/devicertl-${target_name}.o")
-  add_custom_command(OUTPUT ${output_name}
-    COMMAND ${CLANG_TOOL} --std=c++17 -c -nostdlib
-            -Xclang -fembed-offload-object=${CMAKE_CURRENT_BINARY_DIR}/packaged_${bclib_name}
-            -o ${output_name}
-            ${source_directory}/Stub.cpp
-    DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/packaged_${bclib_name} ${source_directory}/Stub.cpp
-    COMMENT "Embedding LLVM offloading binary in devicertl-${target_name}.o"
-    VERBATIM
-  )
-  if(TARGET clang)
-    add_custom_command(OUTPUT ${output_name}
-      DEPENDS clang
-      APPEND)
-  endif()
-
-  set_property(DIRECTORY APPEND PROPERTY ADDITIONAL_MAKE_CLEAN_FILES ${output_name})
-  set_property(TARGET omptarget.devicertl.all_objs APPEND PROPERTY IMPORTED_OBJECTS ${output_name})
+  target_link_libraries(omp.${target_name} PRIVATE omp.${target_name}.all_objs)
+
+  install(TARGETS omp.${target_name}
+          ARCHIVE DESTINATION "lib${LLVM_LIBDIR_SUFFIX}/${target_triple}")
+
+  # Trick to combine these into a bitcode file via the linker's LTO pass. This
+  # is used to provide the legacy `libomptarget-<name>.bc` files.
+  add_executable(libomptarget-${target_name} ${obj_files})
+  set_target_properties(libomptarget-${target_name} PROPERTIES
+    RUNTIME_OUTPUT_DIRECTORY ${LIBOMPTARGET_LLVM_LIBRARY_INTDIR}
+    LINKER_LANGUAGE CXX
+    RUNTIME_OUTPUT_NAME libomptarget-${target_name}.bc)
+  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}")
+  target_link_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}"
+                      "-r" "-nostdlib" "-flto" "-Wl,--lto-emit-llvm")
+  install(TARGETS libomptarget-${target_name}
+          PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ
+          DESTINATION ${OFFLOAD_INSTALL_LIBDIR})
 
   if (CMAKE_EXPORT_COMPILE_COMMANDS)
     set(ide_target_name omptarget-ide-${target_name})
@@ -259,13 +196,3 @@ compileDeviceRTLLibrary(amdgpu amdgcn-amd-amdhsa -Xclang -mcode-object-version=n
 
 add_custom_target(omptarget.devicertl.nvptx)
 compileDeviceRTLLibrary(nvptx nvptx64-nvidia-cuda --cuda-feature=+ptx63)
-
-# Archive all the object files generated above into a static library
-add_library(omptarget.devicertl STATIC)
-set_target_properties(omptarget.devicertl PROPERTIES
-  ARCHIVE_OUTPUT_DIRECTORY "${LIBOMPTARGET_LLVM_LIBRARY_INTDIR}"
-  LINKER_LANGUAGE CXX
-)
-target_link_libraries(omptarget.devicertl PRIVATE omptarget.devicertl.all_objs)
-
-install(TARGETS omptarget.devicertl ARCHIVE DESTINATION ${OFFLOAD_INSTALL_LIBDIR})
diff --git a/offload/test/lit.cfg b/offload/test/lit.cfg
index 658ae5f9653ba90..565edc3e7faeb9d 100644
--- a/offload/test/lit.cfg
+++ b/offload/test/lit.cfg
@@ -183,11 +183,11 @@ def remove_suffix_if_present(name):
 
 def add_libraries(source):
     if config.libomptarget_has_libc:
-        return source + " -Xoffload-linker " + "-lc " + \
-               "-Xoffload-linker " + "-lm " + \
-               config.llvm_library_intdir + "/libomptarget.devicertl.a"
+        return source + " -Xoffload-linker -lc " + \
+               "-Xoffload-linker -lm " + \
+               "-Xoffload-linker -lomp "
     else:
-        return source + " " + config.llvm_library_intdir + "/libomptarget.devicertl.a"
+        return source + " " + "-Xoffload-linker -lomp"
 
 # Add platform targets
 host_targets = [
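
The intent of the `add_libraries` change in lit.cfg can be tried out in isolation with a sketch like the one below. The `has_libc` parameter is a stand-in for `config.libomptarget_has_libc`, and the list-based joining is a choice made for this example rather than the code from the patch.

```python
def add_libraries(source: str, has_libc: bool) -> str:
    """Append device-link flags to a compile command line.

    `has_libc` stands in for `config.libomptarget_has_libc` from lit.cfg.
    """
    flags = []
    if has_libc:
        # Forward the GPU libc and libm to the device link step.
        flags += ["-Xoffload-linker", "-lc", "-Xoffload-linker", "-lm"]
    # The device runtime is requested by name rather than by archive path.
    flags += ["-Xoffload-linker", "-lomp"]
    return " ".join([source] + flags)
```

Building the flags as a list avoids the missing or trailing spaces that are easy to introduce with manual string concatenation.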

@llvmbot
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-clang-driver

Author: Joseph Huber (jhuber6)

@jhuber6 jhuber6 force-pushed the OpenMPLib branch 4 times, most recently from ea890db to cda46da Compare February 7, 2025 03:39
Contributor

shiltian commented Feb 7, 2025

I'm not sure if calling it libomp.a is a good idea. I personally find it confusing because "libomp" is already an established name—it refers to the OpenMP host runtime contributed by Intel, just as "libgomp" is the OpenMP host runtime for GCC. These names are well-recognized. On the other hand, I have no issue with continuing to call it libomptarget.devicertl.a. Other than that, I think this is a great thing to have and a big step forward.

@Meinersbur
Member

I second @shiltian's concern. Host-side and device-side libraries should have different names [1] if there are differences between them. We have host-offloading as well.

Footnotes

  1. Unless they would be interchangeable, i.e. each contains host-side libomp, libomptarget AND device-side devicertl code, if applicable for the respective target.

@jhuber6
Contributor Author

jhuber6 commented Feb 7, 2025

I'm not sure if calling it libomp.a is a good idea. I personally find it confusing because "libomp" is already an established name—it refers to the OpenMP host runtime contributed by Intel, just as "libgomp" is the OpenMP host runtime for GCC. These names are well-recognized. On the other hand, I have no issue with continuing to call it libomptarget.devicertl.a. Other than that, I think this is a great thing to have and a big step forward.

Sure, I can name it something else. Maybe we could just call it libomptarget.a? I don't like the devicertl part.

@Meinersbur
Member

We already have a host-side libomptarget:

add_llvm_library(omptarget

@jhuber6
Contributor Author

jhuber6 commented Feb 7, 2025

We already have a host-side libomptarget:

add_llvm_library(omptarget

We can have conflicting names since they're in separate install directories. I'd like this to have a more 'standard' name if possible, and I feel putting a gpu or something on it is redundant since it's already in the GPU's library dir. To me, the main concern with using libomp was that people might think that it provides the same libomp that the CPU uses.

@saiislam
Contributor

saiislam commented Feb 7, 2025

We already have a host-side libomptarget:

add_llvm_library(omptarget

We can have conflicting names since they're in separate install directories. I'd like this to have a more 'standard' name if possible, and I feel putting a gpu or something on it is redundant since it's already in the GPU's library dir. To me, the main concern with using libomp was that people might think that it provides the same libomp that the CPU uses.

Should it be named as liboffload or liboffload.devicertl?

@jhuber6
Contributor Author

jhuber6 commented Feb 7, 2025

Should it be named as liboffload or liboffload.devicertl?

I don't want offload since I think that should honestly be a separate library. I'll just call it ompdevice and be done with it.

@jhuber6
Contributor Author

jhuber6 commented Feb 7, 2025

I figured that calling it libomptarget was fine because we already have libomptarget-amdgpu.bc so what's the big difference in amdgcn-amd-amdhsa/libomptarget.a. But I can change it.

@jhuber6 jhuber6 force-pushed the OpenMPLib branch 2 times, most recently from e3962f9 to 4d325d6 Compare February 7, 2025 15:44
@jhuber6
Contributor Author

jhuber6 commented Feb 7, 2025

Contributor

We already have a host-side libomptarget:

+1, though as @jhuber6 mentioned, we did prefix the name with libomptarget-nvptx64 or libomptarget-amdgcn before. I think this is a great opportunity to make it more appropriate. However, I don't have a strong opinion against it.

@mgorny
Member

mgorny commented Apr 19, 2025

Clang was built from the same commit id (ee4c8b5).

@mgorny
Member

mgorny commented Apr 19, 2025

How did you disable it? Perhaps it's failing because of the specific error:

$ nvptx-arch
Failed to 'dlopen' libcuda.so.1

For comparison, amdgpu-arch doesn't print anything and exits with 0 status (nvptx-arch yields 1).

@jhuber6
Contributor Author

jhuber6 commented Apr 19, 2025

How did you disable it? Perhaps it's failing because of the specific error:

$ nvptx-arch
Failed to 'dlopen' libcuda.so.1

For comparison, amdgpu-arch doesn't print anything and exits with 0 status (nvptx-arch yields 1).

I just set CUDA_VISIBLE_DEVICES='' to simulate your machine without a GPU. It's confusing here, because this is direct compilation. It should only be invoking nvptx-arch if the user passed -march=native. So, doing this for some random file doesn't work for you?

$ clang foo.c --target=nvptx64-nvidia-cuda -flto -r -Wl,--lto-emit-llvm -o out.bc

For comparison, amdgpu-arch doesn't print anything and exits with 0 status (nvptx-arch yields 1).

Can't decide if we should indicate why it failed, maybe add a verbose mode or something.

@mgorny
Member

mgorny commented Apr 19, 2025

I'm sorry, I've unmerged these packages already after trying an older commit. Lemme build them again.

@mgorny
Member

mgorny commented Apr 19, 2025

It should only be invoking nvptx-arch if the user passed -march=native.

Sorry, didn't notice this sentence. Well, I am building with -march=native here — after all, other files are built for a CPU. If I change it to, say, -march=zen2, then it indeed compiles, with a lot of warnings:

'zen2' is not a recognized processor for this target (ignoring processor)
'zen2' is not a recognized processor for this target (ignoring processor)
'zen2' is not a recognized processor for this target (ignoring processor)
'zen2' is not a recognized processor for this target (ignoring processor)
'zen2' is not a recognized processor for this target (ignoring processor)
'zen2' is not a recognized processor for this target (ignoring processor)

To be honest, this only proves my point — you shouldn't be invoking CMAKE_CXX_COMPILER when you're building for a GPU, since it's a compiler configured to be building for the host CPU.

@jhuber6
Contributor Author

jhuber6 commented Apr 19, 2025

> > It should only be invoking nvptx-arch if the user passed -march=native.
>
> Sorry, didn't notice this sentence. Well, I am building with -march=native here — after all, other files are built for a CPU. If I change it to, say, -march=zen2, then it indeed compiles, with a lot of warnings:
>
> 'zen2' is not a recognized processor for this target (ignoring processor)
> 'zen2' is not a recognized processor for this target (ignoring processor)
> 'zen2' is not a recognized processor for this target (ignoring processor)
> 'zen2' is not a recognized processor for this target (ignoring processor)
> 'zen2' is not a recognized processor for this target (ignoring processor)
> 'zen2' is not a recognized processor for this target (ignoring processor)
>
> To be honest, this only proves my point — you shouldn't be invoking CMAKE_CXX_COMPILER when you're building for a GPU, since it's a compiler configured to be building for the host CPU.

That's why the follow-up to this will move it to a separate build, where you pass CMAKE_<LANG>_COMPILER_TARGET (https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER_TARGET.html) or use the LLVM runtimes target. Right now I guess it's in an awful worst-of-both-worlds state, but I was hoping to get the functional change to the generated code done first so that the next step could be purely an infrastructure change.

That lets you actually invoke CMAKE_CXX_COMPILER correctly which means we can just use normal CMake instead of this weird scenario where we pull clang out of thin air and recreate a compilation flow with custom commands.

I think for now if we just pass -march='' it would override whatever the user passed in.

@mgorny
Member

mgorny commented Apr 19, 2025

Yeah, appending -march= seems to work.

@jhuber6
Contributor Author

jhuber6 commented Apr 19, 2025

> Yeah, appending -march= seems to work.

Is this a functional work-around for now?

diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index cce360236960..277ad9816411 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -132,7 +132,7 @@ function(compileDeviceRTLLibrary target_name target_triple)
     BUILD_RPATH ""
     INSTALL_RPATH ""
     RUNTIME_OUTPUT_NAME libomptarget-${target_name}.bc)
-  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}")
+  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}" "-march=''")
   target_link_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}"
                       "-r" "-nostdlib" "-flto" "-Wl,--lto-emit-llvm")
   install(TARGETS libomptarget-${target_name}

@mgorny
Member

mgorny commented Apr 19, 2025

No, I'm afraid that didn't change anything. However, it did if I added it to target_link_options too.

That said, you want to instead:

--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -132,7 +132,7 @@ function(compileDeviceRTLLibrary target_name target_triple)
     BUILD_RPATH ""
     INSTALL_RPATH ""
     RUNTIME_OUTPUT_NAME libomptarget-${target_name}.bc)
-  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}")
+  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}" "-march=")
   target_link_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}"
-                      "-r" "-nostdlib" "-flto" "-Wl,--lto-emit-llvm")
+                      "-r" "-nostdlib" "-flto" "-Wl,--lto-emit-llvm" "-march=")
   install(TARGETS libomptarget-${target_name}

without the '' since that gets passed literally, and triggers warnings about '' being an invalid target :-).
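
The quoting difference can be reproduced in a plain shell. A minimal sketch (`show_args` is a hypothetical helper, used only for illustration): at a shell prompt the empty quotes are removed before the program runs, whereas a string handed over as one literal argument, which is the behavior reported above for the CMake option, keeps them.

```shell
# show_args is a hypothetical helper that prints each argument it receives
# in brackets, one per line.
show_args() {
  printf '[%s]\n' "$@"
}

# Typed at a shell prompt: the shell strips the empty quotes, so the callee
# receives the bare option.
show_args -march=''

# Passed as one literal string, as reported for the CMake option above:
# the two quote characters survive and reach the compiler.
show_args "-march=''"
```

This is why the bare `-march=` form is the one to put in `target_compile_options` and `target_link_options`.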

@jhuber6
Contributor Author

jhuber6 commented Apr 19, 2025

> No, I'm afraid that didn't change anything. However, it did if I added it to target_link_options too.
>
> That said, you want to instead:
>
> --- a/offload/DeviceRTL/CMakeLists.txt
> +++ b/offload/DeviceRTL/CMakeLists.txt
> @@ -132,7 +132,7 @@ function(compileDeviceRTLLibrary target_name target_triple)
>      BUILD_RPATH ""
>      INSTALL_RPATH ""
>      RUNTIME_OUTPUT_NAME libomptarget-${target_name}.bc)
> -  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}")
> +  target_compile_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}" "-march=")
>    target_link_options(libomptarget-${target_name} PRIVATE "--target=${target_triple}"
> -                      "-r" "-nostdlib" "-flto" "-Wl,--lto-emit-llvm")
> +                      "-r" "-nostdlib" "-flto" "-Wl,--lto-emit-llvm" "-march=")
>    install(TARGETS libomptarget-${target_name}
>
> without the '' since that gets passed literally, and triggers warnings about '' being an invalid target :-).

Alright, thanks. Do you want to make a PR for that? Sorry this is a little disruptive, but I think it's long overdue. After I land the follow-up, you'll need to do a separate standalone build for the GPU portions, since they'd be considered different cross-compiling libraries. It's a much more straightforward model, and I'm trying to move all the existing GPU runtimes to it.

@mgorny
Member

mgorny commented Apr 19, 2025

Yeah, I can do that. I suppose a separate build would be cleaner in the long run — and hopefully it will let us build the CPU part cleanly with GCC xP.

@mgorny
Member

mgorny commented Apr 19, 2025

#136442

mgorny added a commit to mgorny/llvm-project that referenced this pull request Apr 19, 2025
Unset `-march` when invoking the compiler and linker to build the GPU
libraries.  These libraries use GPU targets rather than CPU targets,
and an incidental `-march=native` causes Clang to try to detect the GPU
in use, which makes the build fail when no GPU is available.  Resetting
`-march=` should suffice to revert to building generic code for the
time being.

See the discussion in:
llvm#126143 (comment)
mgorny added a commit that referenced this pull request Apr 20, 2025
@nikic
Contributor

nikic commented Apr 24, 2025

A naive question from someone who is not familiar with this area: Is any of this stuff usable with anything but a matching version of clang? If no, can we place these things in the clang resource directory, where the other version-bound runtimes live?

@jhuber6
Contributor Author

jhuber6 commented Apr 24, 2025

> A naive question from someone who is not familiar with this area: Is any of this stuff usable with anything but a matching version of clang? If no, can we place these things in the clang resource directory, where the other version-bound runtimes live?

It's not intended to be, since we tend to use clang features as we add them. My understanding is that language runtimes go in the normal lib/<triple> directory, while things relating to the compiler go in the resource directory. If people want to move them we certainly could, but I found this analogous to things like libc, libc++, and libomp, which normally go in this directory.


add_library(omptarget.${target_name}.all_objs OBJECT IMPORTED)
set_property(TARGET omptarget.${target_name}.all_objs APPEND PROPERTY IMPORTED_OBJECTS
${LIBOMPTARGET_LLVM_LIBRARY_INTDIR}/libomptarget-${target_name}.bc)
Contributor

Getting

make[5]: *** No rule to make target '/vast/users/yeluo/opt/llvm-clang/build_mirror_offload_nightly/lib/libomptarget-amdgpu.bc', needed by '/vast/users/yeluo/opt/llvm-clang/build_mirror_offload_nightly/lib/amdgcn-amd-amdhsa/libompdevice.a'.  Stop.
make[4]: *** [CMakeFiles/Makefile2:17388: offload/DeviceRTL/CMakeFiles/omptarget.amdgpu.dir/all] Error 2

Target files produced within the project (libomptarget-amdgpu.bc in this case) cannot be used as imported objects, because such a file may not exist yet when something needs it, here the omptarget.amdgpu target.

ye-luo added a commit that referenced this pull request May 5, 2025
Labels: clang:driver, clang, offload