[SYCL]Link Fallback Device Libraries On Demand #1787

jinge90 · 2020-05-30T07:58:26Z

Signed-off-by: gejin [email protected]

jinge90 · 2020-05-30T08:10:17Z

Hi, all. This patch implements the fallback device library(SPIRV files) on-demand loading.
The idea is:

After llvm-mc linking all kernel code into final device image, sycl-post-link tool will go through all modules to find all undefined "_devicelib*" function calls.
For each undefined "_devicelib*" functional call detected, we will check which fallback device library is required to support it.
After collecting all device libraries required, we will create an uint32_t "device library require mask" whose each bit corresponds to one fallback device library.
We store the "device library require mask" into "SYCL/devicelib req mask" property via sycl toolchain's "property" mechanism.
When sycl runtime loads device images, it will read the "device library require mask" and only load/link necessary fallback device libraries. If program doesn't use any device library functions, no fallback SPIRV file will be linked.
Hi, @vzakhari
Could you help evaluate the potential risk in OMP's perspective, I expect no impact as sycl runtime is not used in OMP offloading.
Thank you very much.

Signed-off-by: gejin <[email protected]>

…ib_on_demand

asavonic

LGTM.

llvm/tools/sycl-post-link/sycl-post-link.cpp

vzakhari · 2020-06-01T21:17:11Z

Hi @jinge90, you are right, there is no interaction with OpenMP currently.

Just curious, have you considered running dead code elimination in sycl-post-link so that it removes unused devicelib wrappers before propagating the property?

Can you please add header comments in devicelib files mentioning internal structures of sycl-post-linlk that need to be updated, if new __devicelib_ APIs are added? It is probably also worth updating devicelib documentation.

kbobrovs · 2020-06-01T22:40:58Z

After collecting all device libraries required, we will create an uint32_t "device library require mask" whose each bit corresponds to one fallback device library.

Why this is uint32_t bitmask rather than a list of symbolic library ids like in ELF? If this is an optimization, I feel that it won't bring tangible speedup, but complicate code.

UPDATE: I withdraw this comment, as integer identification have existed before.

llvm/tools/sycl-post-link/sycl-post-link.cpp

llvm/tools/sycl-post-link/DeviceLibFunctions.h

llvm/tools/sycl-post-link/sycl-post-link.cpp

llvm/tools/sycl-post-link/DeviceLibFunctions.h

llvm/tools/sycl-post-link/sycl-post-link.cpp

sycl/source/detail/program_manager/program_manager.cpp

llvm/tools/sycl-post-link/sycl-post-link.cpp

kbobrovs

see comments in the code

jinge90 · 2020-06-02T01:04:48Z

Hi @jinge90, you are right, there is no interaction with OpenMP currently.

Just curious, have you considered running dead code elimination in sycl-post-link so that it removes unused devicelib wrappers before propagating the property?

Can you please add header comments in devicelib files mentioning internal structures of sycl-post-linlk that need to be updated, if new __devicelib_ APIs are added? It is probably also worth updating devicelib documentation.

Hi, @vzakhari , I have considered dead code elimination and drafted a simple patch in sycl-post-link. I will send the patch to you later.

…ib_on_demand

Signed-off-by: Ge Jin <[email protected]>

…elib_on_demand

Signed-off-by: gejin <[email protected]>

…ib_on_demand

… sycl runtime Signed-off-by: gejin <[email protected]>

Signed-off-by: gejin <[email protected]>

libdevice/fallback-cmath-fp64.cpp

kbobrovs · 2020-06-24T18:46:58Z

llvm/include/llvm/Support/SYCLRTShared.h

@@ -0,0 +1,23 @@
+//=- SYCLRTShared.h - Shared definition between llvm tools and SYCL runtime -=//


Thanks!
@bader, @andykaylor, @romanovvlad - does this approach looks good to you? Is llvm::util::sycl a good namespace?

@jinge90 , I suggest to add a comments what this file is

shared type definitions between llvm tools and SYCL RT library

definitions which introduce SYCL RT - LLVM linkage dependencies should not be added here

This seems OK to me.

Hi, @kbobrovs
I have added the comments in SYCLRTShared.h.

@kbobrovs, thanks for pinging me and sorry for late feedback.

I don't fully understand the design of this "linking on demand" solution (it would be great to extend https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document), but this concerns me.
It seems like we bake into LLVM compiler framework the information about Intel specific OpenCL extensions. @andykaylor, do you think this design can be accepted by the LLVM community? I think community prefers scaleable solutions (e.g. framework for linking arbitrary libraries), whereas current implementation is able to address very specific SYCL use case.

It would be great to get a community feedback on this approach through RFC.

It seems like we bake into LLVM compiler framework the information about Intel specific OpenCL extensions.

@bader, under "Intel specific OpenCL extensions" do you mean SYCL as a whole or just the "linking on demand part"?
If the former (SYCL), then in my view we already have baked a lot of SYCL into LLVM :) (for good reason) - take clang FE, for example, offload tools - bundler, wrapper, SYCL specific passes in LLVM itself. Maybe the location "llvm/Support/SYCLRTShared.h" is not ideal and there could be new folder with "SYCL" in it to underline SYCL specifics?

If the latter - then SYCLRTShared.h is not specific to this extension, it serves for sharing some type and string literal definitions between LLVM tools and SYCL runtime.

It would be great to get a community feedback on this approach through RFC.

Do you think this should be done before this PR can be merged or can be done in parallel?

It seems like we bake into LLVM compiler framework the information about Intel specific OpenCL extensions.

@bader, under "Intel specific OpenCL extensions" do you mean SYCL as a whole or just the "linking on demand part"?

cl_intel_devicelib_assert, cl_intel_devicelib_math, cl_intel_devicelib_math_fp64, cl_intel_devicelib_complex, cl_intel_devicelib_complex_fp64 - are Intel specific OpenCL extensions.

If the former (SYCL), then in my view we already have baked a lot of SYCL into LLVM :) (for good reason) - take clang FE, for example, offload tools - bundler, wrapper, SYCL specific passes in LLVM itself. Maybe the location "llvm/Support/SYCLRTShared.h" is not ideal and there could be new folder with "SYCL" in it to underline SYCL specifics?

If the latter - then SYCLRTShared.h is not specific to this extension, it serves for sharing some type and string literal definitions between LLVM tools and SYCL runtime.

Can we apply the same approach as regular C++ compiler does? I.e. link with the whole C++ standard library (using LLVM linker) + provide an compiler option to disable SYCL extension and skip linking with this libraries? This should much easier to implement and maintain as it should not require extending LLVM functionality.

It would be great to get a community feedback on this approach through RFC.

Do you think this should be done before this PR can be merged or can be done in parallel?

Ideally the review should happen before the merge, but if there are any time constraints we can merge and re-do later if needed.

Hi, @kbobrovs @andykaylor @bader
This patch is used to fix some perf regression for cases which has "small" kernel execution time and we are required to fix it ASAP recently, could we merge it and continue to work on the RFC to collect community's feedback?
And I think it is valuable to add a header to share information between sycl runtime and llvm-tools. Currently, the header is located in llvm/Support/SYCLRTShared.h and following things are included: Property constant string such as "SYCL/specialization constants"; DeviceLibExt enum definition such as "cl_intel_devicelib_***"
We need a careful analysis and discussion on following points:

The header's filename and location?

The namespace used in this header?

Which definitions or constants can be placed in this file?

Can intel-specific things be placed in this header?
Thank you very much.

I'm not convinced that this is the right approach to "fix performance regression". Moving some runtime overhead to compile time (e.g. #1398) seems to be a better option.
What is the root cause of the regression? Please, add a problem statement to the PR description.

llvm/tools/sycl-post-link/sycl-post-link.cpp

sycl/include/CL/sycl/detail/pi.h

kbobrovs · 2020-06-24T19:10:02Z

sycl/test/program_manager/env_vars.cpp

@@ -5,15 +5,12 @@
 //
 // RUN: %CPU_RUN_PLACEHOLDER SYCL_PROGRAM_COMPILE_OPTIONS="-g" %t.out
 // RUN: %GPU_RUN_PLACEHOLDER SYCL_PROGRAM_COMPILE_OPTIONS="-g" %t.out
-// RUN: %CPU_RUN_PLACEHOLDER SYCL_PROGRAM_LINK_OPTIONS="-enable-link-options -cl-denorms-are-zero" %t.out
-// RUN: %GPU_RUN_PLACEHOLDER SYCL_PROGRAM_LINK_OPTIONS="-enable-link-options -cl-denorms-are-zero" %t.out


why is this removed?

Hi, @kbobrovs
Currently, all program's kernel code will be linked with our devicelib as default, so sycl runtime will use piProgramLink to link all spv images. However, after we enabled devicelib "on-demand" loading, sycl runtime will use piProgramBuild to build users' device image. For OCL backend, piProgramBuild will be mapped to OCL's clBuildProgram API. According to OCL Spec, we can only pass "Compiler options" to this API and "Link Options" is not used.
To this test, the source code doesn't depend on devicelib, so I remove all test commands using "SYCL_PROGRAM_LINK_OPTIONS".
However, current status is a little tricky, we have to keep loading "assert" fallback spv as default to workaround another separate issue, so sycl runtime will still use piProgramCompile + piProgramLink for all program's device code. Do you think we need to keep those test commands for now?
Thank you very much.

I think the problems you describe are implementation detail and should be fixed. User setting SYCL_PROGRAM_LINK_OPTIONS= should not lead to crashes if link options are good. If that's the case, the implementation should be fixed, I think.

Hi, @kbobrovs
For this case, those link options will lead to crash in OCL CPU runtime, the problem is in "-enable-link-options". If no device libraries will be loaded and linked with user's kernel code, sycl runtime will use piProgramBuild for jit compilation which will be mapped to clBuildProgram. According to OCL Spec, "-enable-link-options" must be specified with "--create-library" option. I think we have 2 ways to deal with the case:

If sycl runtime uses piProgramBuild, the SYCL_PROGRAM_LINK_OPTIONS env variable value will be ignored, we only pass SYCL_PROGRAM_COMPILER_OPTIONS value to underlying OCL runtime.

We keep passing "SYCL_PROGRAM_LINK_OPTIONS" to underlying OCL runtime even if piProgramBuild is used, it's user's duty to set "good" SYCL_PROGRAM_LINK_OPTIONS. We need to fix this case to remove "-enable-link-options".
Which one do you prefer or do you have any other idea?
Thank you very much.

According to OCL Spec, "-enable-link-options" must be specified with "--create-library" option.

I see, thanks. This makes the test invalid, so removal is OK then.
2) seems better solution. But for this PR just removal of the two invalid test cases (as you did before) should be enough

llvm/tools/sycl-post-link/sycl-post-link.cpp

Signed-off-by: gejin <[email protected]>

llvm/tools/sycl-post-link/sycl-post-link.cpp

llvm/include/llvm/Support/SYCLRTShared.h

llvm/include/llvm/Support/PropertySetIO.h

kbobrovs · 2020-06-25T18:26:35Z

llvm/tools/sycl-post-link/sycl-post-link.cpp

+using llvm::util::sycl::cl_intel_devicelib_assert;
+using llvm::util::sycl::cl_intel_devicelib_complex;
+using llvm::util::sycl::cl_intel_devicelib_complex_fp64;
+using llvm::util::sycl::cl_intel_devicelib_math;
+using llvm::util::sycl::cl_intel_devicelib_math_fp64;


somehow my previous comment on this disappeared. Please don't using all the individual enum elements.

BTW, I see that you add merge commits when you implement review comments:

Merge remote-tracking branch 'upstream/sycl' into link_fallback_devic… …

This does not seem right to me. Maybe @bader can comment more. I think this can lead to missing review comments. You should only use git push/git pull when working with the review, so that jinge90:link_fallback_devicelib_on_demand does not have merge commits. Only if you have conflicts, you should rebase your feature branch. I do it like this:

git rebase <hash> --onto intel_llvm/sycl

where is the commit right below the first feature branch commit in git log output.

Hi, @kbobrovs
Thank you for the remind. I updated patch to use c++11 scoped enum for DeviceLibExt, the definition looks like:
enum class DeviceLibExt : std::uint32_t {cl_intel_devicelib_assert,.....}
By doing this, we don't need to "using" all individual enum elements, we only need to "using llvm::util::sycl::DeviceLibExt;" in source code and use "DeviceLibExt::cl_intel_devicelib_*" to refer to all DeviceLibExt enum elements. Is this OK to you?

Yes, thank you.

kbobrovs · 2020-06-25T18:39:58Z

sycl/source/detail/program_manager/program_manager.hpp

+using llvm::util::sycl::cl_intel_devicelib_assert;
+using llvm::util::sycl::cl_intel_devicelib_complex;
+using llvm::util::sycl::cl_intel_devicelib_complex_fp64;
+using llvm::util::sycl::cl_intel_devicelib_math;
+using llvm::util::sycl::cl_intel_devicelib_math_fp64;
+using llvm::util::sycl::DeviceLibExt;


please don't using individual enum elements

kbobrovs · 2020-06-25T18:44:48Z

sycl/test/program_manager/env_vars.cpp

@@ -5,15 +5,12 @@
 //
 // RUN: %CPU_RUN_PLACEHOLDER SYCL_PROGRAM_COMPILE_OPTIONS="-g" %t.out
 // RUN: %GPU_RUN_PLACEHOLDER SYCL_PROGRAM_COMPILE_OPTIONS="-g" %t.out
-// RUN: %CPU_RUN_PLACEHOLDER SYCL_PROGRAM_LINK_OPTIONS="-enable-link-options -cl-denorms-are-zero" %t.out
-// RUN: %GPU_RUN_PLACEHOLDER SYCL_PROGRAM_LINK_OPTIONS="-enable-link-options -cl-denorms-are-zero" %t.out


I think the problems you describe are implementation detail and should be fixed. User setting SYCL_PROGRAM_LINK_OPTIONS= should not lead to crashes if link options are good. If that's the case, the implementation should be fixed, I think.

Signed-off-by: gejin <[email protected]>

kbobrovs

LGTM, thanks

vzakhari

LGTM for devicelib

bader

I'm okay to merge if someone is waiting for this fix. I'd like to have separate discussion to better understand the problem this PR is supposed to solve, because I have serious concerns that we won't be able to upstream this solution.

bader · 2020-07-07T11:40:31Z

@smaslov-intel, @intel/llvm-reviewers-runtime, ping.

jinge90 · 2020-07-07T11:54:45Z

I'm okay to merge if someone is waiting for this fix. I'd like to have separate discussion to better understand the problem this PR is supposed to solve, because I have serious concerns that we won't be able to upstream this solution.

Hi, @bader @kbobrovs @andykaylor
"devicelib on-demand loading" is an optimization for current devicelib infra. And we can implement it without introducing header file to llvm/Support/...(We didn't touch any thing in llvm/ in the original patch) In fact, introducing some mechanism or header files to LLVM framework to share information between SYCL runtime and llvm tools is another separate problem and this problem is much more general than this patch. I agree with Alexey that we need a separate discussion on this problem.
Thank you very much,

kbobrovs · 2020-07-07T17:05:03Z

@jinge90, I'm ok to revert the shared header and return to code duplication if there is no consensus on the solution. If you do this, please restore cross-reference in the comments between the duplicated parts and match naming.

jinge90 · 2020-07-08T06:35:36Z

@jinge90, I'm ok to revert the shared header and return to code duplication if there is no consensus on the solution. If you do this, please restore cross-reference in the comments between the duplicated parts and match naming.

Hi, @kbobrovs
I have reverted the shared header and added the comments to remind sync between sycl runtime and sycl-post-link tool.
Thank you very much.

Signed-off-by: gejin <[email protected]>

smaslov-intel

pi changes look OK

vzakhari

Looks good for devicelib.

jinge90 · 2020-07-10T02:20:36Z

Hi, @kbobrovs @bader @asavonic
I have removed the SYCLRTShared.h to avoid touching LLVM framework, could you help approve again.
Thank you very much.

Original commit: KhronosGroup/SPIRV-LLVM-Translator@74681f6

jinge90 requested review from bader, kbobrovs and smaslov-intel as code owners May 30, 2020 07:58

jinge90 requested review from asavonic, andykaylor and vzakhari May 30, 2020 07:58

jinge90 force-pushed the link_fallback_devicelib_on_demand branch 2 times, most recently from 38dfccb to c6af296 Compare May 31, 2020 07:15

jinge90 added 2 commits May 31, 2020 15:19

[SYCL]Link Fallback Device Libraries On Demand

c6af296

Signed-off-by: gejin <[email protected]>

Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…

3eea5dc

…ib_on_demand

asavonic reviewed Jun 1, 2020

View reviewed changes

llvm/tools/sycl-post-link/sycl-post-link.cpp Outdated Show resolved Hide resolved

kbobrovs reviewed Jun 1, 2020

View reviewed changes

llvm/tools/sycl-post-link/sycl-post-link.cpp Show resolved Hide resolved

kbobrovs reviewed Jun 1, 2020

View reviewed changes

llvm/tools/sycl-post-link/sycl-post-link.cpp Outdated Show resolved Hide resolved

kbobrovs reviewed Jun 1, 2020

View reviewed changes

llvm/tools/sycl-post-link/DeviceLibFunctions.h Outdated Show resolved Hide resolved

kbobrovs reviewed Jun 1, 2020

View reviewed changes

llvm/tools/sycl-post-link/sycl-post-link.cpp Show resolved Hide resolved

kbobrovs reviewed Jun 1, 2020

View reviewed changes

llvm/tools/sycl-post-link/DeviceLibFunctions.h Outdated Show resolved Hide resolved

andykaylor reviewed Jun 1, 2020

View reviewed changes

kbobrovs reviewed Jun 2, 2020

View reviewed changes

llvm/tools/sycl-post-link/sycl-post-link.cpp Outdated Show resolved Hide resolved

kbobrovs suggested changes Jun 2, 2020

View reviewed changes

jinge90 added 2 commits June 4, 2020 13:08

Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…

5c59cdf

…ib_on_demand

load all fallback device libraries when loading kernel from file

698b13d

Signed-off-by: Ge Jin <[email protected]>

jinge90 force-pushed the link_fallback_devicelib_on_demand branch from 0dec3ab to 698b13d Compare June 4, 2020 14:11

Merge remote-tracking branch 'upstream/sycl' into link_fallback_devic…

ab9618c

…elib_on_demand

jinge90 requested a review from a team as a code owner June 9, 2020 16:04

jinge90 added 2 commits June 10, 2020 00:08

[SYCL]Remove Link Options when calling piProgramBuild for device code

35ec2df

Signed-off-by: gejin <[email protected]>

Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…

61943e4

…ib_on_demand

jinge90 added 3 commits June 21, 2020 22:48

Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…

4da5790

…ib_on_demand

[SYCL] Add SYCLRTShared.h to share definitions between llvm tools and…

8377e11

… sycl runtime Signed-off-by: gejin <[email protected]>

Fix clang-format issue

ac472cb

Signed-off-by: gejin <[email protected]>

kbobrovs suggested changes Jun 24, 2020

View reviewed changes

kbobrovs reviewed Jun 24, 2020

View reviewed changes

llvm/tools/sycl-post-link/sycl-post-link.cpp Outdated Show resolved Hide resolved

Replace std::map with std::unordered_map for DeviceLibFuncMap

c7d459c

Signed-off-by: gejin <[email protected]>

kbobrovs suggested changes Jun 25, 2020

View reviewed changes

jinge90 added 3 commits June 28, 2020 11:33

[SYCL]Use scoped enum for DeviceLibExt

a7e8ea4

Signed-off-by: gejin <[email protected]>

[SYCL] Add comments in SYCLRTShared.h

652d15a

Signed-off-by: gejin <[email protected]>

fix format issue

117bebf

Signed-off-by: gejin <[email protected]>

kbobrovs previously approved these changes Jul 1, 2020

View reviewed changes

vzakhari previously approved these changes Jul 2, 2020

View reviewed changes

bader previously approved these changes Jul 7, 2020

View reviewed changes

jinge90 dismissed stale reviews from bader, vzakhari, and kbobrovs via 5445247 July 8, 2020 06:34

remove SYCLRTShared.h to avoid touch LLVM functionality

5445247

Signed-off-by: gejin <[email protected]>

smaslov-intel approved these changes Jul 8, 2020

View reviewed changes

vzakhari approved these changes Jul 8, 2020

View reviewed changes

bader approved these changes Jul 10, 2020

View reviewed changes

asavonic approved these changes Jul 10, 2020

View reviewed changes

kbobrovs approved these changes Jul 14, 2020

View reviewed changes

bader merged commit 9a8864c into intel:sycl Jul 14, 2020

jinge90 deleted the link_fallback_devicelib_on_demand branch January 26, 2021 01:21

bb-sycl pushed a commit that referenced this pull request Jan 18, 2023

Fix issues reported by static analyzer tool (#1787)

551a56d

Original commit: KhronosGroup/SPIRV-LLVM-Translator@74681f6

		@@ -0,0 +1,23 @@
		//=- SYCLRTShared.h - Shared definition between llvm tools and SYCL runtime -=//

[SYCL]Link Fallback Device Libraries On Demand #1787

[SYCL]Link Fallback Device Libraries On Demand #1787

Conversation

jinge90 commented May 30, 2020

jinge90 commented May 30, 2020

asavonic left a comment

Choose a reason for hiding this comment

vzakhari commented Jun 1, 2020

kbobrovs commented Jun 1, 2020 • edited Loading

kbobrovs left a comment

Choose a reason for hiding this comment

jinge90 commented Jun 2, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kbobrovs Jul 2, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jinge90 Jul 7, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kbobrovs left a comment

Choose a reason for hiding this comment

vzakhari left a comment

Choose a reason for hiding this comment

bader left a comment

Choose a reason for hiding this comment

bader commented Jul 7, 2020

jinge90 commented Jul 7, 2020

kbobrovs commented Jul 7, 2020

jinge90 commented Jul 8, 2020

smaslov-intel left a comment

Choose a reason for hiding this comment

vzakhari left a comment

Choose a reason for hiding this comment

jinge90 commented Jul 10, 2020

kbobrovs commented Jun 1, 2020 •

edited

Loading

kbobrovs Jul 2, 2020 •

edited

Loading

jinge90 Jul 7, 2020 •

edited

Loading