Skip to content

[SYCL]Link Fallback Device Libraries On Demand #1787

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Jul 14, 2020
Merged
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
c6af296
[SYCL]Link Fallback Device Libraries On Demand
jinge90 May 30, 2020
3eea5dc
Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…
jinge90 Jun 1, 2020
5c59cdf
Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…
jinge90 Jun 4, 2020
698b13d
load all fallback device libraries when loading kernel from file
jinge90 Jun 4, 2020
ab9618c
Merge remote-tracking branch 'upstream/sycl' into link_fallback_devic…
jinge90 Jun 9, 2020
35ec2df
[SYCL]Remove Link Options when calling piProgramBuild for device code
jinge90 Jun 9, 2020
61943e4
Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…
jinge90 Jun 10, 2020
893b699
[SYCL]Update FileCheck target string for jit compilation.
jinge90 Jun 10, 2020
8d7ec71
Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…
jinge90 Jun 10, 2020
7686038
[SYCL]Link libsycl-fallback-assert.spv as default.
jinge90 Jun 12, 2020
b8f1037
Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…
jinge90 Jun 12, 2020
c093963
[SYCL]Use Map to store devicelib func info
jinge90 Jun 17, 2020
8d20764
[SYCL] add comments to remind to sync between libdevice and sycl-post…
jinge90 Jun 18, 2020
bfa8c2c
[SYCL]Refactor Property Saving
jinge90 Jun 19, 2020
4da5790
Merge remote-tracking branch 'origin/sycl' into link_fallback_devicel…
jinge90 Jun 21, 2020
8377e11
[SYCL] Add SYCLRTShared.h to share definitions between llvm tools and…
jinge90 Jun 24, 2020
ac472cb
Fix clang-format issue
jinge90 Jun 24, 2020
c7d459c
Replace std::map with std::unordered_map for DeviceLibFuncMap
jinge90 Jun 25, 2020
a7e8ea4
[SYCL]Use scoped enum for DeviceLibExt
jinge90 Jun 28, 2020
652d15a
[SYCL] Add comments in SYCLRTShared.h
jinge90 Jun 30, 2020
117bebf
fix format issue
jinge90 Jun 30, 2020
5445247
remove SYCLRTShared.h to avoid touch LLVM functionality
jinge90 Jul 8, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions libdevice/fallback-cmath-fp64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@

#ifdef __SPIR__

// To support fallback device libraries on-demand loading, please update the
// DeviceLibFuncMap in llvm/tools/sycl-post-link/sycl-post-link.cpp if you add
// or remove any item in this file.
DEVICE_EXTERN_C
double __devicelib_log(double x) { return __spirv_ocl_log(x); }

Expand Down
5 changes: 5 additions & 0 deletions libdevice/fallback-cmath.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@

#ifdef __SPIR__

// To support fallback device libraries on-demand loading, please update the
// DeviceLibFuncMap in llvm/tools/sycl-post-link/sycl-post-link.cpp if you add
// or remove any item in this file.
// TODO: generate the DeviceLibFuncMap in sycl-post-link.cpp automatically
// during the build based on libdevice to avoid manually sync.
DEVICE_EXTERN_C
float __devicelib_scalbnf(float x, int n) { return __spirv_ocl_ldexp(x, n); }

Expand Down
3 changes: 3 additions & 0 deletions libdevice/fallback-complex-fp64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
#ifdef __SPIR__
#include <cmath>

// To support fallback device libraries on-demand loading, please update the
// DeviceLibFuncMap in llvm/tools/sycl-post-link/sycl-post-link.cpp if you add
// or remove any item in this file.
DEVICE_EXTERN_C
double __devicelib_creal(double __complex__ z) { return __real__(z); }

Expand Down
3 changes: 3 additions & 0 deletions libdevice/fallback-complex.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
#ifdef __SPIR__
#include <cmath>

// To support fallback device libraries on-demand loading, please update the
// DeviceLibFuncMap in llvm/tools/sycl-post-link/sycl-post-link.cpp if you add
// or remove any item in this file.
DEVICE_EXTERN_C
float __devicelib_crealf(float __complex__ z) { return __real__(z); }

Expand Down
3 changes: 2 additions & 1 deletion llvm/include/llvm/Support/PropertySetIO.h
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,7 @@ class PropertySetRegistry {
// Specific property category names used by tools.
static constexpr char SYCL_SPECIALIZATION_CONSTANTS[] =
"SYCL/specialization constants";
static constexpr char SYCL_DEVICELIB_REQ_MASK[] = "SYCL/devicelib req mask";

// Function for bulk addition of an entire property set under given category
// (property set name).
Expand Down Expand Up @@ -160,4 +161,4 @@ class PropertySetRegistry {
} // namespace util
} // namespace llvm

#endif // #define LLVM_SUPPORT_PROPERTYSETIO_H
#endif // #define LLVM_SUPPORT_PROPERTYSETIO_H
29 changes: 29 additions & 0 deletions llvm/include/llvm/Support/SYCLRTShared.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
//=- SYCLRTShared.h - Shared definition between llvm tools and SYCL runtime -=//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
@bader, @andykaylor, @romanovvlad - does this approach looks good to you? Is llvm::util::sycl a good namespace?

@jinge90 , I suggest to add a comments what this file is

  • shared type definitions between llvm tools and SYCL RT library
  • definitions which introduce SYCL RT - LLVM linkage dependencies should not be added here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems OK to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @kbobrovs
I have added the comments in SYCLRTShared.h.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbobrovs, thanks for pinging me and sorry for late feedback.

I don't fully understand the design of this "linking on demand" solution (it would be great to extend https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document), but this concerns me.
It seems like we bake into LLVM compiler framework the information about Intel specific OpenCL extensions. @andykaylor, do you think this design can be accepted by the LLVM community? I think community prefers scaleable solutions (e.g. framework for linking arbitrary libraries), whereas current implementation is able to address very specific SYCL use case.

It would be great to get a community feedback on this approach through RFC.

Copy link
Contributor

@kbobrovs kbobrovs Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we bake into LLVM compiler framework the information about Intel specific OpenCL extensions.

@bader, under "Intel specific OpenCL extensions" do you mean SYCL as a whole or just the "linking on demand part"?
If the former (SYCL), then in my view we already have baked a lot of SYCL into LLVM :) (for good reason) - take clang FE, for example, offload tools - bundler, wrapper, SYCL specific passes in LLVM itself. Maybe the location "llvm/Support/SYCLRTShared.h" is not ideal and there could be new folder with "SYCL" in it to underline SYCL specifics?

If the latter - then SYCLRTShared.h is not specific to this extension, it serves for sharing some type and string literal definitions between LLVM tools and SYCL runtime.

It would be great to get a community feedback on this approach through RFC.

Do you think this should be done before this PR can be merged or can be done in parallel?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we bake into LLVM compiler framework the information about Intel specific OpenCL extensions.

@bader, under "Intel specific OpenCL extensions" do you mean SYCL as a whole or just the "linking on demand part"?

cl_intel_devicelib_assert, cl_intel_devicelib_math, cl_intel_devicelib_math_fp64, cl_intel_devicelib_complex, cl_intel_devicelib_complex_fp64 - are Intel specific OpenCL extensions.

If the former (SYCL), then in my view we already have baked a lot of SYCL into LLVM :) (for good reason) - take clang FE, for example, offload tools - bundler, wrapper, SYCL specific passes in LLVM itself. Maybe the location "llvm/Support/SYCLRTShared.h" is not ideal and there could be new folder with "SYCL" in it to underline SYCL specifics?

If the latter - then SYCLRTShared.h is not specific to this extension, it serves for sharing some type and string literal definitions between LLVM tools and SYCL runtime.

Can we apply the same approach as regular C++ compiler does? I.e. link with the whole C++ standard library (using LLVM linker) + provide an compiler option to disable SYCL extension and skip linking with this libraries? This should much easier to implement and maintain as it should not require extending LLVM functionality.

It would be great to get a community feedback on this approach through RFC.

Do you think this should be done before this PR can be merged or can be done in parallel?

Ideally the review should happen before the merge, but if there are any time constraints we can merge and re-do later if needed.

Copy link
Contributor Author

@jinge90 jinge90 Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @kbobrovs @andykaylor @bader
This patch is used to fix some perf regression for cases which has "small" kernel execution time and we are required to fix it ASAP recently, could we merge it and continue to work on the RFC to collect community's feedback?
And I think it is valuable to add a header to share information between sycl runtime and llvm-tools. Currently, the header is located in llvm/Support/SYCLRTShared.h and following things are included: Property constant string such as "SYCL/specialization constants"; DeviceLibExt enum definition such as "cl_intel_devicelib_***"
We need a careful analysis and discussion on following points:

  1. The header's filename and location?
  2. The namespace used in this header?
  3. Which definitions or constants can be placed in this file?
  4. Can intel-specific things be placed in this header?
    Thank you very much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that this is the right approach to "fix performance regression". Moving some runtime overhead to compile time (e.g. #1398) seems to be a better option.
What is the root cause of the regression? Please, add a problem statement to the PR description.

//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_SUPPORT_SYCLRTSHARED_H
#define LLVM_SUPPORT_SYCLRTSHARED_H
/// Device binary image property set names recognized by the SYCL runtime.
/// Name must be consistent with
/// PropertySetRegistry::SYCL_SPECIALIZATION_CONSTANTS defined in
/// PropertySetIO.h
#define PI_PROPERTY_SET_SPEC_CONST_MAP "SYCL/specialization constants"
#define PI_PROPERTY_SET_DEVICELIB_REQ_MASK "SYCL/devicelib req mask"
namespace llvm {
namespace util {
namespace sycl {
enum DeviceLibExt {
cl_intel_devicelib_assert = 0,
cl_intel_devicelib_math,
cl_intel_devicelib_math_fp64,
cl_intel_devicelib_complex,
cl_intel_devicelib_complex_fp64
};
}
} // namespace util
} // namespace llvm
#endif
1 change: 1 addition & 0 deletions llvm/lib/Support/PropertySetIO.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -112,5 +112,6 @@ template <> PropertyValue::Type PropertyValue::getTypeTag<uint32_t>() {
}

constexpr char PropertySetRegistry::SYCL_SPECIALIZATION_CONSTANTS[];
constexpr char PropertySetRegistry::SYCL_DEVICELIB_REQ_MASK[];
} // namespace util
} // namespace llvm
2 changes: 1 addition & 1 deletion llvm/lib/Support/SimpleTable.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ int SimpleTable::getColumnId(StringRef ColName) const {

Error SimpleTable::addColumnName(StringRef ColName) {
if (ColumnName2Num.find(ColName) != ColumnName2Num.end())
return makeError("column already exists" + ColName);
return makeError("column already exists " + ColName);
ColumnNames.emplace_back(ColName.str());
ColumnName2Num[ColumnNames.back()] = static_cast<int>(ColumnNames.size()) - 1;
ColumnNum2Name.push_back(std::prev(ColumnNames.end()));
Expand Down
227 changes: 210 additions & 17 deletions llvm/tools/sycl-post-link/sycl-post-link.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/PropertySetIO.h"
#include "llvm/Support/SYCLRTShared.h"
#include "llvm/Support/SimpleTable.h"
#include "llvm/Support/SystemUtils.h"
#include "llvm/Support/WithColor.h"
Expand All @@ -36,6 +37,11 @@
#include <memory>

using namespace llvm;
using llvm::util::sycl::cl_intel_devicelib_assert;
using llvm::util::sycl::cl_intel_devicelib_complex;
using llvm::util::sycl::cl_intel_devicelib_complex_fp64;
using llvm::util::sycl::cl_intel_devicelib_math;
using llvm::util::sycl::cl_intel_devicelib_math_fp64;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somehow my previous comment on this disappeared. Please don't using all the individual enum elements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I see that you add merge commits when you implement review comments:

Merge remote-tracking branch 'upstream/sycl' into link_fallback_devic… …

This does not seem right to me. Maybe @bader can comment more. I think this can lead to missing review comments. You should only use git push/git pull when working with the review, so that jinge90:link_fallback_devicelib_on_demand does not have merge commits. Only if you have conflicts, you should rebase your feature branch. I do it like this:

git rebase <hash> --onto intel_llvm/sycl

where is the commit right below the first feature branch commit in git log output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @kbobrovs
Thank you for the remind. I updated patch to use c++11 scoped enum for DeviceLibExt, the definition looks like:
enum class DeviceLibExt : std::uint32_t {cl_intel_devicelib_assert,.....}
By doing this, we don't need to "using" all individual enum elements, we only need to "using llvm::util::sycl::DeviceLibExt;" in source code and use "DeviceLibExt::cl_intel_devicelib_*" to refer to all DeviceLibExt enum elements. Is this OK to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you.


using string_vector = std::vector<std::string>;
using SpecIDMapTy = std::map<StringRef, unsigned>;
Expand All @@ -47,7 +53,7 @@ cl::OptionCategory PostLinkCat{"sycl-post-link options"};
static constexpr char COL_CODE[] = "Code";
static constexpr char COL_SYM[] = "Symbols";
static constexpr char COL_PROPS[] = "Properties";

static constexpr char DEVICELIB_FUNC_PREFIX[] = "__devicelib_";
// InputFilename - The filename to read from.
static cl::opt<std::string> InputFilename{
cl::Positional, cl::desc("<input bitcode file>"), cl::init("-"),
Expand Down Expand Up @@ -104,6 +110,143 @@ static cl::opt<SpecConstMode> SpecConstLower{
"set spec constants to C++ defaults")),
cl::cat(PostLinkCat)};

struct ImagePropSaveInfo {
bool NeedDeviceLibReqMask;
bool DoSpecConst;
bool SetSpecConstAtRT;
bool SpecConstsMet;
};
// Please update DeviceLibFuncMap if any item is added to or removed from
// fallback device libraries in libdevice.
static std::unordered_map<std::string, uint32_t> DeviceLibFuncMap = {
{"__devicelib_acosf", cl_intel_devicelib_math},
{"__devicelib_acoshf", cl_intel_devicelib_math},
{"__devicelib_asinf", cl_intel_devicelib_math},
{"__devicelib_asinhf", cl_intel_devicelib_math},
{"__devicelib_atan2f", cl_intel_devicelib_math},
{"__devicelib_atanf", cl_intel_devicelib_math},
{"__devicelib_atanhf", cl_intel_devicelib_math},
{"__devicelib_cbrtf", cl_intel_devicelib_math},
{"__devicelib_cosf", cl_intel_devicelib_math},
{"__devicelib_coshf", cl_intel_devicelib_math},
{"__devicelib_erfcf", cl_intel_devicelib_math},
{"__devicelib_erff", cl_intel_devicelib_math},
{"__devicelib_exp2f", cl_intel_devicelib_math},
{"__devicelib_expf", cl_intel_devicelib_math},
{"__devicelib_expm1f", cl_intel_devicelib_math},
{"__devicelib_fdimf", cl_intel_devicelib_math},
{"__devicelib_fmaf", cl_intel_devicelib_math},
{"__devicelib_fmodf", cl_intel_devicelib_math},
{"__devicelib_frexpf", cl_intel_devicelib_math},
{"__devicelib_hypotf", cl_intel_devicelib_math},
{"__devicelib_ilogbf", cl_intel_devicelib_math},
{"__devicelib_ldexpf", cl_intel_devicelib_math},
{"__devicelib_lgammaf", cl_intel_devicelib_math},
{"__devicelib_log10f", cl_intel_devicelib_math},
{"__devicelib_log1pf", cl_intel_devicelib_math},
{"__devicelib_log2f", cl_intel_devicelib_math},
{"__devicelib_logbf", cl_intel_devicelib_math},
{"__devicelib_logf", cl_intel_devicelib_math},
{"__devicelib_modff", cl_intel_devicelib_math},
{"__devicelib_nextafterf", cl_intel_devicelib_math},
{"__devicelib_powf", cl_intel_devicelib_math},
{"__devicelib_remainderf", cl_intel_devicelib_math},
{"__devicelib_remquof", cl_intel_devicelib_math},
{"__devicelib_sinf", cl_intel_devicelib_math},
{"__devicelib_sinhf", cl_intel_devicelib_math},
{"__devicelib_sqrtf", cl_intel_devicelib_math},
{"__devicelib_tanf", cl_intel_devicelib_math},
{"__devicelib_tanhf", cl_intel_devicelib_math},
{"__devicelib_tgammaf", cl_intel_devicelib_math},
{"__devicelib_acos", cl_intel_devicelib_math_fp64},
{"__devicelib_acosh", cl_intel_devicelib_math_fp64},
{"__devicelib_asin", cl_intel_devicelib_math_fp64},
{"__devicelib_asinh", cl_intel_devicelib_math_fp64},
{"__devicelib_atan", cl_intel_devicelib_math_fp64},
{"__devicelib_atan2", cl_intel_devicelib_math_fp64},
{"__devicelib_atanh", cl_intel_devicelib_math_fp64},
{"__devicelib_cbrt", cl_intel_devicelib_math_fp64},
{"__devicelib_cos", cl_intel_devicelib_math_fp64},
{"__devicelib_cosh", cl_intel_devicelib_math_fp64},
{"__devicelib_erf", cl_intel_devicelib_math_fp64},
{"__devicelib_erfc", cl_intel_devicelib_math_fp64},
{"__devicelib_exp", cl_intel_devicelib_math_fp64},
{"__devicelib_exp2", cl_intel_devicelib_math_fp64},
{"__devicelib_expm1", cl_intel_devicelib_math_fp64},
{"__devicelib_fdim", cl_intel_devicelib_math_fp64},
{"__devicelib_fma", cl_intel_devicelib_math_fp64},
{"__devicelib_fmod", cl_intel_devicelib_math_fp64},
{"__devicelib_frexp", cl_intel_devicelib_math_fp64},
{"__devicelib_hypot", cl_intel_devicelib_math_fp64},
{"__devicelib_ilogb", cl_intel_devicelib_math_fp64},
{"__devicelib_ldexp", cl_intel_devicelib_math_fp64},
{"__devicelib_lgamma", cl_intel_devicelib_math_fp64},
{"__devicelib_log", cl_intel_devicelib_math_fp64},
{"__devicelib_log10", cl_intel_devicelib_math_fp64},
{"__devicelib_log1p", cl_intel_devicelib_math_fp64},
{"__devicelib_log2", cl_intel_devicelib_math_fp64},
{"__devicelib_logb", cl_intel_devicelib_math_fp64},
{"__devicelib_modf", cl_intel_devicelib_math_fp64},
{"__devicelib_nextafter", cl_intel_devicelib_math_fp64},
{"__devicelib_pow", cl_intel_devicelib_math_fp64},
{"__devicelib_remainder", cl_intel_devicelib_math_fp64},
{"__devicelib_remquo", cl_intel_devicelib_math_fp64},
{"__devicelib_sin", cl_intel_devicelib_math_fp64},
{"__devicelib_sinh", cl_intel_devicelib_math_fp64},
{"__devicelib_sqrt", cl_intel_devicelib_math_fp64},
{"__devicelib_tan", cl_intel_devicelib_math_fp64},
{"__devicelib_tanh", cl_intel_devicelib_math_fp64},
{"__devicelib_tgamma", cl_intel_devicelib_math_fp64},
{"__devicelib___divsc3", cl_intel_devicelib_complex},
{"__devicelib___mulsc3", cl_intel_devicelib_complex},
{"__devicelib_cabsf", cl_intel_devicelib_complex},
{"__devicelib_cacosf", cl_intel_devicelib_complex},
{"__devicelib_cacoshf", cl_intel_devicelib_complex},
{"__devicelib_cargf", cl_intel_devicelib_complex},
{"__devicelib_casinf", cl_intel_devicelib_complex},
{"__devicelib_casinhf", cl_intel_devicelib_complex},
{"__devicelib_catanf", cl_intel_devicelib_complex},
{"__devicelib_catanhf", cl_intel_devicelib_complex},
{"__devicelib_ccosf", cl_intel_devicelib_complex},
{"__devicelib_ccoshf", cl_intel_devicelib_complex},
{"__devicelib_cexpf", cl_intel_devicelib_complex},
{"__devicelib_cimagf", cl_intel_devicelib_complex},
{"__devicelib_clogf", cl_intel_devicelib_complex},
{"__devicelib_cpolarf", cl_intel_devicelib_complex},
{"__devicelib_cpowf", cl_intel_devicelib_complex},
{"__devicelib_cprojf", cl_intel_devicelib_complex},
{"__devicelib_crealf", cl_intel_devicelib_complex},
{"__devicelib_csinf", cl_intel_devicelib_complex},
{"__devicelib_csinhf", cl_intel_devicelib_complex},
{"__devicelib_csqrtf", cl_intel_devicelib_complex},
{"__devicelib_ctanf", cl_intel_devicelib_complex},
{"__devicelib_ctanhf", cl_intel_devicelib_complex},
{"__devicelib___divdc3", cl_intel_devicelib_complex_fp64},
{"__devicelib___muldc3", cl_intel_devicelib_complex_fp64},
{"__devicelib_cabs", cl_intel_devicelib_complex_fp64},
{"__devicelib_cacos", cl_intel_devicelib_complex_fp64},
{"__devicelib_cacosh", cl_intel_devicelib_complex_fp64},
{"__devicelib_carg", cl_intel_devicelib_complex_fp64},
{"__devicelib_casin", cl_intel_devicelib_complex_fp64},
{"__devicelib_casinh", cl_intel_devicelib_complex_fp64},
{"__devicelib_catan", cl_intel_devicelib_complex_fp64},
{"__devicelib_catanh", cl_intel_devicelib_complex_fp64},
{"__devicelib_ccos", cl_intel_devicelib_complex_fp64},
{"__devicelib_ccosh", cl_intel_devicelib_complex_fp64},
{"__devicelib_cexp", cl_intel_devicelib_complex_fp64},
{"__devicelib_cimag", cl_intel_devicelib_complex_fp64},
{"__devicelib_clog", cl_intel_devicelib_complex_fp64},
{"__devicelib_cpolar", cl_intel_devicelib_complex_fp64},
{"__devicelib_cpow", cl_intel_devicelib_complex_fp64},
{"__devicelib_cproj", cl_intel_devicelib_complex_fp64},
{"__devicelib_creal", cl_intel_devicelib_complex_fp64},
{"__devicelib_csin", cl_intel_devicelib_complex_fp64},
{"__devicelib_csinh", cl_intel_devicelib_complex_fp64},
{"__devicelib_csqrt", cl_intel_devicelib_complex_fp64},
{"__devicelib_ctan", cl_intel_devicelib_complex_fp64},
{"__devicelib_ctanh", cl_intel_devicelib_complex_fp64},
};

static void error(const Twine &Msg) {
errs() << "sycl-post-link: " << Msg << '\n';
exit(1);
Expand Down Expand Up @@ -295,20 +438,74 @@ saveResultModules(std::vector<std::unique_ptr<Module>> &ResModules) {
return Res;
}

static string_vector
saveSpecConstantIDMaps(const std::vector<SpecIDMapTy> &Maps) {
string_vector Res;
// Each fallback device library corresponds to one bit in "require mask" which
// is an unsigned int32. getDeviceLibBit checks which fallback device library
// is required for FuncName and returns the corresponding bit. The corresponding
// mask for each fallback device library is:
// fallback-cassert: 0x1
// fallback-cmath: 0x2
// fallback-cmath-fp64: 0x4
// fallback-complex: 0x8
// fallback-complex-fp64: 0x10
static uint32_t getDeviceLibBits(const std::string &FuncName) {
auto DeviceLibFuncIter = DeviceLibFuncMap.find(FuncName);
return ((DeviceLibFuncIter == DeviceLibFuncMap.end())
? 0
: 0x1 << (DeviceLibFuncIter->second - cl_intel_devicelib_assert));
}

// For each device image module, we go through all functions which meets
// 1. The function name has prefix "__devicelib_"
// 2. The function has SPIR_FUNC calling convention
// 3. The function is declaration which means it doesn't have function body
static uint32_t getModuleReqMask(const Module &M) {
// Device libraries will be enabled only for spir-v module.
if (M.getTargetTriple().substr(0, 7) != "spir64-")
return 0;
// 0x1 means sycl runtime will link and load libsycl-fallback-assert.spv as
// default. In fact, default link assert spv is not necessary but dramatic
// perf regression is observed if we don't link any device library. The perf
// regression is caused by a clang issue.
uint32_t ReqMask = 0x1;
for (const Function &SF : M) {
if (SF.getName().startswith(DEVICELIB_FUNC_PREFIX) &&
(SF.getCallingConv() == CallingConv::SPIR_FUNC) && SF.isDeclaration()) {
uint32_t DeviceLibBits = getDeviceLibBits(SF.getName().str());
ReqMask |= DeviceLibBits;
}
}
return ReqMask;
}

for (size_t I = 0; I < Maps.size(); ++I) {
static string_vector saveDeviceImageProperty(
const std::vector<std::unique_ptr<Module>> &ResultModules,
const ImagePropSaveInfo &ImgPSInfo) {
string_vector Res;
for (size_t I = 0; I < ResultModules.size(); ++I) {
std::string SCFile = makeResultFileName(".prop", I);
llvm::util::PropertySetRegistry PropSet;
PropSet.add(llvm::util::PropertySetRegistry::SYCL_SPECIALIZATION_CONSTANTS,
Maps[I]);
if (ImgPSInfo.NeedDeviceLibReqMask) {
uint32_t MRMask = getModuleReqMask(*ResultModules[I]);
std::map<StringRef, uint32_t> RMEntry = {{"DeviceLibReqMask", MRMask}};
PropSet.add(llvm::util::PropertySetRegistry::SYCL_DEVICELIB_REQ_MASK,
RMEntry);
}
if (ImgPSInfo.DoSpecConst && ImgPSInfo.SetSpecConstAtRT) {
// extract spec constant maps per each module
SpecIDMapTy TmpSpecIDMap;
if (ImgPSInfo.SpecConstsMet)
SpecConstantsPass::collectSpecConstantMetadata(*ResultModules[I].get(),
TmpSpecIDMap);
PropSet.add(
llvm::util::PropertySetRegistry::SYCL_SPECIALIZATION_CONSTANTS,
TmpSpecIDMap);
}
std::error_code EC;
raw_fd_ostream SCOut(SCFile, EC);
PropSet.write(SCOut);
Res.emplace_back(std::move(SCFile));
}

return Res;
}

Expand Down Expand Up @@ -412,7 +609,6 @@ int main(int argc, char **argv) {
}

std::vector<std::unique_ptr<Module>> ResultModules;
std::vector<SpecIDMapTy> ResultSpecIDMaps;
string_vector ResultSymbolsLists;

util::SimpleTable Table;
Expand Down Expand Up @@ -456,18 +652,15 @@ int main(int argc, char **argv) {
Error Err = Table.addColumn(COL_CODE, Files);
CHECK_AND_EXIT(Err);
}
if (DoSpecConst && SetSpecConstAtRT) {
// extract spec constant maps per each module
for (auto &MUptr : ResultModules) {
ResultSpecIDMaps.emplace_back(SpecIDMapTy());
if (SpecConstsMet)
SpecConstantsPass::collectSpecConstantMetadata(*MUptr.get(),
ResultSpecIDMaps.back());
}
string_vector Files = saveSpecConstantIDMaps(ResultSpecIDMaps);

{
ImagePropSaveInfo ImgPSInfo = {true, DoSpecConst, SetSpecConstAtRT,
SpecConstsMet};
string_vector Files = saveDeviceImageProperty(ResultModules, ImgPSInfo);
Error Err = Table.addColumn(COL_PROPS, Files);
CHECK_AND_EXIT(Err);
}

if (DoSymGen) {
// extract symbols per each module
collectSymbolsLists(GlobalsSet, ResultSymbolsLists);
Expand Down
Loading