Skip to content

[SYCL] Adding caching when using interop constructor #3327

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 15, 2021

Conversation

maximdimakov
Copy link
Contributor

When interop constructor is used it creates new impl every time. This patch allows to use already created impls in the constructor.

@maximdimakov maximdimakov requested a review from a team as a code owner March 9, 2021 09:24
Comment on lines 140 to 143
static std::unordered_map<cl_platform_id,
std::weak_ptr<detail::platform_impl>>
platform_impls;
static std::mutex platform_mutex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to work properly. Both unordered_map and mutex are non-trivial classes. See this document for more info.

Comment on lines 116 to 121
auto cl_platform =
reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
std::lock_guard<std::mutex> lock(platform_mutex);
platform_impls[cl_platform] = impl;
return cl_platform;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
auto cl_platform =
reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
std::lock_guard<std::mutex> lock(platform_mutex);
platform_impls[cl_platform] = impl;
return cl_platform;
auto ClPlatform =
reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
std::lock_guard<std::mutex> Lock(platform_mutex);
platform_impls[ClPlatform] = impl;
return ClPlatform;

Comment on lines 198 to 200
static std::unordered_map<cl_device_id, std::weak_ptr<detail::device_impl>>
device_impls;
static std::mutex device_mutex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment for the same piece in platform class.

Comment on lines 40 to 52
std::lock_guard<std::mutex> lock(device_mutex);
auto it = device_impls.find(deviceId);
if (it != device_impls.end() && !it->second.expired())
impl = it->second.lock();
else {
impl = std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(deviceId),
RT::getPlugin<backend::opencl>());
if (it == device_impls.end())
device_impls[deviceId] = impl;
else
it->second = impl;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
std::lock_guard<std::mutex> lock(device_mutex);
auto it = device_impls.find(deviceId);
if (it != device_impls.end() && !it->second.expired())
impl = it->second.lock();
else {
impl = std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(deviceId),
RT::getPlugin<backend::opencl>());
if (it == device_impls.end())
device_impls[deviceId] = impl;
else
it->second = impl;
}
std::lock_guard<std::mutex> Lock(device_mutex);
auto It = DeviceImpls.find(DeviceId);
if (It != DeviceImpls.end() && !It->second.expired())
impl = It->second.lock();
else {
impl = std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(DeviceId),
RT::getPlugin<backend::opencl>());
if (it == DeviceImpls.end())
DeviceImpls[DeviceId] = impl;
else
It->second = impl;
}

@alexbatashev alexbatashev requested a review from romanovvlad March 9, 2021 09:53
@@ -192,6 +195,10 @@ class __SYCL_EXPORT device {
shared_ptr_class<detail::device_impl> impl;
device(shared_ptr_class<detail::device_impl> impl) : impl(impl) {}

static std::unordered_map<cl_device_id, std::weak_ptr<detail::device_impl>>
device_impls;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to reuse already existing cache, see platform_impl.hpp:199 (MDeviceCache) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is possible, testing it. I didn't see that when trying to implement caching with map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed another solution with reusing an existing code

alexbatashev
alexbatashev previously approved these changes Mar 12, 2021
Copy link
Contributor

@alexbatashev alexbatashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

@@ -16,7 +16,6 @@
#include <CL/sycl/platform.hpp>
#include <CL/sycl/stl.hpp>

#include <memory>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory was actually used for shared_ptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was deleted when I tested map. But memory is included in stl.hpp, that included higher. Do I need to return it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. You can't rely on CL/sycl/stl.hpp to include memory header for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returned it

Signed-off-by: mdimakov <[email protected]>
@bader bader merged commit d3aeb4a into intel:sycl Mar 15, 2021
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Mar 19, 2021
* upstream/sycl: (1804 commits)
  [SYCL] SYCL 2020 backend interoperability part 1 (intel#3354)
  [Driver][SYCL] Address issue when unbundling for non-FPGA archive (intel#3366)
  [SYCL][Doc] Update compiler options descriptions (intel#3340)
  [SYCL] Update ABI dump tool to disable checks with libcxx by default (intel#3370)
  [SYCL] Update the way we handle -sycl-std= based on community review feedback (intel#3371)
  [SYCL] Move tests to llvm-test-suite (intel#3359)
  [SYCL][PI][L0] Revert copy batching from intel#3232 (intel#3363)
  [SYCL] Remove unsupported option.
  [SYCL] Fix pragma setting in sycl-post-link (intel#3358)
  [SYCL] Add ITT annotation instructions (intel#3299)
  [SYCL] Propagate attributes of original kernel to wrapper kernel generated for range-rounding (intel#3306)
  [BuildBot] Uplift GPU RT version for Linux to 21.09.19150 (intel#3316)
  [SYCL] Retain PI events until they have signaled (intel#3350)
  [SYCL] Add caching when using interop constructor (intel#3327)
  Add ITT stubs and wrappers for SPIR-V devices (intel#3279)
  [SYCL] Add zero argument version of buffer::reinterpret() for SYCL 2020 (intel#3333)
  [SYCL] Restore old behavior of get() method (intel#3356)
  [Driver][SYCL][FPGA] Improve FPGA AOT when using Triple (intel#3330)
  [SYCL] Revert support for pinned_host_memory extension in Level-Zero backend. Make it a NOP (intel#3349)
  [SYCL] Remove redundant build options processing (intel#3342)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants