Skip to content

[SYCL] Adding caching when using interop constructor #3327

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 15, 2021
11 changes: 9 additions & 2 deletions sycl/include/CL/sycl/device.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
#include <CL/sycl/platform.hpp>
#include <CL/sycl/stl.hpp>

#include <memory>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory was actually used for shared_ptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was deleted when I tested map. But memory is included in stl.hpp, that included higher. Do I need to return it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. You can't rely on CL/sycl/stl.hpp to include memory header for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returned it

#include <unordered_map>
#include <utility>

__SYCL_INLINE_NAMESPACE(cl) {
Expand Down Expand Up @@ -177,7 +177,10 @@ class __SYCL_EXPORT device {
/// \return a native handle, the type of which defined by the backend.
template <backend BackendName>
auto get_native() const -> typename interop<BackendName, device>::type {
return (typename interop<BackendName, device>::type)getNative();
auto cl_device = (typename interop<BackendName, device>::type)getNative();
std::lock_guard<std::mutex> lock(device_mutex);
device_impls[cl_device] = impl;
return cl_device;
}

/// Indicates if the SYCL device has the given feature.
Expand All @@ -192,6 +195,10 @@ class __SYCL_EXPORT device {
shared_ptr_class<detail::device_impl> impl;
device(shared_ptr_class<detail::device_impl> impl) : impl(impl) {}

static std::unordered_map<cl_device_id, std::weak_ptr<detail::device_impl>>
device_impls;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to reuse already existing cache, see platform_impl.hpp:199 (MDeviceCache) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is possible, testing it. I didn't see that when trying to implement caching with map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed another solution with reusing an existing code

static std::mutex device_mutex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment for the same piece in platform class.


pi_native_handle getNative() const;

template <class Obj>
Expand Down
14 changes: 12 additions & 2 deletions sycl/include/CL/sycl/platform.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#include <CL/sycl/stl.hpp>

// 4.6.2 Platform class
#include <unordered_map>
#include <utility>
__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
Expand Down Expand Up @@ -112,8 +113,12 @@ class __SYCL_EXPORT platform {
/// \return a native handle, the type of which defined by the backend.
template <backend BackendName>
auto get_native() const -> typename interop<BackendName, platform>::type {
return reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
auto cl_platform =
reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
std::lock_guard<std::mutex> lock(platform_mutex);
platform_impls[cl_platform] = impl;
return cl_platform;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
auto cl_platform =
reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
std::lock_guard<std::mutex> lock(platform_mutex);
platform_impls[cl_platform] = impl;
return cl_platform;
auto ClPlatform =
reinterpret_cast<typename interop<BackendName, platform>::type>(
getNative());
std::lock_guard<std::mutex> Lock(platform_mutex);
platform_impls[ClPlatform] = impl;
return ClPlatform;

}

/// Indicates if all of the SYCL devices on this platform have the
Expand All @@ -132,6 +137,11 @@ class __SYCL_EXPORT platform {
shared_ptr_class<detail::platform_impl> impl;
platform(shared_ptr_class<detail::platform_impl> impl) : impl(impl) {}

static std::unordered_map<cl_platform_id,
std::weak_ptr<detail::platform_impl>>
platform_impls;
static std::mutex platform_mutex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to work properly. Both unordered_map and mutex are non-trivial classes. See this document for more info.


template <class T>
friend T detail::createSyclObjFromImpl(decltype(T::impl) ImplObj);
template <class Obj>
Expand Down
24 changes: 20 additions & 4 deletions sycl/source/device.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,12 +29,28 @@ void force_type(info::device_type &t, const info::device_type &ft) {

device::device() : impl(detail::device_impl::getHostDeviceImpl()) {}

device::device(cl_device_id deviceId)
: impl(std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(deviceId),
RT::getPlugin<backend::opencl>())) {
std::unordered_map<cl_device_id, std::weak_ptr<detail::device_impl>>
device::device_impls;
std::mutex device::device_mutex;

device::device(cl_device_id deviceId) {
// The implementation constructor takes ownership of the native handle so we
// must retain it in order to adhere to SYCL 1.2.1 spec (Rev6, section 4.3.1.)
{
std::lock_guard<std::mutex> lock(device_mutex);
auto it = device_impls.find(deviceId);
if (it != device_impls.end() && !it->second.expired())
impl = it->second.lock();
else {
impl = std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(deviceId),
RT::getPlugin<backend::opencl>());
if (it == device_impls.end())
device_impls[deviceId] = impl;
else
it->second = impl;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
std::lock_guard<std::mutex> lock(device_mutex);
auto it = device_impls.find(deviceId);
if (it != device_impls.end() && !it->second.expired())
impl = it->second.lock();
else {
impl = std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(deviceId),
RT::getPlugin<backend::opencl>());
if (it == device_impls.end())
device_impls[deviceId] = impl;
else
it->second = impl;
}
std::lock_guard<std::mutex> Lock(device_mutex);
auto It = DeviceImpls.find(DeviceId);
if (It != DeviceImpls.end() && !It->second.expired())
impl = It->second.lock();
else {
impl = std::make_shared<detail::device_impl>(
detail::pi::cast<pi_native_handle>(DeviceId),
RT::getPlugin<backend::opencl>());
if (it == DeviceImpls.end())
DeviceImpls[DeviceId] = impl;
else
It->second = impl;
}

}
clRetainDevice(deviceId);
}

Expand Down
23 changes: 19 additions & 4 deletions sycl/source/platform.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,25 @@ namespace sycl {

platform::platform() : impl(detail::platform_impl::getHostPlatformImpl()) {}

platform::platform(cl_platform_id PlatformId)
: impl(std::make_shared<detail::platform_impl>(
detail::pi::cast<detail::RT::PiPlatform>(PlatformId),
RT::getPlugin<backend::opencl>())) {}
std::unordered_map<cl_platform_id, std::weak_ptr<detail::platform_impl>>
platform::platform_impls;
std::mutex platform::platform_mutex;

platform::platform(cl_platform_id PlatformId) {
std::lock_guard<std::mutex> lock(platform_mutex);
auto it = platform_impls.find(PlatformId);
if (it != platform_impls.end() && !it->second.expired())
impl = it->second.lock();
else {
impl = std::make_shared<detail::platform_impl>(
detail::pi::cast<detail::RT::PiPlatform>(PlatformId),
RT::getPlugin<backend::opencl>());
if (it == platform_impls.end())
platform_impls[PlatformId] = impl;
else
it->second = impl;
}
}

platform::platform(const device_selector &dev_selector) {
*this = dev_selector.select_device().get_platform();
Expand Down