[SYCL][USM] Refactor indirect access calls to minimize invocations. #2185

Merged: 1 commit, Jul 29, 2020
5 changes: 5 additions & 0 deletions sycl/source/detail/program_impl.cpp
@@ -416,6 +416,11 @@ RT::PiKernel program_impl::get_pi_kernel(const string_class &KernelName) const {
Err);
}
Plugin.checkPiResult(Err);

// Some PI Plugins (like OpenCL) require this call to enable USM

Review comment (Contributor): NIT: if a kernel uses only accessors, will this call lead to unnecessary overhead during its invocation? If not known, maybe a TODO to check this could be added.

// For others, PI will turn this into a NOP.
Plugin.call<PiApiKind::piKernelSetExecInfo>(Kernel, PI_USM_INDIRECT_ACCESS,
sizeof(pi_bool), &PI_TRUE);
}

return Kernel;
5 changes: 5 additions & 0 deletions sycl/source/detail/program_manager/program_manager.cpp
@@ -442,6 +442,11 @@ ProgramManager::getOrCreateKernel(OSModuleHandle M, const context &Context,
Plugin.call<PiApiKind::piKernelCreate>(Program, KernelName.c_str(),
&Result);

// Some PI Plugins (like OpenCL) require this call to enable USM
// For others, PI will turn this into a NOP.
Plugin.call<PiApiKind::piKernelSetExecInfo>(Result, PI_USM_INDIRECT_ACCESS,
sizeof(pi_bool), &PI_TRUE);

return Result;
};

5 changes: 0 additions & 5 deletions sycl/source/detail/scheduler/commands.cpp
@@ -1691,11 +1691,6 @@ pi_result ExecCGCommand::SetKernelParamsAndLaunch(
adjustNDRangePerKernel(NDRDesc, Kernel,
*(detail::getSyclObjImpl(MQueue->get_device())));

// Some PI Plugins (like OpenCL) require this call to enable USM
// For others, PI will turn this into a NOP.
Plugin.call<PiApiKind::piKernelSetExecInfo>(Kernel, PI_USM_INDIRECT_ACCESS,
sizeof(pi_bool), &PI_TRUE);

// Remember this information before the range dimensions are reversed
const bool HasLocalSize = (NDRDesc.LocalSize[0] != 0);

9 changes: 9 additions & 0 deletions sycl/unittests/program/KernelRelease.cpp
@@ -68,6 +68,13 @@ pi_result redefinedKernelGetInfo(pi_kernel kernel, pi_kernel_info param_name,
return PI_SUCCESS;
}

pi_result redefinedKernelSetExecInfo(pi_kernel kernel,
pi_kernel_exec_info param_name,
size_t param_value_size,
const void *param_value) {
return PI_SUCCESS;
}

TEST(KernelReleaseTest, GetKernelRelease) {
platform Plt{default_selector()};
if (Plt.is_host()) {
@@ -85,6 +92,8 @@ TEST(KernelReleaseTest, GetKernelRelease) {
Mock.redefine<detail::PiApiKind::piKernelRetain>(redefinedKernelRetain);
Mock.redefine<detail::PiApiKind::piKernelRelease>(redefinedKernelRelease);
Mock.redefine<detail::PiApiKind::piKernelGetInfo>(redefinedKernelGetInfo);
Mock.redefine<detail::PiApiKind::piKernelSetExecInfo>(
redefinedKernelSetExecInfo);

context Ctx{Plt};
TestContext.reset(new TestCtx(Ctx));
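
The effect of the change can be illustrated with a minimal, self-contained sketch. It is not the DPC++ runtime code: `MockPlugin`, `MockKernel`, and all function names below are hypothetical stand-ins, and the sketch assumes the kernel object is created once and reused across launches. Under that assumption, setting the indirect-access flag at kernel creation costs one call, while setting it in the launch path costs one call per launch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a PI plugin. It only counts how often the
// "set exec info" entry point is invoked.
struct MockPlugin {
  std::size_t SetExecInfoCalls = 0;
  void kernelSetExecInfo() { ++SetExecInfoCalls; }
};

// Hypothetical kernel handle; carries no state in this sketch.
struct MockKernel {};

// Placement before the change: the flag is set on every launch.
void launchWithPerLaunchFlag(MockPlugin &Plugin, MockKernel &) {
  Plugin.kernelSetExecInfo();
  // ... argument setup and enqueue would follow here ...
}

// Placement after the change: the flag is set once, at creation.
MockKernel createKernelWithFlag(MockPlugin &Plugin) {
  MockKernel Kernel;
  Plugin.kernelSetExecInfo();
  return Kernel;
}

// Launch path after the change: no exec-info call.
void launchWithoutFlag(MockPlugin &, MockKernel &) {
  // ... argument setup and enqueue would follow here ...
}

// Returns how many exec-info calls are made for one kernel that is
// launched `Launches` times under the chosen placement.
std::size_t countExecInfoCalls(bool SetAtCreation, int Launches) {
  assert(Launches >= 0);
  MockPlugin Plugin;
  MockKernel Kernel =
      SetAtCreation ? createKernelWithFlag(Plugin) : MockKernel{};
  for (int I = 0; I < Launches; ++I) {
    if (SetAtCreation)
      launchWithoutFlag(Plugin, Kernel);
    else
      launchWithPerLaunchFlag(Plugin, Kernel);
  }
  return Plugin.SetExecInfoCalls;
}
```

For example, `countExecInfoCalls(false, 100)` models the old placement and `countExecInfoCalls(true, 100)` the new one. The sketch says nothing about the cost of a single call, which is what the review comment about accessor-only kernels asks about.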