Propose adding a new API "urKernelSuggestGroupSize" #1270

AllanZyne · 2024-01-23T05:47:49Z

For the sanitizer layer, we need to calculate the total number of workgroups in urEnqueueKernelLaunch. However, users sometimes omit the "pLocalWorkSize" parameter, which leaves us unable to calculate the number of workgroups.
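To make the dependency on the local work size concrete, here is a minimal sketch of the workgroup count calculation. The helper name `count_work_groups` is hypothetical and not part of Unified Runtime, and rounding up per dimension is an assumption that also covers global sizes that are not a multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total number of work-groups for an ND-range of
 * 1 to 3 dimensions. Each dimension is rounded up, so a partial group
 * at the edge of the range is still counted. */
size_t count_work_groups(unsigned work_dim, const size_t *global_size,
                         const size_t *local_size)
{
    assert(work_dim >= 1 && work_dim <= 3);
    size_t total = 1;
    for (unsigned i = 0; i < work_dim; ++i) {
        /* Without a local size there is nothing to divide by. */
        assert(local_size[i] != 0);
        size_t groups = global_size[i] / local_size[i];
        if (global_size[i] % local_size[i] != 0)
            ++groups; /* round up for the partial group */
        total *= groups;
    }
    return total;
}
```

When the local size is not supplied, the divisor is unknown, so the count cannot be computed without asking the runtime which size it would pick.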

Therefore, I propose adding a new API, "urKernelSuggestGroupSize", that returns the same local work size that urEnqueueKernelLaunch would choose.

LevelZero has "zeKernelSuggestGroupSize", and OpenCL has "clGetKernelSuggestedLocalWorkSizeKHR", although OpenCL support is currently a work in progress.

LevelZero

ze_result_t ZE_APICALL
zeKernelSuggestGroupSize(
    ze_kernel_handle_t hKernel,                     ///< [in] handle of the kernel object
    uint32_t globalSizeX,                           ///< [in] global width for X dimension
    uint32_t globalSizeY,                           ///< [in] global width for Y dimension
    uint32_t globalSizeZ,                           ///< [in] global width for Z dimension
    uint32_t* groupSizeX,                           ///< [out] recommended size of group for X dimension
    uint32_t* groupSizeY,                           ///< [out] recommended size of group for Y dimension
    uint32_t* groupSizeZ                            ///< [out] recommended size of group for Z dimension
    );

OpenCL

cl_int clGetKernelSuggestedLocalWorkSizeKHR(
    cl_command_queue command_queue,
    cl_kernel kernel,
    cl_uint work_dim,
    const size_t* global_work_offset,
    const size_t* global_work_size,
    size_t* suggested_local_work_size);

Proposed prototype:

ur_result_t UR_APICALL urKernelSuggestGroupSize(
    ur_queue_handle_t hQueue, ur_kernel_handle_t hKernel, uint32_t workDim,
    const size_t *pGlobalWorkOffset, const size_t *pGlobalWorkSize,
    size_t *pSuggestedLocalWorkSize);
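The proposed prototype follows the OpenCL shape (a `workDim` plus `size_t` arrays), while zeKernelSuggestGroupSize takes three separate `uint32_t` values. A Level Zero implementation would therefore need to narrow and pad the global size before calling it. The sketch below shows one way that conversion might look. The name `to_ze_dims`, the return codes, and the choice to pad unused dimensions with 1 are all assumptions made for illustration, not code from any adapter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a work_dim-sized size_t global size into
 * the three uint32_t values that zeKernelSuggestGroupSize expects.
 * Unused dimensions are padded with 1 (an assumption of this sketch).
 * Returns 0 on success, -1 if work_dim is out of range or a size does
 * not fit in 32 bits. */
int to_ze_dims(uint32_t work_dim, const size_t *global_size, uint32_t out[3])
{
    if (work_dim < 1 || work_dim > 3)
        return -1;
    for (uint32_t i = 0; i < 3; ++i) {
        if (i >= work_dim) {
            out[i] = 1;
            continue;
        }
        if (global_size[i] > UINT32_MAX)
            return -1; /* would be truncated by the narrowing cast */
        out[i] = (uint32_t)global_size[i];
    }
    return 0;
}
```

The three suggested group sizes returned by Level Zero would then be widened back into the `pSuggestedLocalWorkSize` array for the first `workDim` entries.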

kbenzie · 2024-01-23T12:18:54Z

We also need to consider if CUDA and HIP could support this API.

npmiller · 2024-01-24T18:05:15Z

This seems fairly straightforward. I don't envision any issues implementing this for CUDA and HIP: we already need to be able to pick a local work size when enqueuing kernels, and this would use the same mechanism.
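As a rough illustration of what "picking a local work size" can involve, the sketch below chooses, for one dimension, the largest group size that both divides the global size and stays within a device limit. This is a simplified heuristic written for this thread, not the CUDA or HIP adapters' actual logic; the function name and the `max_group_size` parameter are hypothetical.

```c
#include <stddef.h>

/* Hypothetical heuristic: largest value <= max_group_size that evenly
 * divides global_size. Returns 0 if either argument is 0. */
size_t largest_divisor_at_most(size_t global_size, size_t max_group_size)
{
    if (global_size == 0 || max_group_size == 0)
        return 0;
    size_t candidate =
        global_size < max_group_size ? global_size : max_group_size;
    /* Terminates because 1 divides every global size. */
    while (global_size % candidate != 0)
        --candidate;
    return candidate;
}
```

Exposing the result of whatever mechanism an adapter already uses is what allows the suggested size to match the one applied at enqueue time.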

yingcong-wu · 2024-03-06T06:00:56Z

Hi all, I drafted a PR implementing the API in #1385. Would you care to review it? Thank you.

kbenzie · 2024-03-06T14:09:39Z

Might be best to take it out of draft, then the reviewer groups will be notified @yingcong-wu.

yingcong-wu · 2024-03-07T01:03:53Z

Sure.

AllanZyne · 2024-07-05T06:13:55Z

Closing this issue since #1385 has been merged.

AllanZyne changed the title ~~Propose adding a new API "urGetKernelSuggestedLocalWorkSize"~~ Propose adding a new API "urKernelSuggestGroupSize" Jan 23, 2024

kbenzie added the needs-discussion label Jan 23, 2024

kbenzie added the specification label Jan 25, 2024

AllanZyne mentioned this issue Jan 25, 2024

[DeviceSanitizer] Support detecting out-of-bounds error on CPU Device & Static Local Memory #1210

Merged

yingcong-wu mentioned this issue Mar 4, 2024

Implement urKernelGetSuggestedLocalWorkSize #1385

Merged

kbenzie removed the needs-discussion label Apr 9, 2024

AllanZyne closed this as completed Jul 5, 2024
