Skip to content

[SYCL] Enable querying kernel's number of registers #4665

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Oct 14, 2021

Conversation

joeatodd
Copy link
Contributor

This patch defines a new member of kernel_device_specific info ext_num_regs which returns the number of registers used by the given compiled kernel. This is implemented for the CUDA plugin, but not for HIP or level zero.

@joeatodd joeatodd requested review from smaslov-intel and a team as code owners September 29, 2021 15:22
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! Some small comments, but otherwise it looks good.

These changes don't add anything to the OpenCL backend, which is technically fine as clGetKernelWorkGroupInfo should fail with CL_INVALID_VALUE when not recognizing a descriptor. However, other query functions with an extended set of descriptors (like piDeviceGetInfo) define cases for unsupported descriptors. In principle the effect is the same, but it makes it easier to keep track of which queries are unimplemented/unsupported in the backend. Would you mind giving piKernelGetGroupInfo a similar treatment?

@@ -250,6 +250,7 @@ enum class kernel_device_specific : cl_kernel_work_group_info {
preferred_work_group_size_multiple =
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE,
private_mem_size = CL_KERNEL_PRIVATE_MEM_SIZE,
ext_num_regs = PI_KERNEL_GROUP_INFO_NUM_REGS,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other extension info descriptors have ext_intel_ or ext_oneapi_ as their prefix. To keep this consistent, could this be renamed to something like ext_oneapi_num_regs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is motivated by oneAPI project then I'd also expect to see the Level-Zero support added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed ext_num_regs to ext_codeplay_num_regs. For Level-Zero, I don't think that it exposes this information, so I'm not sure how I could implement it in the level zero PI.

@@ -346,7 +346,8 @@ typedef enum {
PI_KERNEL_GROUP_INFO_LOCAL_MEM_SIZE = CL_KERNEL_LOCAL_MEM_SIZE,
PI_KERNEL_GROUP_INFO_PREFERRED_WORK_GROUP_SIZE_MULTIPLE =
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE,
PI_KERNEL_GROUP_INFO_PRIVATE_MEM_SIZE = CL_KERNEL_PRIVATE_MEM_SIZE
PI_KERNEL_GROUP_INFO_PRIVATE_MEM_SIZE = CL_KERNEL_PRIVATE_MEM_SIZE,
PI_KERNEL_GROUP_INFO_NUM_REGS = 0x10112
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this one doesn't have analog in OpenCL it needs to be documented here: what exactly is being queried/returned.
Also let's follow

PI_ERROR_UNKNOWN = -999
and add new values from the "top" of the values set to avoid accidental overlap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment to explain what is being returned. Could you explain a little more what you mean about adding new values from the top? I thought I had done this - the largest value I found was

PI_DEVICE_INFO_ATOMIC_MEMORY_ORDER_CAPABILITIES = 0x10111
so I added one after that. Should I have left more of a gap?

@joeatodd
Copy link
Contributor Author

@steffenlarsen Thanks for your feedback! I've added piKernelGetGroupInfo in pi_opencl.cpp, perhaps you could take a look? I think PI_KERNEL_GROUP_INFO_NUM_REGS is the only value not currently implemented.

@steffenlarsen
Copy link
Contributor

@steffenlarsen Thanks for your feedback! I've added piKernelGetGroupInfo in pi_opencl.cpp, perhaps you could take a look? I think PI_KERNEL_GROUP_INFO_NUM_REGS is the only value not currently implemented.

Looks good, thanks!

steffenlarsen
steffenlarsen previously approved these changes Oct 1, 2021
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@joeatodd
Copy link
Contributor Author

joeatodd commented Oct 7, 2021

I've added the new symbols to the ABI. I did this automatically on Linux but I don't have access to a windows environment to add the Windows symbol, so I added it manually. I put it in lexicographical order - I hope that's right.

@againull againull merged commit 97d33b7 into intel:sycl Oct 14, 2021
@joeatodd joeatodd deleted the joe/sycl_num_regs branch December 7, 2021 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants