[SYCL][Coverity] Fix MISSING_LOCK coverity hits in the program manager #18121

Merged
merged 6 commits into from
Apr 24, 2025

Conversation

ianayl
Contributor

@ianayl ianayl commented Apr 21, 2025

This PR guards accesses to m_DeviceImage and m_VFSet2BinImage caches/maps in the Program Manager using m_KernelIDsMutex in situations where multiple threads may be accessing the maps at the same time.

I am using m_KernelIDsMutex here because both maps are already commonly accessed while holding that mutex. In fact, many maps/caches in the program manager use m_KernelIDsMutex. This is not the optimal solution, but improving it is out of scope for this PR; see #18165.

This PR addresses Coverity hits that need fixes submitted before the code cutoff.
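To illustrate the pattern described above, here is a minimal sketch of two maps guarded by one shared mutex. It is not the actual program manager code: only the names m_KernelIDsMutex, m_DeviceImage and m_VFSet2BinImage come from the description, while the class `ProgramManagerSketch`, the key/value types, and the helper `registerFromThreads` are hypothetical and chosen only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the program manager. The map types are assumed;
// only the member names come from the PR description.
class ProgramManagerSketch {
public:
  // Every access to either map takes the same mutex, so concurrent callers
  // never observe one map updated without the other.
  void addImage(int ImageId, const std::string &VFSetName) {
    std::lock_guard<std::mutex> Guard(m_KernelIDsMutex);
    m_DeviceImage[ImageId] = VFSetName;
    m_VFSet2BinImage[VFSetName].insert(ImageId);
  }

  std::size_t countImagesFor(const std::string &VFSetName) {
    std::lock_guard<std::mutex> Guard(m_KernelIDsMutex);
    auto It = m_VFSet2BinImage.find(VFSetName);
    return It == m_VFSet2BinImage.end() ? 0 : It->second.size();
  }

  std::size_t countImages() {
    std::lock_guard<std::mutex> Guard(m_KernelIDsMutex);
    return m_DeviceImage.size();
  }

private:
  std::mutex m_KernelIDsMutex;
  std::map<int, std::string> m_DeviceImage;
  std::map<std::string, std::set<int>> m_VFSet2BinImage;
};

// Hypothetical driver: several threads register images at the same time.
inline std::size_t registerFromThreads(ProgramManagerSketch &PM,
                                       int NumThreads, int PerThread) {
  assert(NumThreads >= 0 && PerThread >= 0);
  std::vector<std::thread> Workers;
  for (int T = 0; T < NumThreads; ++T)
    Workers.emplace_back([&PM, T, PerThread] {
      for (int I = 0; I < PerThread; ++I)
        PM.addImage(T * PerThread + I, "vfset");
    });
  for (auto &W : Workers)
    W.join();
  return PM.countImages();
}
```

Reusing one mutex for several containers keeps the locking simple at the cost of extra contention, which is the trade-off the tracker in #18165 is meant to revisit.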

@ianayl ianayl marked this pull request as ready for review April 21, 2025 20:34
@ianayl ianayl requested a review from a team as a code owner April 21, 2025 20:34
@ianayl ianayl requested a review from aelovikov-intel April 21, 2025 20:34
@ianayl
Contributor Author

ianayl commented Apr 21, 2025

@aelovikov-intel Sergey's on vacation and won't be able to review in time. Is there someone who can review it sooner?

@aelovikov-intel
Contributor

> @aelovikov-intel Sergey's on vacation and won't be able to review in time. Is there someone who can review it sooner?

Maybe @KseniyaTikhomirova or @steffenlarsen ... I definitely don't think that going with a hack is the right thing to do here. Also, I don't know if there might be some other locks up the call stack that might make this a false positive.

@ianayl
Contributor Author

ianayl commented Apr 21, 2025

Would you prefer an additional mutex? I thought about adding another mutex, but decided against it since every other access to these maps has used m_KernelIDsMutex anyway. Additionally, a lot of maps and caches in the program manager use m_KernelIDsMutex even if they are not specifically caching/mapping kernel IDs, so I figured it wasn't that much of a hack.

I checked the call stack up to getOrCreateKernel and didn't see another lock, although it is not impossible that I may be wrong.

@aelovikov-intel
Contributor

#16836 seems to be related.

Contributor

@aelovikov-intel aelovikov-intel left a comment

IMO, you need to reword your description then. Be more assertive that this is the right change, one that follows what the pre-existing code does, and drop any hesitation from it.

@ianayl ianayl requested a review from steffenlarsen April 22, 2025 14:19
Contributor

@steffenlarsen steffenlarsen left a comment

I think this is fine, but please open a tracker to improve the locking. Also, I agree with @aelovikov-intel that the locking in removeImages should be done outside the top-level loop, with a check that there are binaries before locking (or an early exit).
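The suggestion above can be sketched roughly as follows. This is not the real removeImages: the class `ImageRegistrySketch`, the use of integer IDs, and the `std::set` container are hypothetical simplifications, and only the names removeImages, m_DeviceImage and m_KernelIDsMutex come from this conversation. The point is the shape: exit early when there is nothing to remove, then take the lock once around the whole loop rather than once per iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical, simplified registry; the container and ID type are assumed.
class ImageRegistrySketch {
public:
  void addImage(int Id) {
    std::lock_guard<std::mutex> Guard(m_KernelIDsMutex);
    m_DeviceImage.insert(Id);
  }

  // Returns how many of the requested images were actually removed.
  std::size_t removeImages(const std::vector<int> &Ids) {
    // Early exit: with no binaries to remove, the mutex is never taken.
    if (Ids.empty())
      return 0;

    // One lock acquisition covers the whole top-level loop.
    std::lock_guard<std::mutex> Guard(m_KernelIDsMutex);
    std::size_t Removed = 0;
    for (int Id : Ids)
      Removed += m_DeviceImage.erase(Id); // erase returns 0 or 1 for a set
    assert(Removed <= Ids.size());
    return Removed;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> Guard(m_KernelIDsMutex);
    return m_DeviceImage.size();
  }

private:
  std::mutex m_KernelIDsMutex;
  std::set<int> m_DeviceImage;
};
```

Holding the lock across the loop also means other threads see the removal as a single step instead of a series of partial states.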

@ianayl
Contributor Author

ianayl commented Apr 23, 2025

Tracker created in #18165

Contributor

@steffenlarsen steffenlarsen left a comment

LGTM!

@ianayl
Contributor Author

ianayl commented Apr 24, 2025

@intel/llvm-gatekeepers PR ready for merge, thanks!

@martygrant martygrant merged commit af0e95b into intel:sycl Apr 24, 2025
37 of 38 checks passed
4 participants