[DeviceSanitizer] Support multiple error reports (-fsanitize-recover=address) #13948

AllanZyne · 2024-05-29T01:55:51Z

In the kernel, we save at most ASAN_MAX_NUM_REPORTS (default: 10) SanitizerReport entries.
The index of the SanitizerReport to use is WG_LINEAR_ID % ASAN_MAX_NUM_REPORTS.

When -fsanitize-recover=address is passed to the compiler, asan_loadX_noabort/asan_storeX_noabort is used; the flag is_recover = true distinguishes this case.
If is_recover is true, the UR prints all the error reports and continues (use at your own risk).
If is_recover is false (the default), the UR prints only one error report and exits.
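
A minimal host-side sketch of the slot selection and the two reporting modes described above. It is not the libdevice or UR code: `Report`, `try_save_report`, and `print_reports` are hypothetical names, and the `claimed` flag is an assumed way of keeping the first error written to a slot. Only `ASAN_MAX_NUM_REPORTS` and `is_recover` come from the description.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// From the description: default number of report slots.
constexpr std::size_t ASAN_MAX_NUM_REPORTS = 10;

// Hypothetical stand-in for DeviceSanitizerReport.
struct Report {
  std::atomic<int> claimed{0};    // assumption: 0 = free, 1 = holds a report
  std::uint64_t wg_linear_id = 0; // work-group that reported the error
  bool is_recover = false;        // true when built with -fsanitize-recover=address
};

using ReportArray = std::array<Report, ASAN_MAX_NUM_REPORTS>;

// Tries to save an error for a work-group.
// Returns false if the slot selected by the modulo is already taken.
bool try_save_report(ReportArray &reports, std::uint64_t wg_linear_id,
                     bool is_recover) {
  Report &slot = reports[wg_linear_id % ASAN_MAX_NUM_REPORTS];
  int expected = 0;
  if (!slot.claimed.compare_exchange_strong(expected, 1))
    return false; // another work-group mapped to this slot got there first
  slot.wg_linear_id = wg_linear_id;
  slot.is_recover = is_recover;
  return true;
}

// Host side: prints every saved report in recover mode, otherwise only the
// first one found. Returns the number of reports printed.
std::size_t print_reports(const ReportArray &reports) {
  std::size_t printed = 0;
  for (const Report &r : reports) {
    if (r.claimed.load() == 0)
      continue; // empty slot
    std::printf("error reported by work-group %llu\n",
                static_cast<unsigned long long>(r.wg_linear_id));
    ++printed;
    if (!r.is_recover)
      break; // default mode: one report, after which the runtime would exit
  }
  return printed;
}
```

In this sketch, work-groups 3 and 13 select the same slot, so only whichever of them reports first is kept.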

steffenlarsen

Overall I think it looks reasonable, but I worry that the register pressure could make this unusable in certain applications.

steffenlarsen · 2024-05-29T15:08:46Z

libdevice/include/asan_libdevice.hpp

-  LocalArgsInfo *LocalArgs = nullptr; // ordered by ArgIndex
+  LocalArgsInfo *LocalArgs = nullptr; // Ordered by ArgIndex
+
+  DeviceSanitizerReport SanitizerReport[ASAN_MAX_NUM_REPORTS];


Are there any concerns about register pressure related to this? Would it make sense to let users opt into a more lightweight version with fewer reports?

AllanZyne · 2024-06-11

LaunchInfo is allocated in shared USM, so I don't think it will cause register pressure here.

clang/lib/Driver/SanitizerArgs.cpp

wenju-he · 2024-06-11T07:51:49Z

libdevice/sanitizer_utils.cpp

+      __spirv_BuiltInWorkgroupId.z;
+
+  auto &SanitizerReport = ((__SYCL_GLOBAL__ LaunchInfo *)__AsanLaunchInfo)
+                              ->SanitizerReport[WG_LID % ASAN_MAX_NUM_REPORTS];


Given an error in one workgroup and another error in a different workgroup, these two errors may map to the same index, and either of them may be the one that is kept.
Program behavior is probably more predictable if we use an atomically increasing index instead.

AllanZyne · 2024-06-12

> Program behavior is probably more predictable if we use an atomically increasing index instead.

What does "predictable" mean here?
Even with an atomic counter, we can't guarantee that all error reports are saved in your case, and their order is also undetermined.
I didn't try to save as many error reports as possible because I don't think it's necessary (most of them likely point to the same error location).

Maybe it would be better to save a wider variety of error types, but that's too complicated to implement now.

wenju-he · 2024-06-12

> Even with an atomic counter, we can't guarantee that all error reports are saved in your case, and their order is also undetermined.

Right, the order is nondeterministic. Could you please elaborate on why we can't guarantee that both errors are reported in my case?

AllanZyne · 2024-06-13

> Given an error in one workgroup and another error in a different workgroup, these two errors may map to the same index, and either of them may be the one that is kept.

I don't understand: why does either of them have a chance to be kept?
Each error is saved to its own report index unless that index is already in use.

wenju-he · 2024-06-13

Because the order of workgroup execution is nondeterministic.
When two errors clash on the same index, we can't say which one is saved, since the second error won't be saved once the first one has been.

AllanZyne · 2024-06-19

> Because the order of workgroup execution is nondeterministic.
> When two errors clash on the same index, we can't say which one is saved, since the second error won't be saved once the first one has been.

Yep. I think it's okay to save either one.
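
For comparison, the atomically increasing index suggested earlier in this thread could look roughly like the sketch below. This is a host-side illustration under assumptions, not code from the PR: `ReportBuffer`, `save_with_counter`, `saved_count`, and the choice to drop reports once all slots are used are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Same default as in the PR description.
constexpr std::size_t ASAN_MAX_NUM_REPORTS = 10;

// Hypothetical report storage that hands out slots in arrival order.
struct ReportBuffer {
  std::atomic<std::size_t> next{0};                         // next free slot
  std::array<std::uint64_t, ASAN_MAX_NUM_REPORTS> wg_ids{}; // saved work-group IDs
};

// Saves a report in the next free slot.
// Returns false (report dropped) once all slots have been handed out.
bool save_with_counter(ReportBuffer &buf, std::uint64_t wg_linear_id) {
  const std::size_t idx = buf.next.fetch_add(1);
  if (idx >= ASAN_MAX_NUM_REPORTS)
    return false;
  buf.wg_ids[idx] = wg_linear_id;
  return true;
}

// Number of reports actually stored (the counter keeps growing past the cap).
std::size_t saved_count(const ReportBuffer &buf) {
  return std::min(buf.next.load(), ASAN_MAX_NUM_REPORTS);
}
```

With this scheme, two work-groups whose IDs collide modulo ASAN_MAX_NUM_REPORTS (for example 3 and 13) both keep their reports as long as fewer than ASAN_MAX_NUM_REPORTS errors have been saved. The order of the saved reports still depends on work-group scheduling.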

wenju-he

LGTM

AllanZyne · 2024-06-21T08:48:37Z

@intel/unified-runtime-reviewers @intel/dpcpp-sanitizers-review Please review.
Thanks very much!

wenju-he

LGTM

sycl/plugins/unified_runtime/CMakeLists.txt

callumfare · 2024-06-27T09:43:44Z

@AllanZyne oneapi-src/unified-runtime#1677 has been merged, please merge the latest sycl branch in and update the UR tag as suggested

Co-authored-by: Callum Fare <[email protected]>

AllanZyne · 2024-06-27T09:47:58Z

@AllanZyne oneapi-src/unified-runtime#1677 has been merged, please merge the latest sycl branch in and update the UR tag as suggested

Done

callumfare · 2024-06-27T12:18:02Z

@intel/llvm-gatekeepers Please merge

Support multiple error reports

9615480

AllanZyne requested review from a team as code owners May 29, 2024 01:55

AllanZyne requested a review from steffenlarsen May 29, 2024 01:55

AllanZyne mentioned this pull request May 29, 2024

[DeviceSanitizer] Support multiple error reports (-fsanitize-recover=address) oneapi-src/unified-runtime#1677

Merged

fix tests

90cfa79

AllanZyne had a problem deploying to WindowsCILock May 29, 2024 01:58 — with GitHub Actions Error

update ur repo

d094485

AllanZyne requested a review from a team as a code owner May 29, 2024 01:59

AllanZyne temporarily deployed to WindowsCILock May 29, 2024 02:00 — with GitHub Actions Inactive

AllanZyne temporarily deployed to WindowsCILock May 29, 2024 02:32 — with GitHub Actions Inactive

steffenlarsen approved these changes May 29, 2024

mdtoguchi reviewed Jun 6, 2024

clang/lib/Driver/SanitizerArgs.cpp (resolved)

Merge branch 'sycl' into review/yang/multiple_reports

2e3b590

AllanZyne had a problem deploying to WindowsCILock June 11, 2024 07:36 — with GitHub Actions Error

add driver test

9dca074

AllanZyne had a problem deploying to WindowsCILock June 11, 2024 07:38 — with GitHub Actions Failure

wenju-he reviewed Jun 11, 2024

AllanZyne had a problem deploying to WindowsCILock June 11, 2024 08:22 — with GitHub Actions Failure

AllanZyne requested a review from zhaomaosu June 11, 2024 08:54

mdtoguchi approved these changes Jun 11, 2024

AllanZyne requested a review from wenju-he June 12, 2024 01:44

Merge branch 'sycl' into review/yang/multiple_reports

2c85b6f

AllanZyne temporarily deployed to WindowsCILock June 13, 2024 06:02 — with GitHub Actions Inactive

AllanZyne temporarily deployed to WindowsCILock June 13, 2024 06:35 — with GitHub Actions Inactive

wenju-he approved these changes Jun 19, 2024

Merge branch 'sycl' into review/yang/multiple_reports

da0db46

AllanZyne requested a review from a team as a code owner June 20, 2024 03:36

AllanZyne had a problem deploying to WindowsCILock June 20, 2024 03:44 — with GitHub Actions Failure

AllanZyne temporarily deployed to WindowsCILock June 20, 2024 04:27 — with GitHub Actions Inactive

wenju-he approved these changes Jun 25, 2024

callumfare reviewed Jun 27, 2024

sycl/plugins/unified_runtime/CMakeLists.txt (outdated, resolved)

callumfare reviewed Jun 27, 2024

sycl/plugins/unified_runtime/CMakeLists.txt (outdated, resolved)

AllanZyne and others added 2 commits June 27, 2024 17:44

Update sycl/plugins/unified_runtime/CMakeLists.txt

9a37182

Co-authored-by: Callum Fare <[email protected]>

Merge branch 'sycl' into review/yang/multiple_reports

fe8247d

AllanZyne temporarily deployed to WindowsCILock June 27, 2024 09:47 — with GitHub Actions Inactive

callumfare approved these changes Jun 27, 2024

AllanZyne temporarily deployed to WindowsCILock June 27, 2024 10:22 — with GitHub Actions Inactive

sommerlukas merged commit 7b4fbac into sycl Jun 27, 2024
14 checks passed

AllanZyne deleted the review/yang/multiple_reports branch June 28, 2024 05:05
