-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Add ITT annotation instructions #3299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Add ITT annotation instructions #3299
Conversation
This patch introduces InstrumentalAnnotationsPass pass to sycl-post-link tool. The pass is being included to sycl-post-link chain only when '-fsycl_device_code_add_instrumentation_calls' option is passed to clang driver. Current version of the pass create instruction to notify profiling tools that: work item has started/finised or resumed (in case if barrier was called) and it annotates the barrier call itself and for annotation of atomic instructions. Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
|
Signed-off-by: Dmitry Sidorov <[email protected]>
else if (AtomicName.contains(SPIRV_ATOMIC_STORE)) | ||
AtomicOp = ConstantInt::get(Int32Ty, 1); | ||
else | ||
AtomicOp = ConstantInt::get(Int32Ty, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to add an assertion that name contains something like exchange
or compare_exchange
here
// TODO: Third parameter of Atomic Start/Finish annotation is an ordering | ||
// semanticof the instruction, encoded into a value of enum, defined like this | ||
// on user's/profiler's side: | ||
// enum __itt_atomic_mem_order_t | ||
// { | ||
// __itt_mem_order_relaxed = 0, | ||
// __itt_mem_order_acquire = 1, | ||
// __itt_mem_order_release = 2 | ||
// } | ||
// which isn't 1:1 mapped on SPIR-V memory ordering mask, need to align it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tagging @vzakhari and @MichaelTutin here for awareness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that combination of AcquireRelease semantic is missing in itt annotations. SequentiallyConsistent and release-consume semantics (if SPIR-V has some form of it) are not required for analysis at the moment.
I think we can either change this enum to a set of flags (acquire, release) or to add one more value to the enum. Another option would be to pass SPIR-V or C++ ordering value as is, but that will require extra argument to pass API name and tool will need to extract required details in runtime (a bit of extra overhead).
What is your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote goes to
to add one more value to the enum.
with that done, I can parse SPIR-V Memory Semantic mask to enum value without semantic changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I've updated the document and ITT header file. The enum was changed to:
enum __itt_atomic_mem_order_t
{
__itt_mem_order_relaxed = 0,
__itt_mem_order_acquire = 1,
__itt_mem_order_release = 2,
__itt_mem_order_acquire_release = 3
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address this question first before getting to my other comments. From the first glance I don't see why this should be part of sycl-post-link.
Is there a reason why this transformation needs to be done on linked code and can't be done on object code? Please state it here. Otherwise the transformation would have to go to BackendUtils.cpp or elsewhere.
Signed-off-by: Dmitry Sidorov <[email protected]>
Not clear what it is about. After some Googling, this seems to be related to Intel VTune: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/api-support/instrumentation-and-tracing-technology-apis/basic-usage-and-configuration/minimizing-itt-api-overhead.html |
+1. Could folks working on this feature document the design in our internal documentation first, please? It's hard to follow changes in #3299 and #3279. |
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Push new patch set to unblock review. This comment is yet unaddressed: #3299 (comment) . Also there is an issue with passing of CodeGen option from the driver to BackendUtils. |
And there are merge conflicts with the |
Signed-off-by: Dmitry Sidorov <[email protected]>
…rumentalAnnotations
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM |
llvm/include/llvm/Transforms/Instrumentation/SYCLITTAnnotations.h
Outdated
Show resolved
Hide resolved
llvm/include/llvm/Transforms/Instrumentation/SYCLITTAnnotations.h
Outdated
Show resolved
Hide resolved
Signed-off-by: Dmitry Sidorov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, update comments accordingly.
llvm/include/llvm/Transforms/Instrumentation/SPIRITTAnnotations.h
Outdated
Show resolved
Hide resolved
llvm/include/llvm/Transforms/Instrumentation/SPIRITTAnnotations.h
Outdated
Show resolved
Hide resolved
Signed-off-by: Dmitry Sidorov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with LLVM pass.
Signed-off-by: Dmitry Sidorov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
@bader can we consider the review to be done, as we have 3 LGTMs from the appropriate code owners and 3 from Vlad, Slava and Konst? |
* upstream/sycl: (1804 commits) [SYCL] SYCL 2020 backend interoperability part 1 (intel#3354) [Driver][SYCL] Address issue when unbundling for non-FPGA archive (intel#3366) [SYCL][Doc] Update compiler options descriptions (intel#3340) [SYCL] Update ABI dump tool to disable checks with libcxx by default (intel#3370) [SYCL] Update the way we handle -sycl-std= based on community review feedback (intel#3371) [SYCL] Move tests to llvm-test-suite (intel#3359) [SYCL][PI][L0] Revert copy batching from intel#3232 (intel#3363) [SYCL] Remove unsupported option. [SYCL] Fix pragma setting in sycl-post-link (intel#3358) [SYCL] Add ITT annotation instructions (intel#3299) [SYCL] Propagate attributes of original kernel to wrapper kernel generated for range-rounding (intel#3306) [BuildBot] Uplift GPU RT version for Linux to 21.09.19150 (intel#3316) [SYCL] Retain PI events until they have signaled (intel#3350) [SYCL] Add caching when using interop constructor (intel#3327) Add ITT stubs and wrappers for SPIR-V devices (intel#3279) [SYCL] Add zero argument version of buffer::reinterpret() for SYCL 2020 (intel#3333) [SYCL] Restore old behavior of get() method (intel#3356) [Driver][SYCL][FPGA] Improve FPGA AOT when using Triple (intel#3330) [SYCL] Revert support for pinned_host_memory extension in Level-Zero backend. Make it a NOP (intel#3349) [SYCL] Remove redundant build options processing (intel#3342) ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some post-commit comments: mostly minor ones, but the one about non-constant memory semantic might be important
if (AtomicName.contains(SPIRV_ATOMIC_LOAD)) | ||
AtomicOp = ConstantInt::get(Int32Ty, 0); | ||
else if (AtomicName.contains(SPIRV_ATOMIC_STORE)) | ||
AtomicOp = ConstantInt::get(Int32Ty, 1); | ||
else | ||
AtomicOp = ConstantInt::get(Int32Ty, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you change your prefixes, you could use llvm::StringSwitch
here: I mean use _Z[number]__spirv_AtomicLoad
and llvm::StringSwitch::StartsWith
uint64_t MemFlag = dyn_cast<ConstantInt>(AtomicFun->getArgOperand(2)) | ||
->getValue() | ||
.getZExtValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that in general this is not required to be a constant value.
Also, you are using dyn_cast
, but don't check if it returned nullptr
This patch introduces InstrumentalAnnotationsPass pass that
adds ITT instrumentations for barrier and barrier-like functions
and atomic instructions. The pass is being included to
compiler flow only when "-fsycl-instrument-device-code" option
passed to clang driver.
Following annotations were added:
Notify tools work-item execution has started
Notify tools work-item execution resumed (e.g. after barrier)
Notify tools work-item execution has finished
Notify tools work-item has reached a barrier
Atomic operation markup
Atomic operation markup
Signed-off-by: Dmitry Sidorov [email protected]