Skip to content

[SYCL] Add ITT annotation instructions #3299

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Mar 16, 2021

Conversation

MrSidims
Copy link
Contributor

@MrSidims MrSidims commented Mar 3, 2021

This patch introduces InstrumentalAnnotationsPass pass that
adds ITT instrumentations for barrier and barrier-like functions
and atomic instructions. The pass is being included to
compiler flow only when "-fsycl-instrument-device-code" option
passed to clang driver.

Following annotations were added:

  • __itt_offload_wi_start_wrapper
    Notify tools work-item execution has started
  • __itt_offload_wi_resume_wrapper
    Notify tools work-item execution resumed (e.g. after barrier)
  • __itt_offload_wi_finish_wrapper
    Notify tools work-item execution has finished
  • __itt_offload_wg_barrier
    Notify tools work-item has reached a barrier
  • __itt_offload_atomic_op_start
    Atomic operation markup
  • __itt_offload_atomic_op_finish
    Atomic operation markup

Signed-off-by: Dmitry Sidorov [email protected]

This patch introduces InstrumentalAnnotationsPass pass to sycl-post-link
tool. The pass is being included to sycl-post-link chain only when
'-fsycl_device_code_add_instrumentation_calls' option is passed to
clang driver.

Current version of the pass create instruction to notify profiling tools
that: work item has started/finised or resumed (in case if barrier was
called) and it annotates the barrier call itself and for annotation
of atomic instructions.

Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
@MrSidims
Copy link
Contributor Author

MrSidims commented Mar 3, 2021

emitCall function will be moved to a separate header in a separate patch

@MrSidims MrSidims requested a review from vzakhari March 3, 2021 20:32
Signed-off-by: Dmitry Sidorov <[email protected]>
else if (AtomicName.contains(SPIRV_ATOMIC_STORE))
AtomicOp = ConstantInt::get(Int32Ty, 1);
else
AtomicOp = ConstantInt::get(Int32Ty, 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add an assertion that name contains something like exchange or compare_exchange here

Comment on lines 146 to 155
// TODO: Third parameter of Atomic Start/Finish annotation is an ordering
// semanticof the instruction, encoded into a value of enum, defined like this
// on user's/profiler's side:
// enum __itt_atomic_mem_order_t
// {
// __itt_mem_order_relaxed = 0,
// __itt_mem_order_acquire = 1,
// __itt_mem_order_release = 2
// }
// which isn't 1:1 mapped on SPIR-V memory ordering mask, need to align it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tagging @vzakhari and @MichaelTutin here for awareness

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that combination of AcquireRelease semantic is missing in itt annotations. SequentiallyConsistent and release-consume semantics (if SPIR-V has some form of it) are not required for analysis at the moment.

I think we can either change this enum to a set of flags (acquire, release) or to add one more value to the enum. Another option would be to pass SPIR-V or C++ ordering value as is, but that will require extra argument to pass API name and tool will need to extract required details in runtime (a bit of extra overhead).
What is your opinion?

Copy link
Contributor Author

@MrSidims MrSidims Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vote goes to

to add one more value to the enum.

with that done, I can parse SPIR-V Memory Semantic mask to enum value without semantic changes.

Copy link

@MichaelTutin MichaelTutin Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I've updated the document and ITT header file. The enum was changed to:

enum __itt_atomic_mem_order_t
{
    __itt_mem_order_relaxed = 0,
    __itt_mem_order_acquire = 1,
    __itt_mem_order_release = 2,
    __itt_mem_order_acquire_release = 3
}

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address this question first before getting to my other comments. From the first glance I don't see why this should be part of sycl-post-link.

Is there a reason why this transformation needs to be done on linked code and can't be done on object code? Please state it here. Otherwise the transformation would have to go to BackendUtils.cpp or elsewhere.

Signed-off-by: Dmitry Sidorov <[email protected]>
@keryell
Copy link
Contributor

keryell commented Mar 6, 2021

Not clear what it is about. After some Googling, this seems to be related to Intel VTune: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/api-support/instrumentation-and-tracing-technology-apis/basic-usage-and-configuration/minimizing-itt-api-overhead.html
Is it something already used in Clang/LLVM upstream?
Otherwise some documentation is always appreciated. :-)
For example to add other markers, or to recycle the infrastructure for other instrumentation frameworks.

@bader
Copy link
Contributor

bader commented Mar 7, 2021

Not clear what it is about. After some Googling, this seems to be related to Intel VTune: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/api-support/instrumentation-and-tracing-technology-apis/basic-usage-and-configuration/minimizing-itt-api-overhead.html
Is it something already used in Clang/LLVM upstream?
Otherwise some documentation is always appreciated. :-)
For example to add other markers, or to recycle the infrastructure for other instrumentation frameworks.

+1. Could folks working on this feature document the design in our internal documentation first, please? It's hard to follow changes in #3299 and #3279.

Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
@MrSidims
Copy link
Contributor Author

Push new patch set to unblock review. This comment is yet unaddressed: #3299 (comment) . Also there is an issue with passing of CodeGen option from the driver to BackendUtils.

@bader
Copy link
Contributor

bader commented Mar 11, 2021

Push new patch set to unblock review. This comment is yet unaddressed: #3299 (comment) . Also there is an issue with passing of CodeGen option from the driver to BackendUtils.

And there are merge conflicts with the sycl branch.

@MrSidims MrSidims requested a review from kbobrovs March 11, 2021 17:54
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
@MrSidims MrSidims requested a review from kbobrovs March 12, 2021 12:04
Signed-off-by: Dmitry Sidorov <[email protected]>
Signed-off-by: Dmitry Sidorov <[email protected]>
Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vzakhari
Copy link
Contributor

LGTM

Signed-off-by: Dmitry Sidorov <[email protected]>
@MrSidims MrSidims requested a review from bader March 15, 2021 15:51
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, update comments accordingly.

Signed-off-by: Dmitry Sidorov <[email protected]>
@MrSidims MrSidims requested a review from bader March 15, 2021 17:55
bader
bader previously approved these changes Mar 15, 2021
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with LLVM pass.

mdtoguchi
mdtoguchi previously approved these changes Mar 15, 2021
kbobrovs
kbobrovs previously approved these changes Mar 15, 2021
Signed-off-by: Dmitry Sidorov <[email protected]>
@MrSidims MrSidims dismissed stale reviews from kbobrovs, mdtoguchi, and bader via a41f48a March 15, 2021 20:35
Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@MrSidims
Copy link
Contributor Author

@bader can we consider the review to be done, as we have 3 LGTMs from the appropriate code owners and 3 from Vlad, Slava and Konst?

@bader bader merged commit be18b1d into intel:sycl Mar 16, 2021
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Mar 19, 2021
* upstream/sycl: (1804 commits)
  [SYCL] SYCL 2020 backend interoperability part 1 (intel#3354)
  [Driver][SYCL] Address issue when unbundling for non-FPGA archive (intel#3366)
  [SYCL][Doc] Update compiler options descriptions (intel#3340)
  [SYCL] Update ABI dump tool to disable checks with libcxx by default (intel#3370)
  [SYCL] Update the way we handle -sycl-std= based on community review feedback (intel#3371)
  [SYCL] Move tests to llvm-test-suite (intel#3359)
  [SYCL][PI][L0] Revert copy batching from intel#3232 (intel#3363)
  [SYCL] Remove unsupported option.
  [SYCL] Fix pragma setting in sycl-post-link (intel#3358)
  [SYCL] Add ITT annotation instructions (intel#3299)
  [SYCL] Propagate attributes of original kernel to wrapper kernel generated for range-rounding (intel#3306)
  [BuildBot] Uplift GPU RT version for Linux to 21.09.19150 (intel#3316)
  [SYCL] Retain PI events until they have signaled (intel#3350)
  [SYCL] Add caching when using interop constructor (intel#3327)
  Add ITT stubs and wrappers for SPIR-V devices (intel#3279)
  [SYCL] Add zero argument version of buffer::reinterpret() for SYCL 2020 (intel#3333)
  [SYCL] Restore old behavior of get() method (intel#3356)
  [Driver][SYCL][FPGA] Improve FPGA AOT when using Triple (intel#3330)
  [SYCL] Revert support for pinned_host_memory extension in Level-Zero backend. Make it a NOP (intel#3349)
  [SYCL] Remove redundant build options processing (intel#3342)
  ...
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some post-commit comments: mostly minor ones, but the one about non-constant memory semantic might be important

Comment on lines +192 to +197
if (AtomicName.contains(SPIRV_ATOMIC_LOAD))
AtomicOp = ConstantInt::get(Int32Ty, 0);
else if (AtomicName.contains(SPIRV_ATOMIC_STORE))
AtomicOp = ConstantInt::get(Int32Ty, 1);
else
AtomicOp = ConstantInt::get(Int32Ty, 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change your prefixes, you could use llvm::StringSwitch here: I mean use _Z[number]__spirv_AtomicLoad and llvm::StringSwitch::StartsWith

Comment on lines +212 to +214
uint64_t MemFlag = dyn_cast<ConstantInt>(AtomicFun->getArgOperand(2))
->getValue()
.getZExtValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that in general this is not required to be a constant value.

Also, you are using dyn_cast, but don't check if it returned nullptr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants