Commit 300bba0

JackAKirk and ProGTX authored
[SYCL] Deprecate discard_events queue prop, make impl no-op (#18059)
Deprecates sycl_ext_oneapi_discard_queue_events in favour of the sycl_ext_oneapi_enqueue_functions extension.

To allow further simplification of the handler/scheduler code, the implementation of the `discard_events` queue property is also made a no-op immediately. The main consequence is that users of `discard_events` will have to switch to the new `submit_without_event` / `nd_launch` etc. APIs from https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc

`sycl::event` is already very lightweight in the level_zero v2 adapter, so `discard_events` already offers little performance advantage in that case. The biggest performance consequence of this PR is therefore in the cuda/hip backends. Users of these backends need to be aware of the performance implications of making `discard_events` a no-op, and should ensure that they switch to using `sycl_ext_oneapi_enqueue_functions`.

This PR leaves all other functionality working correctly.

---------

Signed-off-by: JackAKirk <[email protected]>
Co-authored-by: Peter Žužek <[email protected]>
1 parent a5c7d88 commit 300bba0

30 files changed

+268
-1219
lines changed
Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
1+
= sycl_ext_oneapi_discard_queue_events
2+
:source-highlighter: coderay
3+
:coderay-linenums-mode: table
4+
5+
// This section needs to be after the document title.
6+
:doctype: book
7+
:toc2:
8+
:toc: left
9+
:encoding: utf-8
10+
:lang: en
11+
:dpcpp: pass:[DPC++]
12+
13+
:blank: pass:[ +]
14+
15+
// Set the default source code type in this document to C++,
16+
// for syntax highlighting purposes. This is needed because
17+
// docbook uses c++ and html5 uses cpp.
18+
:language: {basebackend@docbook:c++:cpp}
19+
20+
// This is necessary for asciidoc, but not for asciidoctor
21+
:cpp: C++
22+
23+
== Introduction
24+
25+
IMPORTANT: This specification is a draft.
26+
27+
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are
28+
trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc.
29+
used by permission by Khronos.
30+
31+
This document describes an extension that introduces a `discard_events` property for
32+
SYCL queues. This property enables developers to inform a SYCL implementation that
33+
the events returned from queue operations will not be used.
34+
35+
== Notice
36+
37+
Copyright (c) 2021 Intel Corporation. All rights reserved.
38+
39+
== Status
40+
41+
This extension has been deprecated. This extension no longer provides any
42+
benefit. Although the interfaces defined in this specification are still
43+
supported in {dpcpp}, we expect that they will be removed in an upcoming {dpcpp}
44+
release. The optimizations enabled by these interfaces have already been
45+
disabled in the compiler. The functionality of this extension has been
46+
replaced by the sycl_ext_oneapi_enqueue_functions extension: see link:../experimental/sycl_ext_oneapi_enqueue_functions.asciidoc[here].
47+
*Shipping software products should stop using APIs defined in this
48+
specification and use this alternative instead.*
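
As a rough sketch of what such a migration might look like, the snippet below submits two kernels through the free functions of sycl_ext_oneapi_enqueue_functions, which return `void` instead of a `sycl::event`. The function name `run_without_events`, the buffer size and the kernel bodies are illustrative assumptions, not part of either extension. The preprocessor guards are only there so the snippet also builds (and reports "skipped") with a toolchain that lacks SYCL or the replacement extension.

```cpp
#include <cassert>
#include <cstddef>
#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#endif

// Illustrative helper (hypothetical name). Returns 1 on success, -1 on a wrong
// result, and 0 when SYCL or sycl_ext_oneapi_enqueue_functions is unavailable.
int run_without_events() {
#if defined(SYCL_EXT_ONEAPI_ENQUEUE_FUNCTIONS)
  namespace syclex = sycl::ext::oneapi::experimental;

  // No discard_events property is needed: the free functions return void.
  sycl::queue q{sycl::property::queue::in_order{}};
  constexpr std::size_t n = 64;
  int *data = sycl::malloc_shared<int>(n, q);
  assert(data != nullptr && "sketch assumes the device supports shared USM");

  // Instead of `e = q.parallel_for(...)` on a discard_events queue.
  syclex::parallel_for(q, sycl::range<1>{n},
                       [=](sycl::id<1> i) { data[i] = 1; });

  // Instead of `e = q.submit(...)` wrapping an nd_range parallel_for.
  syclex::nd_launch(q,
                    sycl::nd_range<1>{sycl::range<1>{n}, sycl::range<1>{16}},
                    [=](sycl::nd_item<1> it) { data[it.get_global_id(0)] += 1; });

  q.wait(); // synchronize through the queue rather than through events

  bool ok = true;
  for (std::size_t i = 0; i < n; ++i)
    ok = ok && (data[i] == 2);
  sycl::free(data, q);
  return ok ? 1 : -1;
#else
  return 0; // skipped: no SYCL toolchain or extension not supported
#endif
}
```

Because the queue is in-order, the second kernel runs after the first without any explicit event dependency.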
49+
50+
== Version
51+
52+
Revision: 1
53+
54+
== Contributors
55+
56+
Alexander Flegontov, Intel +
57+
Greg Lueck, Intel +
58+
John Pennycook, Intel +
59+
Vlad Romanov, Intel
60+
61+
== Dependencies
62+
63+
This extension is written against the SYCL 2020 specification, Revision 4.
64+
65+
== Feature Test Macro
66+
67+
This extension provides a feature-test macro as described in the core SYCL
68+
specification section 6.3.3 "Feature test macros". Therefore, an
69+
implementation supporting this extension must predefine the macro
70+
`SYCL_EXT_ONEAPI_DISCARD_QUEUE_EVENTS` to one of the values defined in the table below.
71+
Applications can test for the existence of this macro to determine if the
72+
implementation supports this feature, or applications can test the macro's
73+
value to determine which of the extension's APIs the implementation supports.
74+
75+
[%header,cols="1,5"]
76+
|===
77+
|Value |Description
78+
|1 |Initial extension version. Base features are supported.
79+
|===
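
A minimal sketch of such a check follows. The helper name `discard_queue_events_version` is hypothetical, and the `__has_include` guard is only an assumption made so that the snippet also builds without a SYCL toolchain.

```cpp
#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp> // defines the feature-test macro when supported
#endif

// Hypothetical helper: returns the supported revision of the extension,
// or 0 when the implementation does not define the feature-test macro.
constexpr int discard_queue_events_version() {
#ifdef SYCL_EXT_ONEAPI_DISCARD_QUEUE_EVENTS
  return SYCL_EXT_ONEAPI_DISCARD_QUEUE_EVENTS;
#else
  return 0;
#endif
}

// Compile-time flag an application could branch on.
constexpr bool has_discard_queue_events = discard_queue_events_version() >= 1;
```

Testing the value rather than mere existence lets an application distinguish revisions if more rows are ever added to the table above.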
80+
81+
== Overview
82+
83+
This extension adds the `ext::oneapi::property::queue::discard_events` property for `sycl::queue`.
84+
By using this property, the application informs a SYCL implementation that it will not use the event
85+
returned by any of the `queue` member functions (e.g. `submit`, `parallel_for`, `copy`, `memset` and others).
86+
When the application creates a queue with this property,
87+
the implementation may be able to optimize some operations on the `queue`.
88+
The `discard_events` property is incompatible with `enable_profiling`.
89+
Attempting to construct a `queue` with both properties throws an exception with the `errc::invalid` error code.
90+
91+
Below is a usage example:
92+
[source,c++]
93+
----
94+
sycl::property_list props{ext::oneapi::property::queue::discard_events{},
95+
property::queue::in_order{}};
96+
sycl::queue Queue( props );
97+
98+
// some USM preparations ..
99+
100+
sycl::event e1, e2, e3;
101+
102+
// returning "invalid" events from each submission function:
103+
e1 = Queue.parallel_for(NDRange, [=](nd_item<1> item){ do_smth1(); });
104+
105+
e2 = Queue.single_task([=](){ do_smth2(); });
106+
107+
e3 = Queue.submit([&](handler &CGH) { CGH.parallel_for(NDRange, [=](nd_item<1> item){ do_smth3(); }); });
108+
109+
Queue.wait();
110+
----
111+
112+
In the example above, the application doesn't use the SYCL events `e1`, `e2`, `e3`
113+
and instead waits for the work to finish with `queue::wait()`.
114+
When the queue is created with the `discard_events` property,
115+
the returned events will be _invalid_ events, which are `sycl::event` objects that have limited capability.
116+
See the description of behavior for this event below for details.
117+
118+
Only the member functions whose behavior for an _invalid_ event differs from the default event behavior are described here:
119+
[source,c++]
120+
----
121+
// must throw an exception with the errc::invalid error code.
122+
std::vector<event> get_wait_list();
123+
124+
// must throw an exception with the errc::invalid error code.
125+
void wait();
126+
127+
// if an invalid event is passed into the function, must throw an exception with the errc::invalid error code.
128+
static void wait(const std::vector<event> &eventList);
129+
130+
// must throw an exception with the errc::invalid error code.
131+
void wait_and_throw();
132+
133+
// if an invalid event is passed into the function, must throw an exception with the errc::invalid error code.
134+
static void wait_and_throw(const std::vector<event> &eventList);
135+
136+
// must return info::event_command_status::ext_oneapi_unknown
137+
get_info<info::event::command_execution_status>() const;
138+
----
139+
140+
The behavior when an _invalid_ event is passed into the `handler` API:
141+
[source,c++]
142+
----
143+
// must throw an exception with the errc::invalid error code.
144+
handler::depends_on(event Event)
145+
146+
// must throw an exception with the errc::invalid error code.
147+
handler::depends_on(const std::vector<event> &Events)
148+
----
149+
150+
A new enumerator value is also added to the `info::event_command_status` enumeration,
151+
which is returned by `get_info<info::event::command_execution_status>()` as described above:
152+
[source,c++]
153+
----
154+
namespace sycl {
155+
namespace info {
156+
157+
enum class event_command_status : int {
158+
// ...
159+
ext_oneapi_unknown
160+
};
161+
162+
} // namespace info
163+
} // namespace sycl
164+
----
165+
166+
== Optimization behavior for DPC++
167+
168+
This non-normative section describes the conditions under which the DPC++ implementation provides an optimization benefit* for the `discard_events` property.
169+
170+
- The queue must be constructed with the `in_order` property.
171+
- A kernel submitted to the queue must not use the link:../supported/sycl_ext_oneapi_assert.asciidoc[fallback assert feature].
172+
- A queue operation submitted to the queue must not use streams or buffer / image accessors. However, local accessors do not inhibit optimization.
173+
- Any queue operations using the Level Zero backend temporarily work without optimization.
174+
175+
*The benefit is that no low-level event is created by the backend, which saves time.
176+
177+
See the behavior details for each condition below:
178+
179+
=== Using out-of-order queue
180+
181+
No optimization if a queue is created with the `discard_events` property and
182+
the property list does not include `in_order` property.
183+
184+
=== Using fallback assert feature
185+
186+
No optimization if the application calls the `assert` macro from a command that is submitted to the queue unless
187+
the device has native support for assertions (as specified by `aspect::ext_oneapi_native_assert`).
188+
189+
=== Using streams or buffer / image accessors (excluding local accessors)
190+
191+
No optimization if a queue operation that uses stream objects or buffer / image accessors is submitted to a queue created with
192+
the `discard_events` property. But using local accessors does not affect optimization.
193+
194+
=== Using Level Zero backend
195+
196+
Because avoiding the creation of a low-level event requires Level Zero adapter support,
197+
any queue operations using the Level Zero backend temporarily work without optimization.
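
The four conditions in this section can be read as a single predicate. The sketch below is plain C++ with hypothetical names (`SubmissionTraits`, `Backend`, `may_skip_low_level_event`). It is not a DPC++ API and only models the rules as written in this section, which, per the Status section, no longer apply once the optimizations are disabled.

```cpp
// Hypothetical model of a queue submission; none of these names exist in DPC++.
enum class Backend { opencl, level_zero, cuda, hip };

struct SubmissionTraits {
  bool queue_has_discard_events;      // queue created with discard_events
  bool queue_is_in_order;             // queue created with in_order
  bool kernel_calls_assert;           // command uses the assert macro
  bool device_has_native_assert;      // aspect::ext_oneapi_native_assert
  bool uses_stream_or_buffer_image_accessor; // local accessors do not count
  Backend backend;
};

// True when, per this section, the low-level event could be skipped.
constexpr bool may_skip_low_level_event(const SubmissionTraits &s) {
  // The fallback assert is only used when the device lacks native support.
  const bool uses_fallback_assert =
      s.kernel_calls_assert && !s.device_has_native_assert;
  return s.queue_has_discard_events && s.queue_is_in_order &&
         !uses_fallback_assert && !s.uses_stream_or_buffer_image_accessor &&
         s.backend != Backend::level_zero;
}
```

For instance, an in-order `discard_events` queue on the CUDA backend with a kernel that uses neither `assert`, streams nor buffer / image accessors satisfies the predicate, while the same submission on Level Zero does not.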
198+
199+
200+
== Issues
201+
202+
None.
203+
204+
== Revision History
205+
206+
[cols="5,15,15,70"]
207+
[grid="rows"]
208+
[options="header"]
209+
|========================================
210+
|Rev|Date|Author|Changes
211+
|1|2021-11-09|Alexander Flegontov |*Initial public working draft*
212+
|========================================
