Skip to content

Commit 6cfd2cb

Browse files
author
Mike Kinsner
authored
[SYCL][Doc] Release SYCL_INTEL_enqueue_barrier extension document (#1199)
Signed-off-by: michael.kinsner <[email protected]>
1 parent d596593 commit 6cfd2cb

File tree

1 file changed

+315
-0
lines changed

1 file changed

+315
-0
lines changed
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,315 @@
1+
= SYCL_INTEL_enqueue_barrier
2+
:source-highlighter: coderay
3+
:coderay-linenums-mode: table
4+
5+
// This section needs to be after the document title.
6+
:doctype: book
7+
:toc2:
8+
:toc: left
9+
:encoding: utf-8
10+
:lang: en
11+
12+
:blank: pass:[ +]
13+
14+
// Set the default source code type in this document to C++,
15+
// for syntax highlighting purposes. This is needed because
16+
// docbook uses c++ and html5 uses cpp.
17+
:language: {basebackend@docbook:c++:cpp}
18+
19+
== Introduction
20+
IMPORTANT: This specification is a draft.
21+
22+
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.
23+
24+
NOTE: This document is better viewed when rendered as html with asciidoctor. GitHub does not render image icons.
25+
26+
This document presents a series of changes proposed for a future version of the SYCL Specification. The goal of this proposal is to provide non-blocking APIs that provide synchronization on SYCL command queue for programmers.
27+
28+
29+
== Name Strings
30+
31+
+SYCL_INTEL_enqueue_barrier+
32+
33+
== Notice
34+
35+
Copyright (c) 2019-2020 Intel Corporation. All rights reserved.
36+
37+
== Status
38+
39+
Final Draft
40+
41+
This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.
42+
43+
Because the interfaces defined by this specification are not final and are subject to change they are not intended to be used by shipping software products.
44+
45+
== Version
46+
47+
Built On: {docdate} +
48+
Revision: 1
49+
50+
== Contact
51+
Please open an issue in the https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions/[extensions repository]
52+
53+
== Dependencies
54+
55+
This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-6.
56+
57+
== Overview
58+
59+
SYCL 1.2.1 defines a graph-based task execution model, based on kernels or explicit memory operations submitted to out-of-order queues. Dependencies between these kernels are represented by
60+
accessors that form data dependence edges in the execution graph. The USM extension <<usmlink,[1]>> doesn't have accessors, so instead solves
61+
this by defining `handler::depends_on` methods to specify event-based control dependencies between command groups.
62+
63+
There are situations where defining dependencies based on events is more explicit than desired or required by an application. For instance, the user may know that a given task depends on all previously submitted tasks. Instead of explicitly adding all the required depends_on calls, the user could express this intent via a single call, making the program more concise and explicit.
64+
65+
To simplify the interface, this extension adds an enqueued barrier that provides synchronization on SYCL command
66+
queues, with the following properties:
67+
68+
1. Enqueued barriers are non-blocking from the host perspective. The barrier is enqueued, so operates as part of the execution graph asynchronously from host program execution
69+
2. Command groups submitted to the same queue after the barrier is enqueued are not scheduled for execution until the barrier wait conditions are satisfied
70+
71+
This proposal adds two new members to the `handler` class, and
72+
two new members to the `queue` class:
73+
74+
[cols="70,70"]
75+
[grid="rows"]
76+
[options="header"]
77+
|========================================
78+
|*handler::barrier*|*queue::submit_barrier*
79+
|`void barrier()` | `event submit_barrier()`
80+
|`void barrier( const vector_class<event> &waitList )` | `event submit_barrier( const vector_class<event> &waitList )`
81+
|========================================
82+
83+
The first variant of the barrier takes no parameters, and waits for all previously submitted commands to the queue to enter the `info::event_command_status::complete` state before any command later submitted to the same queue is allowed to execute. A second variant of the barrier accepts a list of events, with the behavior that no commands submitted to the same queue after barrier submission may execute until all events in the `waitList` have entered the `info::event_command_status::complete` state. Both variants are non-blocking from the host program perspective, in that they do not wait for the barrier conditions to have been met before returning.
84+
85+
The new barrier operations implicitly add dependence edges to the SYCL task execution graph, and do not have other side effects.
86+
87+
Some forms of the new barrier methods return an `event`, which can be used to perform other synchronization operations (e.g. `depends_on()`). The event from one of the enqueued barrier operations enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (explicitly or implicitly) have entered the `info::event_command_status::complete` state.
88+
89+
90+
== Example Scenarios
91+
92+
=== Scenario 1: Enqueued barrier dependent on all commands previously submitted to the same queue
93+
94+
CG4 doesn't execute until all previous command groups submitted to the same queue (CG1, CG2, CG3) have entered the completed state.
95+
96+
==== 1. Using `handler::barrier()`:
97+
98+
[source,c++,NoName,linenums]
99+
----
100+
...
101+
Queue.submit([&](cl::sycl::handler& cgh) {
102+
// CG1
103+
});
104+
Queue.submit([&](cl::sycl::handler& cgh) {
105+
// CG2
106+
});
107+
Queue.submit([&](cl::sycl::handler& cgh) {
108+
// CG3
109+
});
110+
111+
Queue.submit([&](cl::sycl::handler& cgh) {
112+
cgh.barrier();
113+
});
114+
115+
Queue.submit([&](cl::sycl::handler& cgh) {
116+
// CG4
117+
});
118+
...
119+
----
120+
121+
==== 2. Using `queue::submit_barrier()`:
122+
123+
[source,c++,NoName,linenums]
124+
----
125+
...
126+
Queue.submit([&](cl::sycl::handler& cgh) {
127+
// CG1
128+
});
129+
Queue.submit([&](cl::sycl::handler& cgh) {
130+
// CG2
131+
});
132+
Queue.submit([&](cl::sycl::handler& cgh) {
133+
// CG3
134+
});
135+
136+
Queue.submit_barrier();
137+
138+
Queue.submit([&](cl::sycl::handler& cgh) {
139+
// CG4
140+
});
141+
...
142+
----
143+
144+
145+
=== Scenario 2: Enqueued barrier dependent on specific events from previously submitted commands
146+
147+
CG3 requires CG1 (in Queue1) and CG2 (in Queue2) to have completed before it (CG3) begins execution.
148+
149+
==== 1. Using `handler::barrier()`:
150+
151+
[source,c++,NoName,linenums]
152+
----
153+
...
154+
auto event_barrier1 = Queue1.submit([&](cl::sycl::handler& cgh) {
155+
// CG1
156+
});
157+
158+
auto event_barrier2 = Queue2.submit([&](cl::sycl::handler& cgh) {
159+
// CG2
160+
});
161+
162+
Queue3.submit([&](cl::sycl::handler& cgh) {
163+
cgh.barrier( vector_class<event>{event_barrier1, event_barrier2} );
164+
});
165+
166+
Queue3.submit([&](cl::sycl::handler& cgh) {
167+
// CG3
168+
});
169+
...
170+
----
171+
172+
==== 2. Using `queue::submit_barrier()`:
173+
174+
[source,c++,NoName,linenums]
175+
----
176+
...
177+
auto event_barrier1 = Queue1.submit([&](cl::sycl::handler& cgh) {
178+
// CG1
179+
});
180+
181+
auto event_barrier2 = Queue2.submit([&](cl::sycl::handler& cgh) {
182+
// CG2
183+
});
184+
185+
Queue3.submit_barrier( vector_class<event>{event_barrier1, event_barrier2} );
186+
187+
Queue3.submit([&](cl::sycl::handler& cgh) {
188+
// CG3
189+
});
190+
...
191+
----
192+
193+
== Specification changes
194+
195+
=== Modify part of Section 4.6.5.1
196+
197+
*Change from:*
198+
[source,c++,NoName,linenums]
199+
----
200+
...
201+
template <typename T>
202+
event submit(T cgf, const queue &secondaryQueue);
203+
204+
void wait();
205+
...
206+
----
207+
*To:*
208+
[source,c++,NoName,linenums]
209+
----
210+
...
211+
template <typename T>
212+
event submit(T cgf, const queue &secondaryQueue);
213+
214+
event submit_barrier();
215+
216+
event submit_barrier( const vector_class<event> &waitList );
217+
218+
void wait();
219+
...
220+
----
221+
=== Add rows to Table 4.22
222+
223+
[cols="70,300"]
224+
[grid="rows"]
225+
[options="header"]
226+
|========================================
227+
|*Member functions*|*Description*
228+
|`event submit_barrier()` | Same effect as submitting a `handler::barrier()` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (implicitly from all previously submitted commands to the same queue) have entered the `info::event_command_status::complete` state.
229+
|`event submit_barrier( const vector_class<event> &waitList )` | Same effect as submitting a `handler:barrier( const vector_class<event> &waitList )` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (explicitly from `waitList`) have entered the `info::event_command_status::complete` state.
230+
|========================================
231+
232+
233+
=== Modify Section 4.8.2
234+
235+
==== Change first sentence from:
236+
A command group scope in SYCL, as it is defined in Section 3.4.1, consists of a single kernel or explicit memory
237+
operation (handler methods such as copy, update_host, fill), together with its requirements.
238+
239+
==== To:
240+
241+
A command group scope in SYCL, as it is defined in Section 3.4.1, consists of a single kernel, explicit memory
242+
operation (handler methods such as copy, update_host, fill) or barrier, together with its requirements.
243+
244+
=== Modify part of Section 4.8.3
245+
246+
*Change from:*
247+
[source,c++,NoName,linenums]
248+
----
249+
...
250+
template<typename T, int dim, access::mode mode, access::target tgt>
251+
void fill(accessor<T, dim, mode, tgt> dest, const T& src);
252+
253+
};
254+
...
255+
----
256+
257+
*To:*
258+
[source,c++,NoName,linenums]
259+
----
260+
...
261+
template<typename T, int dim, access::mode mode, access::target tgt>
262+
void fill(accessor<T, dim, mode, tgt> dest, const T& src);
263+
264+
void barrier();
265+
266+
void barrier( const vector_class<event> &waitList );
267+
268+
};
269+
...
270+
----
271+
272+
=== Add a new section between Section 4.8.6 and 4.8.7
273+
274+
4.8.X SYCL functions for enqueued synchronization barriers
275+
276+
Barriers may be submitted to a queue, with the effect that they prevent later operations submitted to the same queue from executing until the barrier wait conditions have been satisfied. The wait conditions can be explicitly described by `waitList` or implicitly from all previously submitted commands to the same queue. There are no constraints on the context from which queues may participate in the `waitList`. Enqueued barriers do not block host program execution, but instead form additional dependence edges with the execution task graph.
277+
278+
Barriers can be created by two members of the `handler` class that force synchronization on the SYCL command queue. The first variant of the `handler` barrier (`handler::barrier()`) takes no parameters, and waits for all previously submitted commands to the queue to enter the `info::event_command_status::complete` state before any command later submitted to the same queue is allowed to execute. The second variant of the `handler` barrier (`handler::barrier( const vector_class<event> &waitList )`) accepts a list of events, with the behavior that no commands submitted to the same queue after barrier submission may execute until all events in the waitList have entered the `info::event_command_status::complete` state.
279+
280+
=== Add a new table in the new section between 4.8.6 and 4.8.7: Member functions of the handler class.
281+
282+
[cols="70,300"]
283+
[grid="rows"]
284+
[options="header"]
285+
|========================================
286+
|*Member functions*|*Description*
287+
|`void barrier()` | Prevents any commands submitted afterward to this queue from executing until all commands previously submitted to this queue have entered the `info::event_command_status::complete` state.
288+
|`void barrier( const vector_class<event> &waitList` ) | Prevents any commands submitted afterward to this queue from executing until all events in `waitList` have entered the `info::event_command_status::complete` state. If `waitList` is empty, then the barrier has no effect.
289+
|========================================
290+
291+
== References
292+
1. [[usmlink]]https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/USM.adoc
293+
294+
== Issues
295+
296+
None.
297+
298+
== Revision History
299+
300+
[cols="5,15,15,70"]
301+
[grid="rows"]
302+
[options="header"]
303+
|========================================
304+
|Rev|Date|Author|Changes
305+
|1|2020-02-26|Ye Ting|*Initial public release*
306+
|========================================
307+
308+
//************************************************************************
309+
//Other formatting suggestions:
310+
//
311+
//* Use *bold* text for host APIs, or [source] syntax highlighting.
312+
//* Use +mono+ text for device APIs, or [source] syntax highlighting.
313+
//* Use +mono+ text for extension names, types, or enum values.
314+
//* Use _italics_ for parameters.
315+
//************************************************************************

0 commit comments

Comments
 (0)