Commit d9b178f

[SYCL][Doc] Update sub-group extension docs (#1330)
Splits sub-group functionality into three extensions:

- SubGroup (sub_group class and device queries)
- SubGroupAlgorithms (GroupAlgorithm support and permute)
- GroupMask (sub_group::mask_type and ballot)

Signed-off-by: John Pennycook <[email protected]>
1 parent b18a566 commit d9b178f

7 files changed: +741 -284 lines changed
+3
@@ -0,0 +1,3 @@
1+
# SYCL_INTEL_group_mask
2+
3+
A new `group_mask` class providing an ability to efficiently represent subsets of work-items in a group for which a given Boolean condition holds.
@@ -0,0 +1,290 @@
1+
= SYCL_INTEL_group_mask
2+
:source-highlighter: coderay
3+
:coderay-linenums-mode: table
4+
5+
// This section needs to be after the document title.
6+
:doctype: book
7+
:toc2:
8+
:toc: left
9+
:encoding: utf-8
10+
:lang: en
11+
12+
:blank: pass:[ +]
13+
14+
// Set the default source code type in this document to C++,
15+
// for syntax highlighting purposes. This is needed because
16+
// docbook uses c++ and html5 uses cpp.
17+
:language: {basebackend@docbook:c++:cpp}
18+
19+
== Introduction
20+
IMPORTANT: This specification is a draft.
21+
22+
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.
23+
24+
NOTE: This document is better viewed when rendered as HTML with asciidoctor. GitHub does not render image icons.
25+
26+
This document describes an extension which adds a +group_mask+ type. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds. Group mask functionality is currently limited to groups that are instances of the +sub_group+ class.
27+
28+
== Name Strings
29+
30+
+SYCL_INTEL_group_mask+
31+
32+
== Notice
33+
34+
Copyright (c) 2020 Intel Corporation. All rights reserved.
35+
36+
== Status
37+
38+
Working Draft
39+
40+
This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.
41+
42+
Because the interfaces defined by this specification are not final and are subject to change, they are not intended to be used by shipping software products.
43+
44+
== Version
45+
46+
Built On: {docdate} +
47+
Revision: 1
48+
49+
== Contact
50+
John Pennycook, Intel (john 'dot' pennycook 'at' intel 'dot' com)
51+
52+
== Dependencies
53+
54+
This extension is written against the SYCL 1.2.1 specification, Revision 6 and the following extensions:
55+
56+
- +SYCL_INTEL_sub_group+
57+
58+
== Overview
59+
60+
A group mask is a type sized such that each work-item in the group is represented by a single bit. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds.
61+
62+
Group mask functionality is currently limited to groups that are instances of the +sub_group+ class, but this limitation may be lifted in a future version of the specification.
63+
64+
=== Ballot
65+
66+
The +ballot+ algorithm converts a Boolean condition from each work-item in the group into a group mask. Like other group algorithms, +ballot+ must be encountered by all work-items in the group in converged control flow.
67+
68+
|===
69+
|Member Functions|Description
70+
71+
|+template <typename Group> Group::mask_type ballot(bool predicate = true) const+
72+
|Return a +group_mask+ representing the set of work-items in the group for which _predicate_ is +true+.
73+
|===
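
The mapping performed by +ballot+ can be illustrated with a small host-side model. This is only a sketch in standard C++, not SYCL device code: `ballot_model`, `ballot_mask_model`, and `kBallotModelWidth` are hypothetical names that do not appear in this extension, and the 32-bit width is an assumed sub-group size chosen for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed maximum sub-group size for this illustration only.
constexpr std::size_t kBallotModelWidth = 32;
using ballot_mask_model = std::bitset<kBallotModelWidth>;

// Hypothetical model of ballot: element i of `predicates` stands for the
// predicate value supplied by the work-item with id i. Bit i of the result
// is set exactly when that predicate is true.
inline ballot_mask_model ballot_model(const std::vector<bool>& predicates) {
  assert(predicates.size() <= kBallotModelWidth);
  ballot_mask_model mask;
  for (std::size_t i = 0; i < predicates.size(); ++i) {
    mask.set(i, predicates[i]);
  }
  return mask;
}
```

In this model, four work-items voting `true, false, true, true` produce a mask with bits 0, 2, and 3 set. On a device, every work-item in the sub-group would instead call +ballot+ with its own predicate and receive the same mask.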
74+
75+
=== Group Masks
76+
77+
The group mask type is an opaque type, permitting implementations to use any mask representation subject to the following restrictions:
78+
79+
- The size and alignment of the mask type must be the same on the host and device
80+
- A SYCL implementation supporting OpenCL interoperability must use a 128-bit mask convertible to a +vec<uint32_t,4>+
81+
82+
Functions declared in the +group_mask+ class can be called independently by different work-items in the same group. An instance of a group class (e.g. +group+ or +sub_group+) is not required to manipulate a group mask.
83+
84+
The mask is defined such that the least significant bit (LSB) corresponds to the work-item with id 0, and the most significant bit (MSB) corresponds to the work-item with the id +max_local_range()-1+.
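
The bit ordering can be made concrete with a host-side model built on `std::bitset`, which also places position 0 at the least significant bit. This is a sketch under stated assumptions: `mask_from_ids`, `ordering_mask`, and `kOrderingWidth` are hypothetical names, and a width of 8 stands in for the sub-group's maximum local range.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Assumed stand-in for max_local_range(); chosen small for readability.
constexpr std::size_t kOrderingWidth = 8;
using ordering_mask = std::bitset<kOrderingWidth>;

// Hypothetical helper: returns a mask model in which the bit for each
// listed work-item id is set. Id 0 maps to the LSB and id
// kOrderingWidth - 1 maps to the MSB.
inline ordering_mask mask_from_ids(std::initializer_list<std::size_t> ids) {
  ordering_mask mask;
  for (std::size_t id : ids) {
    assert(id < kOrderingWidth);
    mask.set(id);
  }
  return mask;
}
```

Under this model, a mask containing only work-item 0 has the integer value 1, and a mask containing only the last work-item has the value 0x80.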
85+
86+
|===
87+
|Member Function|Description
88+
89+
|+bool operator[](id<1> id) const+
90+
|Return +true+ if the bit corresponding to the specified _id_ is set in the mask.
91+
92+
|+group_mask::reference operator[](id<1> id)+
93+
|Return a reference to the bit corresponding to the specified _id_ in the mask.
94+
95+
|+bool test(id<1> id) const+
96+
|Return +true+ if the bit corresponding to the specified _id_ is set in the mask.
97+
98+
|+bool all() const+
99+
|Return +true+ if all bits in the mask are set.
100+
101+
|+bool any() const+
102+
|Return +true+ if any bits in the mask are set.
103+
104+
|+bool none() const+
105+
|Return +true+ if none of the bits in the mask are set.
106+
107+
|+uint32_t count() const+
108+
|Return the number of bits set in the mask.
109+
110+
|+uint32_t size() const+
111+
|Return the number of bits in the mask.
112+
113+
|+id<1> find_low() const+
114+
|Return the lowest +id+ with a corresponding bit set in the mask. If no bits are set, the return value is equal to `size()`.
115+
116+
|+id<1> find_high() const+
117+
|Return the highest +id+ with a corresponding bit set in the mask. If no bits are set, the return value is equal to `size()`.
118+
119+
|+template <typename T = vec<uint32_t,4>> void insert_bits(T bits, id<1> pos = 0)+
120+
|Insert `CHAR_BIT * sizeof(T)` bits into the mask, starting from _pos_. `T` must be an integral type or a SYCL vector of integral types. _pos_ must be a multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final (_pos_ + `CHAR_BIT * sizeof(T)`) - `size()` bits of _bits_ are ignored.
121+
122+
|+template <typename T = vec<uint32_t,4>> T extract_bits(id<1> pos = 0) const+
123+
|Return `CHAR_BIT * sizeof(T)` bits from the mask, starting from _pos_. `T` must be an integral type or a SYCL vector of integral types. _pos_ must be a multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final (_pos_ + `CHAR_BIT * sizeof(T)`) - `size()` bits of the return value are zero.
124+
125+
|+void set()+
126+
|Set all bits in the mask to true.
127+
128+
|+void set(id<1> id, bool value = true)+
129+
|Set the bit corresponding to the specified _id_ to the value specified by _value_.
130+
131+
|+void reset()+
132+
|Reset all bits in the mask.
133+
134+
|+void reset(id<1> id)+
135+
|Reset the bit corresponding to the specified _id_.
136+
137+
|+void reset_low()+
138+
|Reset the bit for the lowest +id+ with a corresponding bit set in the mask. Functionally equivalent to +reset(find_low())+.
139+
140+
|+void reset_high()+
141+
|Reset the bit for the highest +id+ with a corresponding bit set in the mask. Functionally equivalent to +reset(find_high())+.
142+
143+
|+void flip()+
144+
|Toggle the values of all bits in the mask.
145+
146+
|+void flip(id<1> id)+
147+
|Toggle the value of the bit corresponding to the specified _id_.
148+
149+
|+bool operator==(group_mask rhs) const+
150+
|Return true if each bit in this mask is equal to the corresponding bit in +rhs+.
151+
152+
|+bool operator!=(group_mask rhs) const+
153+
|Return true if any bit in this mask is not equal to the corresponding bit in +rhs+.
154+
155+
|+group_mask operator &=(group_mask rhs)+
156+
|Set the bits of this mask to the result of performing a bitwise AND with this mask and +rhs+.
157+
158+
|+group_mask operator |=(group_mask rhs)+
159+
|Set the bits of this mask to the result of performing a bitwise OR with this mask and +rhs+.
160+
161+
|+group_mask operator ^=(group_mask rhs)+
162+
|Set the bits of this mask to the result of performing a bitwise XOR with this mask and +rhs+.
163+
164+
|+group_mask operator <<=(size_t shift)+
165+
|Set the bits of this mask to the result of shifting its bits _shift_ positions to the left using a logical shift. Bits that are shifted out to the left are discarded, and zeroes are shifted in from the right.
166+
167+
|+group_mask operator >>=(size_t shift)+
168+
|Set the bits of this mask to the result of shifting its bits _shift_ positions to the right using a logical shift. Bits that are shifted out to the right are discarded, and zeroes are shifted in from the left.
169+
170+
|+group_mask operator ~() const+
171+
|Return a mask representing the result of flipping all the bits in this mask.
172+
173+
|+group_mask operator <<(size_t shift) const+
174+
|Return a mask representing the result of shifting its bits _shift_ positions to the left using a logical shift. Bits that are shifted out to the left are discarded, and zeroes are shifted in from the right.
175+
176+
|+group_mask operator >>(size_t shift) const+
177+
|Return a mask representing the result of shifting its bits _shift_ positions to the right using a logical shift. Bits that are shifted out to the right are discarded, and zeroes are shifted in from the left.
178+
|===
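
The tail handling of +insert_bits+ and +extract_bits+ is easiest to see with a worked case. The sketch below is a host-side model for unsigned integral `T` only (SYCL vectors are not modelled). `insert_bits_model` and `extract_bits_model` are hypothetical names, and two behaviours are assumptions rather than statements of this specification: bit 0 of the value corresponds to position _pos_, and inserting overwrites the existing bits in the affected range.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical model of insert_bits for an unsigned integral T.
// Assumption: bit i of `bits` is written to mask position pos + i.
// Positions at or beyond N do not exist, so those bits are ignored.
template <typename T, std::size_t N>
void insert_bits_model(std::bitset<N>& mask, T bits, std::size_t pos = 0) {
  constexpr std::size_t width = CHAR_BIT * sizeof(T);
  assert(pos < N && pos % width == 0);
  for (std::size_t i = 0; i < width && pos + i < N; ++i) {
    mask.set(pos + i, ((bits >> i) & T{1}) != 0);
  }
}

// Hypothetical model of extract_bits for an unsigned integral T.
// Bits that would come from positions at or beyond N are left as zero.
template <typename T, std::size_t N>
T extract_bits_model(const std::bitset<N>& mask, std::size_t pos = 0) {
  constexpr std::size_t width = CHAR_BIT * sizeof(T);
  assert(pos < N && pos % width == 0);
  T result = 0;
  for (std::size_t i = 0; i < width && pos + i < N; ++i) {
    if (mask.test(pos + i)) {
      result |= static_cast<T>(T{1} << i);
    }
  }
  return result;
}
```

For a 12-bit mask model, an 8-bit value, and _pos_ = 8, the expression _pos_ + `CHAR_BIT * sizeof(T)` is 16, which exceeds `size()` by 4. Inserting 0xFF therefore sets only positions 8 to 11, and extracting from the same position yields 0x0F with the upper four bits zero.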
179+
180+
|===
181+
|Function|Description
182+
183+
|+group_mask operator &(const group_mask& lhs, const group_mask& rhs)+
184+
|Return a mask representing the result of performing a bitwise AND of +lhs+ and +rhs+.
185+
186+
|+group_mask operator |(const group_mask& lhs, const group_mask& rhs)+
187+
|Return a mask representing the result of performing a bitwise OR of +lhs+ and +rhs+.
188+
189+
|+group_mask operator ^(const group_mask& lhs, const group_mask& rhs)+
190+
|Return a mask representing the result of performing a bitwise XOR of +lhs+ and +rhs+.
191+
192+
|===
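
A common use of these operators is to combine two masks and then visit each remaining work-item id in order. The following host-side sketch models that pattern with `std::bitset`; `find_low_model`, `find_high_model`, `ids_in_both`, `lanes_mask`, and `kLanes` are hypothetical names, and the width of 16 is an assumed sub-group size. The model follows the member-function table above in returning `size()` when no bit is set.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sub-group size for this illustration only.
constexpr std::size_t kLanes = 16;
using lanes_mask = std::bitset<kLanes>;

// Models find_low(): lowest id whose bit is set, or size() if none is set.
inline std::size_t find_low_model(const lanes_mask& m) {
  for (std::size_t i = 0; i < m.size(); ++i) {
    if (m.test(i)) return i;
  }
  return m.size();
}

// Models find_high(): highest id whose bit is set, or size() if none is set.
inline std::size_t find_high_model(const lanes_mask& m) {
  for (std::size_t i = m.size(); i-- > 0;) {
    if (m.test(i)) return i;
  }
  return m.size();
}

// Collects the ids present in both masks, lowest first, by repeatedly
// finding the lowest set bit and resetting it (the effect of reset_low()).
inline std::vector<std::size_t> ids_in_both(const lanes_mask& a,
                                            const lanes_mask& b) {
  lanes_mask remaining = a & b;
  std::vector<std::size_t> ids;
  while (remaining.any()) {
    const std::size_t id = find_low_model(remaining);
    ids.push_back(id);
    remaining.reset(id);
  }
  assert(ids.size() == (a & b).count());
  return ids;
}
```

With `a` holding ids 1, 2, and 4 and `b` holding ids 0, 1, and 2, the intersection visits ids 1 and 2 in that order. Checking `any()` before calling `find_low_model` avoids acting on the `size()` sentinel, which does not name a valid work-item.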
193+
194+
==== Sample Header
195+
196+
[source, c++]
197+
----
198+
namespace cl {
199+
namespace sycl {
200+
namespace intel {
201+
202+
struct group_mask {
203+
204+
// enable reference to individual bit
205+
struct reference {
206+
reference& operator=(bool x);
207+
reference& operator=(const reference& x);
208+
bool operator~() const;
209+
operator bool() const;
210+
reference& flip();
211+
};
212+
213+
bool operator[](id<1> id) const;
214+
reference operator[](id<1> id);
215+
bool test(id<1> id) const;
216+
bool all() const;
217+
bool any() const;
218+
bool none() const;
219+
uint32_t count() const;
220+
uint32_t size() const;
221+
id<1> find_low() const;
222+
id<1> find_high() const;
223+
224+
template <typename T = vec<uint32_t,4>>
225+
void insert_bits(T bits, id<1> pos = 0);
226+
227+
template <typename T = vec<uint32_t,4>>
228+
T extract_bits(id<1> pos = 0) const;
229+
230+
void set();
231+
void set(id<1> id, bool value = true);
232+
void reset();
233+
void reset(id<1> id);
234+
void reset_low();
235+
void reset_high();
236+
void flip();
237+
void flip(id<1> id);
238+
239+
bool operator==(group_mask rhs) const;
240+
bool operator!=(group_mask rhs) const;
241+
242+
group_mask operator &=(group_mask rhs);
243+
group_mask operator |=(group_mask rhs);
244+
group_mask operator ^=(group_mask rhs);
245+
group_mask operator <<=(size_t shift);
246+
group_mask operator >>=(size_t shift);
247+
248+
group_mask operator ~() const;
249+
group_mask operator <<(size_t) const;
250+
group_mask operator >>(size_t) const;
251+
252+
};
253+
254+
group_mask operator &(const group_mask& lhs, const group_mask& rhs);
255+
group_mask operator |(const group_mask& lhs, const group_mask& rhs);
256+
group_mask operator ^(const group_mask& lhs, const group_mask& rhs);
257+
258+
} // intel
259+
} // sycl
260+
} // cl
261+
----
262+
263+
== Issues
264+
265+
None.
266+
267+
//. asd
268+
//+
269+
//--
270+
//*RESOLUTION*: Not resolved.
271+
//--
272+
273+
== Revision History
274+
275+
[cols="5,15,15,70"]
276+
[grid="rows"]
277+
[options="header"]
278+
|========================================
279+
|Rev|Date|Author|Changes
280+
|1|2020-03-16|John Pennycook|*Initial public working draft*
281+
|========================================
282+
283+
//************************************************************************
284+
//Other formatting suggestions:
285+
//
286+
//* Use *bold* text for host APIs, or [source] syntax highlighting.
287+
//* Use +mono+ text for device APIs, or [source] syntax highlighting.
288+
//* Use +mono+ text for extension names, types, or enum values.
289+
//* Use _italics_ for parameters.
290+
//************************************************************************
+4
@@ -0,0 +1,4 @@
1+
# SYCL_INTEL_sub_group
2+
3+
A new `sub_group` class representing an implementation-defined grouping of work-items in a work-group.
4+
