| 1 | += SYCL_INTEL_group_mask |
| 2 | +:source-highlighter: coderay |
| 3 | +:coderay-linenums-mode: table |
| 4 | + |
| 5 | +// This section needs to be after the document title. |
| 6 | +:doctype: book |
| 7 | +:toc2: |
| 8 | +:toc: left |
| 9 | +:encoding: utf-8 |
| 10 | +:lang: en |
| 11 | + |
| 12 | +:blank: pass:[ +] |
| 13 | + |
| 14 | +// Set the default source code type in this document to C++, |
| 15 | +// for syntax highlighting purposes. This is needed because |
| 16 | +// docbook uses c++ and html5 uses cpp. |
| 17 | +:language: {basebackend@docbook:c++:cpp} |
| 18 | + |
| 19 | +== Introduction |
| 20 | +IMPORTANT: This specification is a draft. |
| 21 | + |
| 22 | +NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos. |
| 23 | + |
| 24 | +NOTE: This document is better viewed when rendered as html with asciidoctor. GitHub does not render image icons. |
| 25 | + |
| 26 | +This document describes an extension which adds a +group_mask+ type. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds. Group mask functionality is currently limited to groups that are instances of the +sub_group+ class. |
| 27 | + |
| 28 | +== Name Strings |
| 29 | + |
| 30 | ++SYCL_INTEL_group_mask+ |
| 31 | + |
| 32 | +== Notice |
| 33 | + |
| 34 | +Copyright (c) 2020 Intel Corporation. All rights reserved. |
| 35 | + |
| 36 | +== Status |
| 37 | + |
| 38 | +Working Draft |
| 39 | + |
| 40 | +This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension. |
| 41 | + |
| 42 | +Because the interfaces defined by this specification are not final and are subject to change, they are not intended to be used by shipping software products. |
| 43 | + |
| 44 | +== Version |
| 45 | + |
| 46 | +Built On: {docdate} + |
| 47 | +Revision: 1 |
| 48 | + |
| 49 | +== Contact |
| 50 | +John Pennycook, Intel (john 'dot' pennycook 'at' intel 'dot' com) |
| 51 | + |
| 52 | +== Dependencies |
| 53 | + |
| 54 | +This extension is written against the SYCL 1.2.1 specification, Revision 6 and the following extensions: |
| 55 | + |
| 56 | +- +SYCL_INTEL_sub_group+ |
| 57 | + |
| 58 | +== Overview |
| 59 | + |
| 60 | +A group mask is an integral type sized such that each work-item in the group is represented by a single bit. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds. |
| 61 | + |
| 62 | +Group mask functionality is currently limited to groups that are instances of the +sub_group+ class, but this limitation may be lifted in a future version of the specification. |
| 63 | + |
| 64 | +=== Ballot |
| 65 | + |
| 66 | +The +ballot+ algorithm converts a Boolean condition from each work-item in the group into a group mask. Like other group algorithms, +ballot+ must be encountered by all work-items in the group in converged control flow. |
| 67 | + |
| 68 | +|=== |
| 69 | +|Member Functions|Description |
| 70 | + |
| 71 | +|+template <typename Group> Group::mask_type ballot(bool predicate = true) const+ |
| 72 | +|Return a +group_mask+ representing the set of work-items in the group for which _predicate_ is +true+. |
| 73 | +|=== |
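To make the semantics of +ballot+ concrete, the following is a minimal host-side sketch that models the result with `std::bitset`. It is not the extension API: `host_mask`, `kMaxGroupSize`, and `emulate_ballot` are hypothetical names introduced for illustration, and the sketch assumes that bit _i_ of the mask corresponds to the work-item with id _i_.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical upper bound on the group size for this host-side model.
constexpr std::size_t kMaxGroupSize = 32;

// Stand-in for group_mask; the real type is opaque.
using host_mask = std::bitset<kMaxGroupSize>;

// Model of ballot: predicates[i] is the value that the work-item with
// id i would pass as its predicate. Bit i of the result is set exactly
// when predicates[i] is true.
host_mask emulate_ballot(const std::vector<bool>& predicates) {
  assert(predicates.size() <= kMaxGroupSize);
  host_mask mask;
  for (std::size_t i = 0; i < predicates.size(); ++i) {
    mask[i] = predicates[i];
  }
  return mask;
}
```

In this model, a group of four work-items passing `true, false, true, true` produces a mask with bits 0, 2 and 3 set.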
| 74 | + |
| 75 | +=== Group Masks |
| 76 | + |
| 77 | +The group mask type is an opaque type, permitting implementations to use any mask representation subject to the following restrictions: |
| 78 | + |
| 79 | +- The size and alignment of the mask type must be the same on the host and device |
| 80 | +- A SYCL implementation supporting OpenCL interoperability must use a 128-bit mask convertible to a +vec<uint,4>+ |
| 81 | + |
| 82 | +Functions declared in the +group_mask+ class can be called independently by different work-items in the same group. An instance of a group class (e.g. +group+ or +sub_group+) is not required to manipulate a group mask. |
| 83 | + |
| 84 | +The mask is defined such that the least significant bit (LSB) corresponds to the work-item with id 0, and the most significant bit (MSB) corresponds to the work-item with the id +max_local_range()-1+. |
| 85 | + |
| 86 | +|=== |
| 87 | +|Member Function|Description |
| 88 | + |
| 89 | +|+bool operator[](id<1> id) const+ |
| 90 | +|Return +true+ if the bit corresponding to the specified _id_ is set in the mask. |
| 91 | + |
| 92 | +|+group_mask::reference operator[](id<1> id)+ |
| 93 | +|Return a reference to the bit corresponding to the specified _id_ in the mask. |
| 94 | + |
| 95 | +|+bool test(id<1> id) const+ |
| 96 | +|Return +true+ if the bit corresponding to the specified _id_ is set in the mask. |
| 97 | + |
| 98 | +|+bool all() const+ |
| 99 | +|Return +true+ if all bits in the mask are set. |
| 100 | + |
| 101 | +|+bool any() const+ |
| 102 | +|Return +true+ if any bits in the mask are set. |
| 103 | + |
| 104 | +|+bool none() const+ |
| 105 | +|Return +true+ if none of the bits in the mask are set. |
| 106 | + |
| 107 | +|+uint32_t count() const+ |
| 108 | +|Return the number of bits set in the mask. |
| 109 | + |
| 110 | +|+uint32_t size() const+ |
| 111 | +|Return the number of bits in the mask. |
| 112 | + |
| 113 | +|+id<1> find_low() const+ |
| 114 | +|Return the lowest +id+ with a corresponding bit set in the mask. If no bits are set, the return value is equal to `size()`. |
| 115 | + |
| 116 | +|+id<1> find_high() const+ |
| 117 | +|Return the highest +id+ with a corresponding bit set in the mask. If no bits are set, the return value is equal to `size()`. |
| 118 | + |
| 119 | +|+template <typename T = vec<uint32_t,4>> void insert_bits(T bits, id<1> pos = 0)+ |
| 120 | +|Insert `CHAR_BIT * sizeof(T)` bits into the mask, starting from _pos_. `T` must be an integral type or a SYCL vector of integral types. _pos_ must be a multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final (_pos_ + `CHAR_BIT * sizeof(T)`) - `size()` bits of _bits_ are ignored. |
| 121 | + |
| 122 | +|+template <typename T = vec<uint32_t,4>> T extract_bits(id<1> pos = 0) const+ |
| 123 | +|Return `CHAR_BIT * sizeof(T)` bits from the mask, starting from _pos_. `T` must be an integral type or a SYCL vector of integral types. _pos_ must be a multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final (_pos_ + `CHAR_BIT * sizeof(T)`) - `size()` bits of the return value are zero. |
| 124 | + |
| 125 | +|+void set()+ |
| 126 | +|Set all bits in the mask to true. |
| 127 | + |
| 128 | +|+void set(id<1> id, bool value = true)+ |
| 129 | +|Set the bit corresponding to the specified _id_ to the value specified by _value_. |
| 130 | + |
| 131 | +|+void reset()+ |
| 132 | +|Reset all bits in the mask. |
| 133 | + |
| 134 | +|+void reset(id<1> id)+ |
| 135 | +|Reset the bit corresponding to the specified _id_. |
| 136 | + |
| 137 | +|+void reset_low()+ |
| 138 | +|Reset the bit for the lowest +id+ with a corresponding bit set in the mask. Functionally equivalent to +reset(find_low())+. |
| 139 | + |
| 140 | +|+void reset_high()+ |
| 141 | +|Reset the bit for the highest +id+ with a corresponding bit set in the mask. Functionally equivalent to +reset(find_high())+. |
| 142 | + |
| 143 | +|+void flip()+ |
| 144 | +|Toggle the values of all bits in the mask. |
| 145 | + |
| 146 | +|+void flip(id<1> id)+ |
| 147 | +|Toggle the value of the bit corresponding to the specified _id_. |
| 148 | + |
| 149 | +|+bool operator==(group_mask rhs) const+ |
| 150 | +|Return true if each bit in this mask is equal to the corresponding bit in +rhs+. |
| 151 | + |
| 152 | +|+bool operator!=(group_mask rhs) const+ |
| 153 | +|Return true if any bit in this mask is not equal to the corresponding bit in +rhs+. |
| 154 | + |
| 155 | +|+group_mask operator &=(group_mask rhs)+ |
| 156 | +|Set the bits of this mask to the result of performing a bitwise AND with this mask and +rhs+. |
| 157 | + |
| 158 | +|+group_mask operator |=(group_mask rhs)+ |
| 159 | +|Set the bits of this mask to the result of performing a bitwise OR with this mask and +rhs+. |
| 160 | + |
| 161 | +|+group_mask operator ^=(group_mask rhs)+ |
| 162 | +|Set the bits of this mask to the result of performing a bitwise XOR with this mask and +rhs+. |
| 163 | + |
| 164 | +|+group_mask operator <<=(size_t shift)+ |
| 165 | +|Set the bits of this mask to the result of shifting its bits _shift_ positions to the left using a logical shift. Bits that are shifted out to the left are discarded, and zeroes are shifted in from the right. |
| 166 | + |
| 167 | +|+group_mask operator >>=(size_t shift)+ |
| 168 | +|Set the bits of this mask to the result of shifting its bits _shift_ positions to the right using a logical shift. Bits that are shifted out to the right are discarded, and zeroes are shifted in from the left. |
| 169 | + |
| 170 | +|+group_mask operator ~() const+ |
| 171 | +|Return a mask representing the result of flipping all the bits in this mask. |
| 172 | + |
| 173 | +|+group_mask operator <<(size_t shift) const+ |
| 174 | +|Return a mask representing the result of shifting its bits _shift_ positions to the left using a logical shift. Bits that are shifted out to the left are discarded, and zeroes are shifted in from the right. |
| 175 | + |
| 176 | +|+group_mask operator >>(size_t shift) const+ |
| 177 | +|Return a mask representing the result of shifting its bits _shift_ positions to the right using a logical shift. Bits that are shifted out to the right are discarded, and zeroes are shifted in from the left. |
| 178 | +|=== |
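The query functions in the table above can be modelled on the host with `std::bitset`. The sketch below is an illustration under stated assumptions, not the extension's implementation: `host_mask`, `kMaskBits`, and the free functions `find_low`, `find_high`, and `extract_bits32` are hypothetical stand-ins, and the 40-bit mask size is chosen only so that an extraction at position 32 runs past the end of the mask.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical mask size; deliberately not a multiple of 32.
constexpr std::size_t kMaskBits = 40;
using host_mask = std::bitset<kMaskBits>;

// Lowest id whose bit is set, or size() if no bits are set.
std::size_t find_low(const host_mask& m) {
  for (std::size_t i = 0; i < m.size(); ++i) {
    if (m[i]) return i;
  }
  return m.size();
}

// Highest id whose bit is set, or size() if no bits are set.
std::size_t find_high(const host_mask& m) {
  for (std::size_t i = m.size(); i-- > 0;) {
    if (m[i]) return i;
  }
  return m.size();
}

// Model of extract_bits<uint32_t>(pos): bit i of the result is bit
// pos + i of the mask; positions past the end of the mask read as zero.
std::uint32_t extract_bits32(const host_mask& m, std::size_t pos) {
  constexpr std::size_t width = CHAR_BIT * sizeof(std::uint32_t);
  assert(pos % width == 0 && pos < m.size());
  std::uint32_t out = 0;
  for (std::size_t i = 0; i < width && pos + i < m.size(); ++i) {
    if (m[pos + i]) out |= (std::uint32_t{1} << i);
  }
  return out;
}
```

With bits 3, 35 and 39 set, `extract_bits32(m, 32)` in this model returns a value with only bits 3 and 7 set; only 8 mask bits remain from position 32, so the upper 24 bits of the result are zero.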
| 179 | + |
| 180 | +|=== |
| 181 | +|Function|Description |
| 182 | + |
| 183 | +|+group_mask operator &(const group_mask& lhs, const group_mask& rhs)+ |
| 184 | +|Return a mask representing the result of performing a bitwise AND of +lhs+ and +rhs+. |
| 185 | + |
| 186 | +|+group_mask operator |(const group_mask& lhs, const group_mask& rhs)+ |
| 187 | +|Return a mask representing the result of performing a bitwise OR of +lhs+ and +rhs+. |
| 188 | + |
| 189 | +|+group_mask operator ^(const group_mask& lhs, const group_mask& rhs)+ |
| 190 | +|Return a mask representing the result of performing a bitwise XOR of +lhs+ and +rhs+. |
| 191 | + |
| 192 | +|=== |
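One common use of the bitwise operators is computing a work-item's rank among the work-items whose bits are set, for example to assign compacted output slots. The following host-side sketch models this with `std::bitset`; `host_mask`, `kMaxGroupSize`, `lower_ids`, and `rank_among_set` are hypothetical names and are not part of this extension.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical upper bound on the group size for this host-side model.
constexpr std::size_t kMaxGroupSize = 32;
using host_mask = std::bitset<kMaxGroupSize>;

// Mask with a bit set for every id strictly lower than the given id.
host_mask lower_ids(std::size_t id) {
  assert(id < kMaxGroupSize);
  host_mask all;
  all.set();
  // Shifting left clears the low `id` bits; inverting keeps only those.
  return ~(all << id);
}

// Number of set bits in ballot_result that belong to lower ids.
std::size_t rank_among_set(const host_mask& ballot_result, std::size_t id) {
  return (ballot_result & lower_ids(id)).count();
}
```

For a mask with bits 0, 2, 3 and 5 set, the work-item with id 3 has rank 2 in this model, because ids 0 and 2 precede it.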
| 193 | + |
| 194 | +==== Sample Header |
| 195 | + |
| 196 | +[source, c++] |
| 197 | +---- |
| 198 | +namespace cl { |
| 199 | +namespace sycl { |
| 200 | +namespace intel { |
| 201 | +
|
| 202 | +struct group_mask { |
| 203 | +
|
| 204 | + // enable reference to individual bit |
| 205 | + struct reference { |
| 206 | + reference& operator=(bool x); |
| 207 | + reference& operator=(const reference& x); |
| 208 | + bool operator~() const; |
| 209 | + operator bool() const; |
| 210 | + reference& flip(); |
| 211 | + }; |
| 212 | +
|
| 213 | + bool operator[](id<1> id) const; |
| 214 | + reference operator[](id<1> id); |
| 215 | + bool test(id<1> id) const; |
| 216 | + bool all() const; |
| 217 | + bool any() const; |
| 218 | + bool none() const; |
| 219 | + uint32_t count() const; |
| 220 | + uint32_t size() const; |
| 221 | + id<1> find_low() const; |
| 222 | + id<1> find_high() const; |
| 223 | +
|
| 224 | + template <typename T = vec<uint32_t,4>> |
| 225 | + void insert_bits(T bits, id<1> pos = 0); |
| 226 | +
|
| 227 | + template <typename T = vec<uint32_t,4>> |
| 228 | + T extract_bits(id<1> pos = 0) const; |
| 229 | +
|
| 230 | + void set(); |
| 231 | + void set(id<1> id, bool value = true); |
| 232 | + void reset(); |
| 233 | + void reset(id<1> id); |
| 234 | + void reset_low(); |
| 235 | + void reset_high(); |
| 236 | + void flip(); |
| 237 | + void flip(id<1> id); |
| 238 | +
|
| 239 | + bool operator==(group_mask rhs) const; |
| 240 | + bool operator!=(group_mask rhs) const; |
| 241 | +
|
| 242 | + group_mask operator &=(group_mask rhs); |
| 243 | + group_mask operator |=(group_mask rhs); |
| 244 | + group_mask operator ^=(group_mask rhs); |
| 245 | + group_mask operator <<=(size_t shift); |
| 246 | + group_mask operator >>=(size_t shift); |
| 247 | +
|
| 248 | + group_mask operator ~() const; |
| 249 | + group_mask operator <<(size_t shift) const; |
| 250 | + group_mask operator >>(size_t shift) const; |
| 251 | +
|
| 252 | +}; |
| 253 | +
|
| 254 | +group_mask operator &(const group_mask& lhs, const group_mask& rhs); |
| 255 | +group_mask operator |(const group_mask& lhs, const group_mask& rhs); |
| 256 | +group_mask operator ^(const group_mask& lhs, const group_mask& rhs); |
| 257 | +
|
| 258 | +} // intel |
| 259 | +} // sycl |
| 260 | +} // cl |
| 261 | +---- |
| 262 | + |
| 263 | +== Issues |
| 264 | + |
| 265 | +None. |
| 266 | + |
| 267 | +//. asd |
| 268 | +//+ |
| 269 | +//-- |
| 270 | +//*RESOLUTION*: Not resolved. |
| 271 | +//-- |
| 272 | +
|
| 273 | +== Revision History |
| 274 | +
|
| 275 | +[cols="5,15,15,70"] |
| 276 | +[grid="rows"] |
| 277 | +[options="header"] |
| 278 | +|======================================== |
| 279 | +|Rev|Date|Author|Changes |
| 280 | +|1|2020-03-16|John Pennycook|*Initial public working draft* |
| 281 | +|======================================== |
| 282 | +
|
| 283 | +//************************************************************************ |
| 284 | +//Other formatting suggestions: |
| 285 | +// |
| 286 | +//* Use *bold* text for host APIs, or [source] syntax highlighting. |
| 287 | +//* Use +mono+ text for device APIs, or [source] syntax highlighting. |
| 288 | +//* Use +mono+ text for extension names, types, or enum values. |
| 289 | +//* Use _italics_ for parameters. |
| 290 | +//************************************************************************ |