Skip to content

[SYCL] Align some extensions with SYCL 2020 #4432

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions sycl/doc/CompilerAndRuntimeDesign.md
Original file line number Diff line number Diff line change
Expand Up @@ -918,8 +918,8 @@ space attributes in SYCL mode:
| Address space attribute | SYCL address_space enumeration |
|-------------------------|--------------------------------|
| `__attribute__((opencl_global))` | global_space, constant_space |
| `__attribute__((opencl_global_host))` | global_host_space |
| `__attribute__((opencl_global_device))` | global_device_space |
| `__attribute__((opencl_global_host))` | ext_intel_global_host_space |
| `__attribute__((opencl_global_device))` | ext_intel_global_device_space |
| `__attribute__((opencl_local))` | local_space |
| `__attribute__((opencl_private))` | private_space |
| `__attribute__((opencl_constant))` | N/A
Expand Down
59 changes: 38 additions & 21 deletions sycl/doc/extensions/EnqueueBarrier/enqueue_barrier.asciidoc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
= SYCL_INTEL_enqueue_barrier
= SYCL_EXT_INTEL_ENQUEUE_BARRIER
:source-highlighter: coderay
:coderay-linenums-mode: table

Expand Down Expand Up @@ -28,7 +28,7 @@ This document presents a series of changes proposed for a future version of the

== Name Strings

+SYCL_INTEL_enqueue_barrier+
+SYCL_EXT_INTEL_ENQUEUE_BARRIER+

== Notice

Expand All @@ -50,6 +50,22 @@ Revision: 1
== Contact
Please open an issue in the https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions/[extensions repository]

== Feature Test Macro

This extension provides a feature-test macro as described in the core SYCL
specification section 6.3.3 "Feature test macros". Therefore, an
implementation supporting this extension must predefine the macro
`SYCL_EXT_INTEL_ENQUEUE_BARRIER` to one of the values defined in the table below.
Applications can test for the existence of this macro to determine if the
implementation supports this feature, or applications can test the macro's
value to determine which of the extension's APIs the implementation supports.

[%header,cols="1,5"]
|===
|Value |Description
|1 |Initial extension version. Base features are supported.
|===

== Dependencies

This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-6.
Expand All @@ -75,9 +91,9 @@ two new members to the `queue` class:
[grid="rows"]
[options="header"]
|========================================
|*handler::barrier*|*queue::submit_barrier*
|`void barrier()` | `event submit_barrier()`
|`void barrier( const vector_class<event> &waitList )` | `event submit_barrier( const vector_class<event> &waitList )`
|*handler::ext_intel_barrier*|*queue::ext_intel_submit_barrier*
|`void ext_intel_barrier()` | `event ext_intel_submit_barrier()`
|`void ext_intel_barrier( const vector_class<event> &waitList )` | `event ext_intel_submit_barrier( const vector_class<event> &waitList )`
|========================================

The first variant of the barrier takes no parameters, and waits for all previously submitted commands to the queue to enter the `info::event_command_status::complete` state before any command later submitted to the same queue is allowed to execute. A second variant of the barrier accepts a list of events, with the behavior that no commands submitted to the same queue after barrier submission may execute until all events in the `waitList` have entered the `info::event_command_status::complete` state. Both variants are non-blocking from the host program perspective, in that they do not wait for the barrier conditions to have been met before returning.
Expand All @@ -93,7 +109,7 @@ Some forms of the new barrier methods return an `event`, which can be used to pe

CG4 doesn't execute until all previous command groups submitted to the same queue (CG1, CG2, CG3) have entered the completed state.

==== 1. Using `handler::barrier()`:
==== 1. Using `handler::ext_intel_barrier()`:

[source,c++,NoName,linenums]
----
Expand All @@ -109,7 +125,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
});

Queue.submit([&](cl::sycl::handler& cgh) {
cgh.barrier();
cgh.ext_intel_barrier();
});

Queue.submit([&](cl::sycl::handler& cgh) {
Expand All @@ -118,7 +134,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
...
----

==== 2. Using `queue::submit_barrier()`:
==== 2. Using `queue::ext_intel_submit_barrier()`:

[source,c++,NoName,linenums]
----
Expand All @@ -133,7 +149,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {
// CG3
});

Queue.submit_barrier();
Queue.ext_intel_submit_barrier();

Queue.submit([&](cl::sycl::handler& cgh) {
// CG4
Expand All @@ -146,7 +162,7 @@ Queue.submit([&](cl::sycl::handler& cgh) {

CG3 requires CG1 (in Queue1) and CG2 (in Queue2) to have completed before it (CG3) begins execution.

==== 1. Using `handler::barrier()`:
==== 1. Using `handler::ext_intel_barrier()`:

[source,c++,NoName,linenums]
----
Expand All @@ -160,7 +176,7 @@ auto event_barrier2 = Queue2.submit([&](cl::sycl::handler& cgh) {
});

Queue3.submit([&](cl::sycl::handler& cgh) {
cgh.barrier( vector_class<event>{event_barrier1, event_barrier2} );
cgh.ext_intel_barrier( vector_class<event>{event_barrier1, event_barrier2} );
});

Queue3.submit([&](cl::sycl::handler& cgh) {
Expand All @@ -169,7 +185,7 @@ Queue3.submit([&](cl::sycl::handler& cgh) {
...
----

==== 2. Using `queue::submit_barrier()`:
==== 2. Using `queue::ext_intel_submit_barrier()`:

[source,c++,NoName,linenums]
----
Expand All @@ -182,7 +198,7 @@ auto event_barrier2 = Queue2.submit([&](cl::sycl::handler& cgh) {
// CG2
});

Queue3.submit_barrier( vector_class<event>{event_barrier1, event_barrier2} );
Queue3.ext_intel_submit_barrier( vector_class<event>{event_barrier1, event_barrier2} );

Queue3.submit([&](cl::sycl::handler& cgh) {
// CG3
Expand Down Expand Up @@ -211,9 +227,9 @@ void wait();
template <typename T>
event submit(T cgf, const queue &secondaryQueue);

event submit_barrier();
event ext_intel_submit_barrier();

event submit_barrier( const vector_class<event> &waitList );
event ext_intel_submit_barrier( const vector_class<event> &waitList );

void wait();
...
Expand All @@ -225,8 +241,8 @@ void wait();
[options="header"]
|========================================
|*Member functions*|*Description*
|`event submit_barrier()` | Same effect as submitting a `handler::barrier()` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (implicitly from all previously submitted commands to the same queue) have entered the `info::event_command_status::complete` state.
|`event submit_barrier( const vector_class<event> &waitList )` | Same effect as submitting a `handler:barrier( const vector_class<event> &waitList )` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (explicitly from `waitList`) have entered the `info::event_command_status::complete` state.
|`event ext_intel_submit_barrier()` | Same effect as submitting a `handler::ext_intel_barrier()` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (implicitly from all previously submitted commands to the same queue) have entered the `info::event_command_status::complete` state.
|`event ext_intel_submit_barrier( const vector_class<event> &waitList )` | Same effect as submitting a `handler:ext_intel_barrier( const vector_class<event> &waitList )` within a command group to this `queue`. The returned event enters the `info::event_command_status::complete` state when all events that the barrier is dependent on (explicitly from `waitList`) have entered the `info::event_command_status::complete` state.
|========================================


Expand Down Expand Up @@ -261,9 +277,9 @@ void fill(accessor<T, dim, mode, tgt> dest, const T& src);
template<typename T, int dim, access::mode mode, access::target tgt>
void fill(accessor<T, dim, mode, tgt> dest, const T& src);

void barrier();
void ext_intel_barrier();

void barrier( const vector_class<event> &waitList );
void ext_intel_barrier( const vector_class<event> &waitList );

};
...
Expand All @@ -284,8 +300,8 @@ Barriers can be created by two members of the `handler` class that force synchro
[options="header"]
|========================================
|*Member functions*|*Description*
|`void barrier()` | Prevents any commands submitted afterward to this queue from executing until all commands previously submitted to this queue have entered the `info::event_command_status::complete` state.
|`void barrier( const vector_class<event> &waitList` ) | Prevents any commands submitted afterward to this queue from executing until all events in `waitList` have entered the `info::event_command_status::complete` state. If `waitList` is empty, then the barrier has no effect.
|`void ext_intel_barrier()` | Prevents any commands submitted afterward to this queue from executing until all commands previously submitted to this queue have entered the `info::event_command_status::complete` state.
|`void ext_intel_barrier( const vector_class<event> &waitList` ) | Prevents any commands submitted afterward to this queue from executing until all events in `waitList` have entered the `info::event_command_status::complete` state. If `waitList` is empty, then the barrier has no effect.
|========================================

== References
Expand All @@ -303,6 +319,7 @@ None.
|========================================
|Rev|Date|Author|Changes
|1|2020-02-26|Ye Ting|*Initial public release*
|2|2021-08-30|Dmitry Vodopyanov|*Updated according to SYCL 2020 reqs for extensions*
|========================================

//************************************************************************
Expand Down
13 changes: 9 additions & 4 deletions sycl/doc/extensions/LevelZeroBackend/LevelZeroBackend.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ The Level-Zero backend is added to the cl::sycl::backend enumeration:
``` C++
enum class backend {
// ...
level_zero,
ext_oneapi_level_zero,
// ...
};
```
Expand Down Expand Up @@ -55,7 +55,7 @@ and they must be included in the order shown:

``` C++
#include "level_zero/ze_api.h"
#include "sycl/backend/level_zero.hpp"
#include "sycl/ext/oneapi/backend/level_zero.hpp"
```
### 4.1 Mapping of SYCL objects to Level-Zero handles

Expand All @@ -71,7 +71,7 @@ These SYCL objects encapsulate the corresponding Level-Zero handles:

### 4.2 Obtaining of native Level-Zero handles from SYCL objects

The ```get_native<cl::sycl::backend::level_zero>()``` member function is how a raw native Level-Zero handle can be obtained
The ```get_native<cl::sycl::backend::ext_oneapi_level_zero>()``` member function is how a raw native Level-Zero handle can be obtained
for a specific SYCL object. It is currently supported for SYCL ```platform```, ```device```, ```context```, ```queue```, ```event```
and ```program``` classes. There is also a free-function defined in ```cl::sycl``` namespace that can be used instead of the member function:
``` C++
Expand All @@ -81,7 +81,7 @@ auto get_native(const SyclObjectT &Obj) ->
```
### 4.3 Construct a SYCL object from a Level-Zero handle

The following free functions defined in the ```cl::sycl::level_zero``` namespace allow an application to create
The following free functions defined in the ```cl::sycl::ext::oneapi::level_zero``` namespace allow an application to create
a SYCL object that encapsulates a corresponding Level-Zero object:

| Level-Zero interoperability function |Description|
Expand All @@ -103,11 +103,15 @@ some interoperability API supports overriding this behavior and keep the ownersh
Use this enumeration for explicit specification of the ownership:
``` C++
namespace sycl {
namespace ext {
namespace oneapi {
namespace level_zero {

enum class ownership { transfer, keep };

} // namespace level_zero
} // namespace oneapi
} // namespace ext
} // namespace sycl
```

Expand Down Expand Up @@ -193,3 +197,4 @@ struct free_memory {
|3|2021-04-13|James Brodman|Free Memory Query
|4|2021-07-06|Rehana Begam|Introduced explicit ownership for queue
|5|2021-07-25|Sergey Maslov|Introduced SYCL interop for events
|6|2021-08-30|Dmitry Vodopyanov|Updated according to SYCL 2020 reqs for extensions
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
= SYCL_INTEL_mem_channel_property
= SYCL_EXT_INTEL_MEM_CHANNEL_PROPERTY

== Introduction
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.
Expand Down Expand Up @@ -31,6 +31,22 @@ This extension is written against the SYCL 2020 provisional specification, Revis

The use of this extension requires a target that supports cl_intel_mem_channel_property or equivalent if OpenCL is used as the underlying device runtime.

== Feature Test Macro

This extension provides a feature-test macro as described in the core SYCL
specification section 6.3.3 "Feature test macros". Therefore, an
implementation supporting this extension must predefine the macro
`SYCL_EXT_INTEL_MEM_CHANNEL_PROPERTY` to one of the values defined in the table below.
Applications can test for the existence of this macro to determine if the
implementation supports this feature, or applications can test the macro's
value to determine which of the extension's APIs the implementation supports.

[%header,cols="1,5"]
|===
|Value |Description
|1 |Initial extension version. Base features are supported.
|===

== Overview

On some targets manual assignment of buffers to memory regions can improve memory bandwidth. This extension adds a buffer property to indicate in which memory channel a particular buffer should be allocated. This information is an optimization hint to the runtime and thus it is legal to ignore.
Expand All @@ -45,7 +61,7 @@ Add a new property to Table 4.33: Properties supported by the SYCL buffer class
[options="header"]
|===
| Property | Description
| property::buffer::mem_channel | The `mem_channel` property is a hint to the SYCL runtime that the buffer should be stored in a particular memory channel provided to the property.
| property::buffer::ext_intel_mem_channel | The `ext_intel_mem_channel` property is a hint to the SYCL runtime that the buffer should be stored in a particular memory channel provided to the property.
|===
--

Expand All @@ -55,7 +71,7 @@ Add a new constructor to Table 4.34: Constructors of the buffer property classes
[options="header"]
|===
| Constructor | Description
| property::buffer::mem_channel::mem_channel(cl_uint channel) | Constructs a SYCL `mem_channel` property instance with the specified channel ID. The range of valid values depends on the target and is implementation defined. Invalid values do not need to result in an error as the property is only a hint.
| property::buffer::ext_intel_mem_channel::ext_intel_mem_channel(cl_uint channel) | Constructs a SYCL `ext_intel_mem_channel` property instance with the specified channel ID. The range of valid values depends on the target and is implementation defined. Invalid values do not need to result in an error as the property is only a hint.
|===
--

Expand All @@ -65,7 +81,7 @@ Add a new member function to Table 4.35: Member functions of the buffer property
[options="header"]
|===
| Member function | Description
| cl_uint property::buffer::mem_channel::get_channel() const | Returns the cl_uint which was specified when constructing this SYCL `mem_channel` property.
| cl_uint property::buffer::ext_intel_mem_channel::get_channel() const | Returns the cl_uint which was specified when constructing this SYCL `ext_intel_mem_channel` property.
|===
--

Expand Down Expand Up @@ -107,4 +123,6 @@ Add an entry for the new aspect to Table 4.20: Device aspects defined by the cor
|========================================
|Rev|Date|Author|Changes
|1|2020-10-26|Joe Garvey|*Initial public draft*
|2|2021-08-30|Dmitry Vodopyanov|*Updated according to SYCL 2020 reqs for extensions*

|========================================
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
= SYCL_INTEL_usm_address_spaces
= SYCL_EXT_INTEL_USM_ADDRESS_SPACES

== Introduction
This extension introduces two new address spaces and their corresponding multi_ptr specializations.
Expand All @@ -12,7 +12,7 @@ NOTE: This document is better viewed when rendered as html with asciidoctor. Gi
This document describes an extension to the SYCL USM extension that adds new explicit address spaces for the possible locations that USM pointers can be allocated. Users can create pointers that point into these address spaces explicitly in order to pass additional information to their compiler so as to enable optimizations.

== Name Strings
+SYCL_INTEL_usm_address_spaces+
+SYCL_EXT_INTEL_USM_ADDRESS_SPACES+

== Notice
Copyright (c) 2020 Intel Corporation. All rights reserved.
Expand All @@ -36,6 +36,22 @@ This extension is written against the SYCL 1.2.1 specification, Revision 7. It

If SPIR-V is used by the implementation, this extension also requires support for the SPV_INTEL_usm_storage_classes SPIR-V extension.

== Feature Test Macro

This extension provides a feature-test macro as described in the core SYCL
specification section 6.3.3 "Feature test macros". Therefore, an
implementation supporting this extension must predefine the macro
`SYCL_EXT_INTEL_USM_ADDRESS_SPACES` to one of the values defined in the table below.
Applications can test for the existence of this macro to determine if the
implementation supports this feature, or applications can test the macro's
value to determine which of the extension's APIs the implementation supports.

[%header,cols="1,5"]
|===
|Value |Description
|1 |Initial extension version. Base features are supported.
|===

== Overview

This extension adds two new address spaces: device and host that are subsets of the global address space.
Expand Down Expand Up @@ -71,15 +87,15 @@ enum class address_space : int {
local_space,
constant_space,
private_space,
device_space,
host_space
ext_intel_global_device_space,
ext_intel_global_host_space
};
```

Add the following new conversion operator:
```c++
// Explicit conversion to global_space
// Only available if Space == address_space::device_space || Space == address_space::host_space
// Only available if Space == address_space::ext_intel_global_device_space || Space == address_space::ext_intel_global_host_space
explicit operator multi_ptr<ElementType, access::address_space::global_space>() const;
```

Expand All @@ -93,12 +109,12 @@ a|
```c++
template<typename ElementType, access::
address_space Space = access::address_space::
device_space>
ext_intel_global_device_space>
template <int dimensions, access::mode Mode>
multi_ptr(
accessor<ElementType, dimensions, Mode, access::
target::global_buffer>)
``` | Constructs a multi_ptr<ElementType, access::address_space::device_space> from an accessor of access::target::global_buffer.
``` | Constructs a multi_ptr<ElementType, access::address_space::ext_intel_global_device_space> from an accessor of access::target::global_buffer.
|===
--

Expand All @@ -107,10 +123,10 @@ device_space>
Add device_ptr and host_ptr aliases to the list of multi_ptr aliases as follows:
```c++
template<typename ElementType>
using device_ptr = multi_ptr<ElementType, access::address_space::device_space>
using device_ptr = multi_ptr<ElementType, access::address_space::ext_intel_global_device_space>

template<typename ElementType>
using host_ptr = multi_ptr<ElementType, access::address_space::host_space>
using host_ptr = multi_ptr<ElementType, access::address_space::ext_intel_global_host_space>
```

== Revision History
Expand All @@ -121,4 +137,5 @@ using host_ptr = multi_ptr<ElementType, access::address_space::host_space>
|========================================
|Rev|Date|Author|Changes
|A|2020-06-18|Joe Garvey|Initial public draft
|B|2021-08-30|Dmitry Vodopyanov|Updated according to SYCL 2020 reqs for extensions
|========================================
Loading