Skip to content

Commit 86efa23

Browse files
rho180steffenlarsenGarveyJoetiwaria1bowenxue-intel
authored
[SYCL][DOC] Add Extension for FPGA task_sequence (#6348)
This PR: 1. Adds a new Intel FPGA experimental SYCL extension that introduces support for `task_sequences` which provides a sub kernel task level parallelism interface. 2. Updates the fpga_kernel_interface SYCL extension to add the `stall_enable_clusters` and `stall_free_clusters` properties. These are used with task sequences as well. --------- Co-authored-by: Steffen Larsen <[email protected]> Co-authored-by: GarveyJoe <[email protected]> Co-authored-by: Abhishek Tiwari <[email protected]> Co-authored-by: Xue, Bowen <[email protected]> Co-authored-by: Adel Ejjeh <[email protected]> Co-authored-by: Ejjeh, Adel <[email protected]>
1 parent eb03091 commit 86efa23

File tree

2 files changed

+598
-13
lines changed

2 files changed

+598
-13
lines changed

sycl/doc/extensions/proposed/sycl_ext_intel_fpga_kernel_interface_properties.asciidoc

Lines changed: 62 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -32,8 +32,10 @@ https://github.com/intel/llvm/issues
3232

3333
== Contributors
3434

35+
Jessica Davies, Intel +
3536
Joe Garvey, Intel +
36-
Abhishek Tiwari, Intel
37+
Abhishek Tiwari, Intel +
38+
Bowen Xue, Intel
3739

3840
== Dependencies
3941

@@ -54,8 +56,14 @@ rely on APIs defined in this specification.*
5456
== Overview
5557

5658
This extension introduces kernel properties to specify how or when control and
57-
data signals can be passed into or out of an FPGA kernel. These properties are
58-
meaningless on non-FPGA devices and can be ignored on such devices.
59+
data signals can be passed into or out of an FPGA kernel.
60+
61+
On FPGA targets, regions of the circuit called clusters may be statically
62+
scheduled. This extension also introduces kernel properties that specify how
63+
statically-scheduled clusters should be implemented for an FPGA target.
64+
65+
These properties are meaningless on non-FPGA devices and can be ignored on such
66+
devices.
5967

6068
== Specification
6169

@@ -91,18 +99,18 @@ enum class streaming_interface_options_enum {
9199
remove_downstream_stall
92100
};
93101

94-
enum class register_map_interface_options_enum {
95-
wait_for_done_write,
96-
do_not_wait_for_done_write
97-
};
98-
99102
struct streaming_interface_key {
100103
template <streaming_interface_options_enum option>
101104
using value_t = sycl::ext::oneapi::properties::property_value<
102105
streaming_interface_key,
103106
std::integral_constant<streaming_interface_options_enum, option>>;
104107
};
105108

109+
enum class register_map_interface_options_enum {
110+
wait_for_done_write,
111+
do_not_wait_for_done_write
112+
};
113+
106114
struct register_map_interface_key {
107115
template <register_map_interface_options_enum option>
108116
using value_t = sycl::ext::oneapi::properties::property_value<
@@ -117,25 +125,33 @@ struct pipelined_key {
117125
std::integral_constant<size_t, pipeline_directive_or_initiation_interval>>;
118126
};
119127

128+
enum class fpga_cluster_options_enum : /* unspecified */ {
129+
stall_enable,
130+
stall_free
131+
};
132+
133+
struct fpga_cluster_key {
134+
template <fpga_cluster_options_enum option>
135+
using value_t = sycl::ext::oneapi::properties::property_value<
136+
fpga_cluster_key,
137+
std::integral_constant<fpga_cluster_options_enum, option>>;
138+
};
139+
120140
template <streaming_interface_options_enum option>
121141
inline constexpr streaming_interface_key::value_t<option> streaming_interface;
122-
123142
inline constexpr streaming_interface_key::value_t<
124143
streaming_interface_options_enum::accept_downstream_stall>
125144
streaming_interface_accept_downstream_stall;
126-
127145
inline constexpr streaming_interface_key::value_t<
128146
streaming_interface_options_enum::remove_downstream_stall>
129147
streaming_interface_remove_downstream_stall;
130148

131149
template <register_map_interface_options_enum option>
132150
inline constexpr register_map_interface_key::value_t<option>
133151
register_map_interface;
134-
135152
inline constexpr register_map_interface_key::value_t<
136153
register_map_interface_options_enum::wait_for_done_write>
137154
register_map_interface_wait_for_done_write;
138-
139155
inline constexpr register_map_interface_key::value_t<
140156
register_map_interface_options_enum::do_not_wait_for_done_write>
141157
register_map_interface_do_not_wait_for_done_write;
@@ -144,6 +160,17 @@ template<int pipeline_directive_or_initiation_interval>
144160
inline constexpr pipelined_key::value_t<
145161
pipeline_directive_or_initiation_interval> pipelined;
146162

163+
template<fpga_cluster_options_enum option>
164+
inline constexpr fpga_cluster_key::value_t<option> fpga_cluster;
165+
inline constexpr fpga_cluster_key::value_t<
166+
fpga_cluster_options_enum::stall_enable_clusters> stall_enable_clusters;
167+
inline constexpr fpga_cluster_key::value_t<
168+
fpga_cluster_options_enum::stall_free_clusters> stall_free_clusters;
169+
170+
template <> struct is_property_key<streaming_interface_key> : std::true_type {};
171+
template <> struct is_property_key<register_map_interface_key> : std::true_type {};
172+
template <> struct is_property_key<fpga_cluster_key> : std::true_type {};
173+
147174
} // namespace sycl::ext::intel::experimental
148175
```
149176

@@ -218,6 +245,27 @@ inline constexpr pipelined_key::value_t<
218245

219246
If the property parameter (N) is not specified, the default value is `-1`.
220247
Valid values for `N` are values greater than or equal to `-1`.
248+
249+
|`fpga_cluster`
250+
|The `fpga_cluster` property is a request for the
251+
compiler to implement statically-scheduled clusters using the specified
252+
clustering method for all clusters in the annotated kernel, when possible. The
253+
following values are supported:
254+
255+
* `stall_enable`: Directs the compiler to prefer generating an enable signal to
256+
freeze the cluster when the cluster is stalled whenever possible.
257+
258+
* `stall_free`: Directs the compiler to prefer using an exit FIFO to hold
259+
output data from clusters when the cluster is stalled whenever possible.
260+
261+
*Note*: Some clusters cannot be implemented with some cluster types so the
262+
request isn't guaranteed to be followed.
263+
264+
The following properties have been provided for convenience:
265+
`stall_enable_clusters`,
266+
`stall_free_clusters`.
267+
268+
=======
221269
|===
222270
223271
Device compilers that do not support this extension may accept and ignore these
@@ -298,4 +346,5 @@ q.single_task(KernelFunctor{a, b, c}).wait();
298346
|Rev|Date|Author|Changes
299347
|1|2022-03-01|Abhishek Tiwari|*Initial public working draft*
300348
|2|2022-12-05|Abhishek Tiwari|*Make pipelined property parameter a signed int*
301-
|========================================
349+
|3|2024-01-05|Bowen Xue|*Add fpga_cluster property*
350+
|========================================

0 commit comments

Comments
 (0)