
Commit c76ef5c

JackAKirk authored
[SYCL][Doc] math functions added to bfloat16 ext (#5645)
This SYCL 2020 extension proposal adds bfloat16 support to the fma, fmin, fmax and fabs SYCL floating point math functions.

Blocked by #5393

Co-authored-by: JackAKirk <[email protected]>
1 parent 4296bb0 commit c76ef5c

1 file changed: +78 -3 lines changed

sycl/doc/extensions/experimental/sycl_ext_oneapi_bfloat16.asciidoc

+78 -3
@@ -48,7 +48,7 @@ products.

 == Version

-Revision: 4
+Revision: 5

 == Introduction

@@ -103,7 +103,7 @@ If the device doesn't have the aspect, objects of `bfloat16` class must not be
 used in the device code.

 **NOTE**: The `ext_oneapi_bfloat16` aspect is not yet supported. The
-`bfloat16` class is currently supported only on Xe HP GPU and Nvidia A100 GPU.
+`bfloat16` class is currently supported only on Xe HP GPU and Nvidia GPUs with Compute Capability >= SM80.

 == New `bfloat16` class

@@ -316,9 +316,83 @@ int main (int argc, char *argv[]) {
 }
 ----

+== New bfloat16 math functions
+
+Many applications will require dedicated functions that take parameters of type `bfloat16`. This extension adds `bfloat16` support to the `fma`, `fmin`, `fmax` and `fabs` SYCL floating point math functions. These functions can be used as element-wise operations on matrices, supplementing the `bfloat16` support in the sycl_ext_oneapi_matrix extension.
+
+The descriptions of the `fma`, `fmin`, `fmax` and `fabs` SYCL floating point math functions can be found in the SYCL specification: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_math_functions.
+
+The following functions are only available when `T` is `bfloat16` or `sycl::marray<bfloat16, {N}>`, where `{N}` means any positive value of `size_t` type.
+
+=== fma
+
+```c++
+namespace sycl::ext::oneapi::experimental {
+
+template <typename T>
+T fma(T a, T b, T c);
+} // namespace sycl::ext::oneapi::experimental
+```
+
+==== Description
+
+Returns the correctly rounded floating-point representation of the sum of `c` with the infinitely precise product of `a` and `b`.
+Rounding of intermediate products shall not occur. The mantissa LSB rounds to the nearest even. Subnormal numbers are supported.
+
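The rounding rules in the `fma` description can be illustrated on the host without a SYCL toolchain. The sketch below is not the extension's implementation: it assumes a `bfloat16` value is stored as the upper 16 bits of an IEEE binary32 `float`, and all helper names (`float_to_bf16_bits`, `bf16_bits_to_float`, `bf16_fma_sketch`, `bf16_mul_add_sketch`) are hypothetical. It shows round-to-nearest-even on the mantissa LSB, and why a fused operation differs from a multiply followed by an add. Because the fused helper rounds first to `float` and then to `bfloat16`, it may differ from a correctly rounded result in rare double-rounding cases.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: convert a float to a bfloat16 bit pattern using
// round-to-nearest-even. Assumes bfloat16 is the top 16 bits of binary32.
inline std::uint16_t float_to_bf16_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    if (std::isnan(f)) {
        // Set a quiet-NaN mantissa bit so truncation cannot produce infinity.
        return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
    }
    const std::uint32_t lsb = (u >> 16) & 1u;  // LSB of the mantissa that is kept
    u += 0x7FFFu + lsb;                        // exact ties go to the even neighbour
    return static_cast<std::uint16_t>(u >> 16);
}

// Hypothetical helper: widen a bfloat16 bit pattern to float (always exact).
inline float bf16_bits_to_float(std::uint16_t b) {
    const std::uint32_t u = static_cast<std::uint32_t>(b) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Fused sketch: the product is not rounded before the addition.
inline std::uint16_t bf16_fma_sketch(std::uint16_t a, std::uint16_t b, std::uint16_t c) {
    const float r = std::fma(bf16_bits_to_float(a), bf16_bits_to_float(b),
                             bf16_bits_to_float(c));
    return float_to_bf16_bits(r);
}

// Unfused sketch: the product is rounded to bfloat16 first, then added.
inline std::uint16_t bf16_mul_add_sketch(std::uint16_t a, std::uint16_t b, std::uint16_t c) {
    const float p = bf16_bits_to_float(
        float_to_bf16_bits(bf16_bits_to_float(a) * bf16_bits_to_float(b)));
    return float_to_bf16_bits(p + bf16_bits_to_float(c));
}
```

With `a = b = 0x3F81` (1 + 2^-7) and `c = 0xBF82` (-(1 + 2^-6)), the exact product is 1 + 2^-6 + 2^-14. The fused sketch keeps the 2^-14 term and returns `0x3880`, while the unfused sketch rounds it away and returns zero.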
+=== fmax
+
+```c++
+namespace sycl::ext::oneapi::experimental {
+template <typename T>
+T fmax(T x, T y);
+} // namespace sycl::ext::oneapi::experimental
+```
+
+==== Description
+
+Returns `y` if
+`x < y`, otherwise it
+returns `x`. If one argument is a
+NaN, `fmax()` returns the other
+argument. If both arguments are
+NaNs, `fmax()` returns a NaN.
+
+=== fmin
+
+```c++
+namespace sycl::ext::oneapi::experimental {
+template <typename T>
+T fmin(T x, T y);
+} // namespace sycl::ext::oneapi::experimental
+```
+
+==== Description
+
+Returns `y` if
+`y < x`, otherwise it
+returns `x`. If one argument is a
+NaN, `fmin()` returns the other
+argument. If both arguments are
+NaNs, `fmin()` returns a NaN.
+
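The NaN rules stated for `fmax` and `fmin` cannot be obtained from a bare comparison, because `x < y` is false whenever either operand is a NaN. A minimal host-side sketch of the described behaviour follows. It uses `float` as a stand-in for `bfloat16` (every `bfloat16` value converts to `float` exactly), and the names `fmax_sketch` and `fmin_sketch` are hypothetical, not part of the extension.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the described fmax semantics, using float in
// place of bfloat16.
inline float fmax_sketch(float x, float y) {
    if (std::isnan(x)) return y;  // also covers both-NaN: y is then a NaN
    if (std::isnan(y)) return x;
    return (x < y) ? y : x;
}

// Hypothetical stand-in for the described fmin semantics.
inline float fmin_sketch(float x, float y) {
    if (std::isnan(x)) return y;  // also covers both-NaN: y is then a NaN
    if (std::isnan(y)) return x;
    return (y < x) ? y : x;
}
```

Without the two `isnan` checks, `fmax_sketch(NaN, 3.0f)` would fall through to the ternary and return `x`, which is the NaN rather than the other argument.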
+=== fabs
+
+```c++
+namespace sycl::ext::oneapi::experimental {
+template <typename T>
+T fabs(T x);
+} // namespace sycl::ext::oneapi::experimental
+```
+
+==== Description
+
+Computes the absolute value of a `bfloat16`.
+
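For a sign-magnitude format such as `bfloat16` (1 sign bit, 8 exponent bits, 7 mantissa bits), the absolute value amounts to clearing the sign bit. The sketch below works on the raw 16-bit pattern and assumes the sign occupies bit 15. The name `bf16_fabs_bits` is hypothetical, and this is an illustration rather than how any backend necessarily implements `fabs`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: absolute value of a bfloat16 given as raw bits.
// Clearing bit 15 (the sign) leaves exponent and mantissa untouched.
inline std::uint16_t bf16_fabs_bits(std::uint16_t bits) {
    return static_cast<std::uint16_t>(bits & 0x7FFFu);
}
```

For example, `0xBF80` (-1.0) maps to `0x3F80` (1.0), and negative zero `0x8000` maps to positive zero `0x0000`.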
 == Issues

-None.
+1. The CUDA backend does not have a use case that would necessitate support of the `vec` class in bfloat16 math functions, and `marray` would always be preferred over `vec` if `vec` support were to be added in the CUDA backend. For portability reasons, support for the `vec` class can be easily added if other backends require it.
+
+2. We should decide on a roadmap to extend support of `bfloat16` to other SYCL 2020 math functions.

 == Revision History

@@ -333,4 +407,5 @@ None.
 Apply code review suggestions
 |3|2021-08-18|Alexey Sotkin |Remove `uint16_t` constructor
 |4|2022-03-07|Aidan Belton and Jack Kirk |Switch from Intel vendor specific to oneapi
+|5|2022-04-05|Jack Kirk | Added section for bfloat16 math builtins
 |========================================
