[SYCL][Doc] math functions added to bfloat16 ext #5645

Merged
merged 9 commits into from
Jun 8, 2022

Conversation

JackAKirk
Contributor

@JackAKirk JackAKirk commented Feb 23, 2022

This SYCL 2020 extension proposal adds bfloat16 support to the fma, fmin, fmax and fabs SYCL floating point math functions.

cc @dkhaldi @AlexeySotkin

Would this extension proposal be of interest to you?

Blocked by #5393

@JackAKirk JackAKirk requested a review from a team as a code owner February 23, 2022 14:30
@gmlueck
Contributor

gmlueck commented Feb 23, 2022

Rather than creating a separate extension, would it make sense to combine this with SYCL_INTEL_bf16_conversion, and create a new experimental extension named "sycl_ext_intel_bfloat16" (or "sycl_ext_oneapi_bfloat16")? It seems like it would be easier for applications if all this bfloat16 functionality was tied to a single feature-test macro.

We also need to decide if these are "intel" or "oneapi" extensions. It makes no sense to have a "oneapi" extension layered on top of an "intel" extension.

@JackAKirk
Contributor Author

JackAKirk commented Feb 23, 2022

> Rather than creating a separate extension, would it make sense to combine this with SYCL_INTEL_bf16_conversion, and create a new experimental extension named "sycl_ext_intel_bfloat16" (or "sycl_ext_oneapi_bfloat16")? It seems like it would be easier for applications if all this bfloat16 functionality was tied to a single feature-test macro.
>
> We also need to decide if these are "intel" or "oneapi" extensions. It makes no sense to have a "oneapi" extension layered on top of an "intel" extension.

Yes, I think it would make sense to have a single bf16 extension, as you suggest. However, since the current one is marked INTEL, I thought it made sense to add a separate proposal here for the time being.

I think there only needs to be a single "oneapi" extension, rather than both an "intel" and a "oneapi" extension, and I already made an example PR that would switch the existing "intel" extension to support other backends, renaming it "oneapi": #5393. In this proposal document I referred to the INTEL-named "SYCL_INTEL_bf16_conversion" rather than "SYCL_ONEAPI_bf16_conversion" simply because this is its current name, and I have not yet received an answer as to whether switching it to the "oneapi" namespace would be acceptable.

@@ -0,0 +1,105 @@
# Bfloat16 math functions extension for DPC++: = SYCL_ONEAPI_bf16_math
Contributor

This does not format well in HTML. Maybe there is some sort of typo here?

Contributor Author

The equals sign can be used similarly to # for headings, but not in the middle of a sentence. I removed the =, thanks.

// genbfloat16 fma (genbfloat16 a, genbfloat16 b, genbfloat16 c)
template <typename T>
detail::enable_if_t<detail::is_genbfloat16<T>::value, T> fma(T a, T b, T c);
Contributor

We do not usually expose internal details from the detail namespace in these API specifications. When an API is constrained to certain types, we usually say something like this:

// Available only when "T" is one of the genbfloat16 types.
T fma(T a, T b, T c);


## Introduction

This document proposes extending the `fma`, `fmin`, `fmax` and `fabs` SYCL floating point math functions to support the `bfloat16` type introduced in the `SYCL_EXT_INTEL_BF16_CONVERSION` extension. This proposal assumes that devices supporting the `SYCL_EXT_INTEL_BF16_CONVERSION` extension have the scalar data type `bfloat16` and the vector data types `bfloat16_1`, `bfloat16_2`, `bfloat16_3`, `bfloat16_4`, `bfloat16_8` and `bfloat16_16` available at compile time, in line with the corresponding `half` types that are available at compile time on devices that
Contributor

I do not think we currently have an extension that defines the `bfloat16_1`, `bfloat16_2`, `bfloat16_3`, `bfloat16_4`, `bfloat16_8` and `bfloat16_16` types. Does this extension define those types, or are you assuming they come from some other extension?

If we decide to combine this extension with "SYCL_INTEL_bf16_conversion", then it would make sense for that combined extension to define all of these types.

Regarding the aspect that relates to these types, we already define `aspect::ext_intel_bf16_conversion`. It would probably make sense to replace that with a more general aspect named `aspect::ext_oneapi_bfloat16`.

Contributor Author

I was told that I should initially propose support for vector types equivalent to those currently supported in SYCL 2020 for the `half` case: this would mean that e.g. `bfloat16_2` is a vector of two `bfloat16`. If we continue with the plan of implementing the corresponding vector types, and these bfloat16 vector types are not defined in another extension, then I imagine we would define them in this extension; but as you say, it probably makes sense to have a single bfloat16 extension. At this point we mainly need to know whether Intel also plans to support these functions with bfloat16 in their backends, or whether this extension would only apply to the CUDA backend at this moment in time. Beyond that, any indication of whether Intel would like to support bfloat16 vector types for these functions would also be welcome.

I have asked whether the aspect::ext_intel_bf16_conversion could be implemented as a backend agnostic aspect here: #5393 (comment)

@JackAKirk
Contributor Author

@gmlueck

Thanks for the comments. I've added the unified-extension and vector-types issues that you raised to the issues for discussion/resolution section. I've also discussed the backend-agnostic aspect introduced in #5720 in relation to this extension and the extension unification issue.

@gmlueck
Contributor

gmlueck commented Mar 3, 2022

I think you need to decide which bfloat16_N types are supported before this can be implemented, and you need to decide which extension provides those types. Is this extension only "proposed" at this time (meaning it is not implemented yet and could change), or is it "experimental" (meaning it is already implemented)? If it is only "proposed", the PR should be changed to put the file in that directory, and it is OK if these issues are not yet resolved.

We recently adopted a template to use for all new extensions. Can you reformat this extension using that template? See also the instructions in README-process

@JackAKirk
Contributor Author

JackAKirk commented Mar 7, 2022

> I think you need to decide which bfloat16_N types are supported before this can be implemented, and you need to decide which extension provides those types. Is this extension only "proposed" at this time (meaning it is not implemented yet and could change), or is it "experimental" (meaning it is already implemented)? If it is only "proposed", the PR should be changed to put the file in that directory, and it is OK if these issues are not yet resolved.

This extension has an initial implementation here: #5748. In the same way that the matrix extension has temporary implementations using the uint16_t storage type of bfloat16 instead of bfloat16, this extension has temporary implementations of bfloat16 and bfloat16x2 using uint16_t and uint32_t respectively. The same applies to the bfloat16 part of the "fma_relu" extension: #5749. We think fma_relu deserves a separate extension since it also covers the half data type cases.
As mentioned in the issues section of this design document, the bfloat16_N types that should be supported need to be carefully considered. A first step is to switch the bfloat16 class extension from Intel vendor specific to oneapi, via #5393 or otherwise.

> We recently adopted a template to use for all new extensions. Can you reformat this extension using that template? See also the instructions in README-process

I'd prefer to first find consensus with Intel on whether this should be a separate extension or part of sycl_ext_*_bf16_conversion. If this is going to be a separate extension, such that a separate design doc will be merged, then I will correct the formatting.

@JackAKirk
Contributor Author

> I think you need to decide which bfloat16_N types are supported before this can be implemented, and you need to decide which extension provides those types. Is this extension only "proposed" at this time (meaning it is not implemented yet and could change), or is it "experimental" (meaning it is already implemented)? If it is only "proposed", the PR should be changed to put the file in that directory, and it is OK if these issues are not yet resolved.
>
> This extension has an initial implementation here: #5748. In the same way that the matrix extension has temporary implementations using the uint16_t storage type of bfloat16 instead of bfloat16, this extension has temporary implementations of bfloat16 and bfloat16x2 using uint16_t and uint32_t respectively. The same applies to the bfloat16 part of the "fma_relu" extension: #5749. We think fma_relu deserves a separate extension since it also covers the half data type cases. As mentioned in the issues section of this design document, the bfloat16_N types that should be supported need to be carefully considered. A first step is to switch the bfloat16 class extension from Intel vendor specific to oneapi, via #5393 or otherwise.
>
> We recently adopted a template to use for all new extensions. Can you reformat this extension using that template? See also the instructions in README-process
>
> I'd prefer to first find consensus with Intel on whether this should be a separate extension or part of sycl_ext_*_bf16_conversion. If this is going to be a separate extension, such that a separate design doc will be merged, then I will correct the formatting.

As discussed, it has now been decided to have a single bfloat16 oneapi experimental extension. As such, the contents of the design document in this PR will be updated and merged into the bfloat16 oneapi experimental extension document once #5393 is merged.

@JackAKirk JackAKirk marked this pull request as draft March 16, 2022 11:03
@MrSidims MrSidims self-requested a review March 16, 2022 11:08
@JackAKirk JackAKirk changed the title [SYCL][Doc] bfloat16 math functions extension proposal [SYCL][Doc] math functions added to bfloat16 ext Apr 15, 2022
@JackAKirk JackAKirk marked this pull request as ready for review April 15, 2022 12:09
@JackAKirk
Contributor Author

Hi @gmlueck @dkhaldi

This PR is now ready for review, along with the implementation in #5964. I've added descriptions of the new functions to the extension document and updated the "Issues" section. The function descriptions match the SYCL 2020 descriptions of the corresponding floating point math functions exactly, except for fma, since bfloat16 is not covered by the IEEE 754-2008 standard as far as I am aware. I've added test cases for the correct NaN behaviour, matching the descriptions of fmin and fmax, to intel/llvm-test-suite#975.

Contributor

@gmlueck gmlueck left a comment

LGTM

@rdeodhar as FYI

@pvchupin pvchupin merged commit c76ef5c into intel:sycl Jun 8, 2022