[SYCL][CUDA] bfloat16 in oneapi namespace and also supporting CUDA #5393

JackAKirk · 2022-01-25T14:24:41Z

I have tested these changes on the CUDA backend using a intel->oneapi namespace version of this test:
https://github.com/intel/llvm-test-suite/blob/intel/SYCL/BFloat16/bfloat16_type.cpp

There is a bug in the verify_logic function in the bfloat16_type.cpp test (C accessor is not written to) - I'm not sure how this did not lead to a failure already. With the bug fixed the test passes for the CUDA backend with this patch. I've added a draft test file that also increases the coverage to test unary minus operator here: intel/llvm-test-suite#889.

Note that the unary neg intrinsic added here that is used in unary minus will be pulled down from upstream via e.g. https://reviews.llvm.org/D117887.

Is this an acceptable direction for generalizing this extension to support other vendor's backends?

Added bfloat16 in oneapi experimental namespace. Signed-off-by: jack.kirk <[email protected]>

JackAKirk · 2022-02-17T10:43:04Z

@AlexeySotkin

BTW this PR is ready for review. I had only left it as draft because there are some small outstanding issues which means it cannot be completely finished, but these should be minor issues. What we really want to know is whether this an acceptable direction for generalizing this extension to support other vendor's backends? It seems to be almost trivial to allow other backends to use this extension (the changes I made are very small).

One issue is that we need to know is whether the aspect from #4266 is going to be eventually added? so we know whether we also need to support it in the CUDA backend.

You may also want to take note of the bug that I found as described in my first message here ^^.

I am aware that I did not yet add support to the CUDA version of the intel/llvm device test because it will require a different compilation instruction for the CUDA case. Again this is a small issue that can be addressed at a later point.

dkhaldi · 2022-02-17T16:45:00Z

sycl/include/sycl/ext/oneapi/experimental/bfloat16.hpp

@@ -14,7 +14,7 @@
 __SYCL_INLINE_NAMESPACE(cl) {
 namespace sycl {
 namespace ext {
-namespace intel {
+namespace oneapi {
 namespace experimental {

 class [[sycl_detail::uses_aspects(ext_intel_bf16_conversion)]] bfloat16 {


This should be wrapped with SYCL_DEVICE_ONLY defined, for it to be built successfully by other compilers, right? (see #5594)
@AlexeySotkin, are you planning for adding this as a separate PR?

…_BF16_CONVERSION.asciidoc

JackAKirk · 2022-03-07T17:17:00Z

Added updated design doc switching from Intel vendor specific to oneapi extension. We are aware there is another open PR for this design doc: #5248.
It has also been suggested (#5645 (comment)) that we should merge this extension: #5645 with sycl_oneapi_bf16_conversion.asciidoc. I think that Codeplay agrees with this.

Removed aspect reference: can be added once the ext_oneapi_bfloat16 aspect is merged.

JackAKirk · 2022-03-16T11:00:08Z

Added updated design doc switching from Intel vendor specific to oneapi extension. We are aware there is another open PR for this design doc: #5248. It has also been suggested (#5645 (comment)) that we should merge this extension: #5645 with sycl_oneapi_bf16_conversion.asciidoc. I think that Codeplay agrees with this.

Update:

As discussed the plan is to have a single bfloat16 oneapi experimental extension. As such I have renamed the existing intel experimental extension as "sycl_ext_oneapi_bfloat16.asciidoc" and made some small naming changes. If it is preferred I can (temporarily) switch this document back to its original name since that could help with reviewing the (small number of) changes in the document?

Ideally this PR would be merged first, then other open PRs that add to the design document/bfloat16 functionality such as #5645 and #5248 would follow once they have been updated to reflect the new extension name etc.

This PR can be tested with a draft test-suite PR that is up to date for the purpose of this functional testing: intel/llvm-test-suite#889

gmlueck

If it is preferred I can (temporarily) switch this document back to its original name since that could help with reviewing the (small number of) changes in the document?

No need. I checked out the PR and did the diff manually.

I think this PR needs to remove the existing file sycl_ext_intel_bf16_conversion.asciidoc, correct?

gmlueck · 2022-03-16T15:48:12Z

sycl/include/sycl/ext/oneapi/experimental/bfloat16.hpp

 namespace experimental {

-class [[sycl_detail::uses_aspects(ext_intel_bf16_conversion)]] bfloat16 {
+class bfloat16 {


Just curious, did you remove this C++ attribute intentionally?

Removing it is probably not a big problem because we don't have support in the rest of the tool chain yet to take advantage of this attribute. If there's no reason to remove it, though, it seems better to keep it.

Yes it was intentional. I think that it will be replaced by the oneapi aspect in #5720.

The only reason for deleting it is that the cuda backend does not have this ext_intel_bf16_conversion aspect (which as I understand it will be replaced by the oneapi aspect generally). However I can try adding it back and seeing if the cuda backend bfloat16 test passes with it included, since I'm not sure how it will behave. Shall I do this? If the cuda backend test passes then this line can remain unchanged if that is preferred.

Oh, I see. I thought the aspect was renamed in this PR, but I see now that it is not.

When #5720 is merged, we should add this C++ attribute back and use the new aspect. I think you could probably keep the attribute in this PR with the old aspect name, and it would probably not hurt CUDA. The attribute will cause the front-end to put some metadata on the IR, but I think all IR passes ignore that metadata right now.

Oh, I see. I thought the aspect was renamed in this PR, but I see now that it is not.

When #5720 is merged, we should add this C++ attribute back and use the new aspect.

So is it OK to leave it as it is? I would remember to add it back when the new aspect is available....

I think you could probably keep the attribute in this PR with the old aspect name, and it would probably not hurt CUDA. The attribute will cause the front-end to put some metadata on the IR, but I think all IR passes ignore that metadata right now.

Or should I add it back now with the old aspect name?

Either way is OK.

JackAKirk · 2022-03-16T15:59:11Z

If it is preferred I can (temporarily) switch this document back to its original name since that could help with reviewing the (small number of) changes in the document?

No need. I checked out the PR and did the diff manually.

I think this PR needs to remove the existing file sycl_ext_intel_bf16_conversion.asciidoc, correct?

Yes, thanks for pointing this out. I've removed the old doc now.

gmlueck

spec changes are OK

JackAKirk · 2022-03-31T10:44:58Z

/verify with intel/llvm-test-suite#889

JackAKirk · 2022-03-31T13:14:31Z

/verify with intel/llvm-test-suite#889

Hi @bader

RE: #5849 (comment)

I've tried this here but it still seems to be testing using the old version of bfloat16_type.cpp rather than the new version in my llvm-test-suite test. Did I do something incorrectly?

Thanks

bader · 2022-03-31T16:36:28Z

/verify with intel/llvm-test-suite#889

Hi @bader

RE: #5849 (comment)

I've tried this here but it still seems to be testing using the old version of bfloat16_type.cpp rather than the new version in my llvm-test-suite test. Did I do something incorrectly?

Thanks

@JackAKirk, I think you are look at wrong logs.
http://pre-commit.llvm.intel.com:8013/file_review?path_name=/llvm_test_suite_associate/PR5393/461ddb8206 <- this seems to be the logs you are looking for. All tests pass on CUDA.

JackAKirk · 2022-03-31T17:47:25Z

/verify with intel/llvm-test-suite#889

Hi @bader
RE: #5849 (comment)
I've tried this here but it still seems to be testing using the old version of bfloat16_type.cpp rather than the new version in my llvm-test-suite test. Did I do something incorrectly?
Thanks

@JackAKirk, I think you are look at wrong logs. http://pre-commit.llvm.intel.com:8013/file_review?path_name=/llvm_test_suite_associate/PR5393/461ddb8206 <- this seems to be the logs you are looking for. All tests pass on CUDA.

Oh yes, I see that the failing tests at the bottom of this page must be from from the existing bfloat16_type.cpp test, which is expected to fail without the patch from intel/llvm-test-suite#889.

In that case this I think that this PR can be considered for merge. @alexbatashev would it be possible to get a review?

alexbatashev

LGTM

gmlueck

spec changes still look OK.

JackAKirk · 2022-04-05T08:57:13Z

/verify with intel/llvm-test-suite#889

This SYCL 2020 extension proposal proposes adding bfloat16 support to the fma, fmin, fmax and fabs SYCL floating point math functions. Blocked by #5393 Co-authored-by: JackAKirk <[email protected]>

JackAKirk added 2 commits January 25, 2022 14:06

Added bfloat16 support for cuda backend.

025cf7e

Added bfloat16 in oneapi experimental namespace. Signed-off-by: jack.kirk <[email protected]>

deleted intel namespace bfloat16.

66b4e33

JackAKirk mentioned this pull request Jan 25, 2022

Update SYCL_INTEL_bf16_conversion.asciidoc #5248

Closed

JackAKirk requested a review from AlexeySotkin January 25, 2022 14:28

JackAKirk added 2 commits January 25, 2022 14:29

Format.

2d04406

Changed extension macro name.

9418f74

JackAKirk requested a review from dkhaldi January 26, 2022 17:46

JackAKirk added 2 commits February 17, 2022 10:11

Merge branch 'sycl' into bf16-cvt-ext

65fddfa

fixed test.

4d99f3f

JackAKirk marked this pull request as ready for review February 17, 2022 10:28

JackAKirk requested a review from a team as a code owner February 17, 2022 10:28

JackAKirk requested a review from alexbatashev February 17, 2022 10:28

dkhaldi reviewed Feb 17, 2022

View reviewed changes

JackAKirk mentioned this pull request Feb 23, 2022

[SYCL][Doc] math functions added to bfloat16 ext #5645

Merged

dkhaldi requested a review from MrSidims February 28, 2022 19:36

JackAKirk mentioned this pull request Mar 4, 2022

[SYCL][CUDA][BFLOAT16]Temp oneapi test file adds unary cov.. intel/llvm-test-suite#889

Merged

Used neg ptx7.0 builtin for unary minus

3982001

JackAKirk requested review from a team as code owners March 4, 2022 15:42

Replaced SYCL_EXT_INTEL_BF16_CONVERSION.asciidoc with SYCL_EXT_ONEAPI…

8d2d11f

…_BF16_CONVERSION.asciidoc

JackAKirk requested a review from a team as a code owner March 7, 2022 16:48

JackAKirk added 3 commits March 8, 2022 11:22

Merge branch 'sycl' into bf16-cvt-ext

d8bc53f

Merge branch 'sycl' into bf16-cvt-ext

2f9b7d7

Renamed extension to cover all bfloat16 funct.

8a29c44

Removed aspect reference: can be added once the ext_oneapi_bfloat16 aspect is merged.

Updated macro name

f1fba08

gmlueck reviewed Mar 16, 2022

View reviewed changes

Removed old extension doc

461ddb8

JackAKirk requested review from dkhaldi and gmlueck and removed request for AlexeySotkin March 23, 2022 15:28

gmlueck previously approved these changes Mar 23, 2022

View reviewed changes

typo

e433fbc

JackAKirk dismissed gmlueck’s stale review via e433fbc March 31, 2022 11:03

alexbatashev approved these changes Mar 31, 2022

View reviewed changes

JackAKirk requested a review from gmlueck March 31, 2022 18:15

gmlueck approved these changes Mar 31, 2022

View reviewed changes

gmlueck mentioned this pull request Mar 31, 2022

[SYCL] Add fma_relu extension #5749

Closed

bader merged commit 5231fe4 into intel:sycl Apr 5, 2022

AidanBeltonS mentioned this pull request May 16, 2022

[SYCL] Add aspect for bfloat16 #5720

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SYCL][CUDA] bfloat16 in oneapi namespace and also supporting CUDA #5393

[SYCL][CUDA] bfloat16 in oneapi namespace and also supporting CUDA #5393

JackAKirk commented Jan 25, 2022 •

edited

Loading

JackAKirk commented Feb 17, 2022 •

edited

Loading

dkhaldi Feb 17, 2022

JackAKirk commented Mar 7, 2022 •

edited

Loading

JackAKirk commented Mar 16, 2022 •

edited

Loading

gmlueck left a comment

gmlueck Mar 16, 2022

JackAKirk Mar 16, 2022

gmlueck Mar 16, 2022

JackAKirk Mar 23, 2022

gmlueck Mar 23, 2022

JackAKirk commented Mar 16, 2022

gmlueck left a comment

JackAKirk commented Mar 31, 2022

JackAKirk commented Mar 31, 2022

bader commented Mar 31, 2022

JackAKirk commented Mar 31, 2022

alexbatashev left a comment

gmlueck left a comment

JackAKirk commented Apr 5, 2022

[SYCL][CUDA] bfloat16 in oneapi namespace and also supporting CUDA #5393

[SYCL][CUDA] bfloat16 in oneapi namespace and also supporting CUDA #5393

Conversation

JackAKirk commented Jan 25, 2022 • edited Loading

JackAKirk commented Feb 17, 2022 • edited Loading

dkhaldi Feb 17, 2022

Choose a reason for hiding this comment

JackAKirk commented Mar 7, 2022 • edited Loading

JackAKirk commented Mar 16, 2022 • edited Loading

gmlueck left a comment

Choose a reason for hiding this comment

gmlueck Mar 16, 2022

Choose a reason for hiding this comment

JackAKirk Mar 16, 2022

Choose a reason for hiding this comment

gmlueck Mar 16, 2022

Choose a reason for hiding this comment

JackAKirk Mar 23, 2022

Choose a reason for hiding this comment

gmlueck Mar 23, 2022

Choose a reason for hiding this comment

JackAKirk commented Mar 16, 2022

gmlueck left a comment

Choose a reason for hiding this comment

JackAKirk commented Mar 31, 2022

JackAKirk commented Mar 31, 2022

bader commented Mar 31, 2022

JackAKirk commented Mar 31, 2022

alexbatashev left a comment

Choose a reason for hiding this comment

gmlueck left a comment

Choose a reason for hiding this comment

JackAKirk commented Apr 5, 2022

JackAKirk commented Jan 25, 2022 •

edited

Loading

JackAKirk commented Feb 17, 2022 •

edited

Loading

JackAKirk commented Mar 7, 2022 •

edited

Loading

JackAKirk commented Mar 16, 2022 •

edited

Loading