[SYCL][CUDA] bfloat16 in oneapi namespace and also supporting CUDA #5393


Merged: 14 commits merged into intel:sycl on Apr 5, 2022

Conversation

JackAKirk (Contributor) commented Jan 25, 2022

I have tested these changes on the CUDA backend using an intel->oneapi namespace version of this test:
https://github.com/intel/llvm-test-suite/blob/intel/SYCL/BFloat16/bfloat16_type.cpp

There is a bug in the verify_logic function in the bfloat16_type.cpp test (the C accessor is not written to); I'm not sure how this did not already lead to a failure. With the bug fixed, the test passes for the CUDA backend with this patch. I've added a draft test file that also increases the coverage to test the unary minus operator: intel/llvm-test-suite#889.

Note that the unary neg intrinsic added here that is used in unary minus will be pulled down from upstream via e.g. https://reviews.llvm.org/D117887.
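For illustration, a minimal sketch of the kind of unary-minus coverage the draft test adds (the header, namespace spelling, and exact check below are assumptions; the real test is in intel/llvm-test-suite#889):

    // Minimal sketch, not the actual test from intel/llvm-test-suite#889.
    #include <sycl/sycl.hpp>

    int main() {
      using sycl::ext::oneapi::experimental::bfloat16;
      sycl::queue q;
      float out = 0.f;
      {
        sycl::buffer<float, 1> buf(&out, sycl::range<1>(1));
        q.submit([&](sycl::handler &cgh) {
          auto acc = buf.get_access<sycl::access::mode::write>(cgh);
          cgh.single_task([=] {
            bfloat16 b = 2.5f;  // float -> bfloat16 conversion
            acc[0] = float(-b); // unary minus, then bfloat16 -> float
          });
        });
      } // buffer destruction copies the result back to 'out'
      return out == -2.5f ? 0 : 1; // -2.5f is exactly representable in bfloat16
    }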

Is this an acceptable direction for generalizing this extension to support other vendors' backends?

Added bfloat16 in oneapi experimental namespace.

Signed-off-by: jack.kirk <[email protected]>
@JackAKirk JackAKirk requested a review from dkhaldi January 26, 2022 17:46
@JackAKirk JackAKirk marked this pull request as ready for review February 17, 2022 10:28
@JackAKirk JackAKirk requested a review from a team as a code owner February 17, 2022 10:28
JackAKirk (Contributor, Author) commented Feb 17, 2022

@AlexeySotkin

BTW, this PR is ready for review. I had left it as a draft only because there are some small outstanding issues that prevent it from being completely finished, but these should be minor. What we really want to know is whether this is an acceptable direction for generalizing this extension to support other vendors' backends. It seems almost trivial to allow other backends to use this extension (the changes I made are very small).

One issue: we need to know whether the aspect from #4266 is eventually going to be added, so that we know whether we also need to support it in the CUDA backend.

You may also want to take note of the bug that I found, as described in my first message above.

I am aware that I have not yet added support to the CUDA version of the intel/llvm device test, because it will require a different compilation invocation for the CUDA case. Again, this is a small issue that can be addressed later.
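For reference, the difference is roughly just the target triple on the compile line (hypothetical invocations; the actual RUN lines in the test suite may differ):

    clang++ -fsycl bfloat16_type.cpp -o bfloat16_type
    clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda bfloat16_type.cpp -o bfloat16_type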

@@ -14,7 +14,7 @@
 __SYCL_INLINE_NAMESPACE(cl) {
 namespace sycl {
 namespace ext {
-namespace intel {
+namespace oneapi {
 namespace experimental {
 
 class [[sycl_detail::uses_aspects(ext_intel_bf16_conversion)]] bfloat16 {
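In user code, the visible effect of this hunk is just the namespace move; the alias below is illustrative, not part of the patch:

    // Before this PR: sycl::ext::intel::experimental::bfloat16
    // After this PR:
    using bfloat16 = sycl::ext::oneapi::experimental::bfloat16;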
Contributor commented on the diff:

This should be wrapped with __SYCL_DEVICE_ONLY__ defined, for it to be built successfully by other compilers, right? (see #5594)
@AlexeySotkin, are you planning to add this as a separate PR?
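A sketch of what such a guard could look like (the helper macro name here is hypothetical; #5594 has the actual approach):

    // Hypothetical guard macro; see #5594 for the real patch.
    #ifdef __SYCL_DEVICE_ONLY__
    #define __SYCL_BF16_USES_ASPECTS \
      [[sycl_detail::uses_aspects(ext_intel_bf16_conversion)]]
    #else
    #define __SYCL_BF16_USES_ASPECTS
    #endif

    class __SYCL_BF16_USES_ASPECTS bfloat16 {
      // ...
    };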

@JackAKirk JackAKirk requested review from a team as code owners March 4, 2022 15:42
@JackAKirk JackAKirk requested a review from a team as a code owner March 7, 2022 16:48
JackAKirk (Contributor, Author) commented Mar 7, 2022

Added the updated design doc, switching from the Intel vendor-specific extension to a oneapi extension. We are aware there is another open PR for this design doc: #5248.
It has also been suggested (#5645 (comment)) that we should merge the extension from #5645 with sycl_oneapi_bf16_conversion.asciidoc. I think that Codeplay agrees with this.

JackAKirk (Contributor, Author) commented Mar 16, 2022

Added the updated design doc, switching from the Intel vendor-specific extension to a oneapi extension. We are aware there is another open PR for this design doc: #5248. It has also been suggested (#5645 (comment)) that we should merge the extension from #5645 with sycl_oneapi_bf16_conversion.asciidoc. I think that Codeplay agrees with this.

Update:

As discussed, the plan is to have a single bfloat16 oneapi experimental extension. I have therefore renamed the existing intel experimental extension document to "sycl_ext_oneapi_bfloat16.asciidoc" and made some small naming changes. If preferred, I can (temporarily) switch this document back to its original name, since that could help with reviewing the (small number of) changes in the document.

Ideally this PR would be merged first; other open PRs that add to the design document/bfloat16 functionality, such as #5645 and #5248, would then follow once they have been updated to reflect the new extension name etc.

This PR can be tested with a draft test-suite PR that is up to date for the purpose of this functional testing: intel/llvm-test-suite#889

gmlueck (Contributor) left a comment

If preferred, I can (temporarily) switch this document back to its original name, since that could help with reviewing the (small number of) changes in the document.

No need. I checked out the PR and did the diff manually.

I think this PR needs to remove the existing file sycl_ext_intel_bf16_conversion.asciidoc, correct?

 namespace experimental {
 
-class [[sycl_detail::uses_aspects(ext_intel_bf16_conversion)]] bfloat16 {
+class bfloat16 {
Contributor commented:

Just curious, did you remove this C++ attribute intentionally?

Removing it is probably not a big problem, because we don't yet have support in the rest of the toolchain to take advantage of this attribute. If there's no reason to remove it, though, it seems better to keep it.

JackAKirk (Contributor, Author) replied:

Yes, it was intentional. I think it will be replaced by the oneapi aspect in #5720.

The only reason for deleting it is that the CUDA backend does not have this ext_intel_bf16_conversion aspect (which, as I understand it, will be replaced by the oneapi aspect in general). However, I can try adding it back and see whether the CUDA backend bfloat16 test passes with it included, since I'm not sure how it will behave. Shall I do this? If the CUDA backend test passes, this line can remain unchanged, if that is preferred.

Contributor replied:

Oh, I see. I thought the aspect was renamed in this PR, but I see now that it is not.

When #5720 is merged, we should add this C++ attribute back and use the new aspect. I think you could probably keep the attribute in this PR with the old aspect name, and it would probably not hurt CUDA. The attribute will cause the front-end to put some metadata on the IR, but I think all IR passes ignore that metadata right now.
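As a sketch, re-adding the attribute with the new aspect might eventually look like this (the oneapi aspect name below is an assumption pending #5720):

    class [[sycl_detail::uses_aspects(ext_oneapi_bfloat16)]] bfloat16 {
      // ...
    };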

JackAKirk (Contributor, Author) replied:

Oh, I see. I thought the aspect was renamed in this PR, but I see now that it is not.

When #5720 is merged, we should add this C++ attribute back and use the new aspect.

So is it OK to leave it as it is? I will remember to add it back when the new aspect is available.

I think you could probably keep the attribute in this PR with the old aspect name, and it would probably not hurt CUDA. The attribute will cause the front-end to put some metadata on the IR, but I think all IR passes ignore that metadata right now.

Or should I add it back now with the old aspect name?

Contributor replied:

Either way is OK.

JackAKirk (Contributor, Author) commented:

If preferred, I can (temporarily) switch this document back to its original name, since that could help with reviewing the (small number of) changes in the document.

No need. I checked out the PR and did the diff manually.

I think this PR needs to remove the existing file sycl_ext_intel_bf16_conversion.asciidoc, correct?

Yes, thanks for pointing this out. I've removed the old doc now.

@JackAKirk JackAKirk requested review from dkhaldi and gmlueck and removed request for AlexeySotkin March 23, 2022 15:28
gmlueck previously approved these changes Mar 23, 2022
gmlueck (Contributor) left a comment

spec changes are OK

JackAKirk (Contributor, Author) commented:

/verify with intel/llvm-test-suite#889

JackAKirk (Contributor, Author) commented:

/verify with intel/llvm-test-suite#889

Hi @bader

RE: #5849 (comment)

I've tried this here but it still seems to be testing using the old version of bfloat16_type.cpp rather than the new version in my llvm-test-suite test. Did I do something incorrectly?

Thanks

bader (Contributor) commented Mar 31, 2022

/verify with intel/llvm-test-suite#889

Hi @bader

RE: #5849 (comment)

I've tried this here but it still seems to be testing using the old version of bfloat16_type.cpp rather than the new version in my llvm-test-suite test. Did I do something incorrectly?

Thanks

@JackAKirk, I think you are looking at the wrong logs.
http://pre-commit.llvm.intel.com:8013/file_review?path_name=/llvm_test_suite_associate/PR5393/461ddb8206 <- these seem to be the logs you are looking for. All tests pass on CUDA.

JackAKirk (Contributor, Author) commented:

/verify with intel/llvm-test-suite#889

Hi @bader
RE: #5849 (comment)
I've tried this here but it still seems to be testing using the old version of bfloat16_type.cpp rather than the new version in my llvm-test-suite test. Did I do something incorrectly?
Thanks

@JackAKirk, I think you are looking at the wrong logs. http://pre-commit.llvm.intel.com:8013/file_review?path_name=/llvm_test_suite_associate/PR5393/461ddb8206 <- these seem to be the logs you are looking for. All tests pass on CUDA.

Oh yes, I see that the failing tests at the bottom of this page must be from the existing bfloat16_type.cpp test, which is expected to fail without the patch from intel/llvm-test-suite#889.

In that case, I think this PR can be considered for merge. @alexbatashev, would it be possible to get a review?

alexbatashev (Contributor) left a comment

LGTM

@JackAKirk JackAKirk requested a review from gmlueck March 31, 2022 18:15
gmlueck (Contributor) left a comment

spec changes still look OK.

JackAKirk (Contributor, Author) commented:

/verify with intel/llvm-test-suite#889

@bader bader merged commit 5231fe4 into intel:sycl Apr 5, 2022
pvchupin pushed a commit that referenced this pull request Jun 8, 2022
This SYCL 2020 extension proposal adds bfloat16 support to the fma, fmin, fmax, and fabs SYCL floating-point math functions.

Blocked by #5393

Co-authored-by: JackAKirk <[email protected]>
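For context, usage of the proposed functions would look roughly like this (a sketch following the proposal's direction; the namespace and overload set are assumptions, not a shipped API):

    using sycl::ext::oneapi::experimental::bfloat16;
    namespace sexp = sycl::ext::oneapi::experimental;

    bfloat16 a{1.5f}, b{2.0f}, c{0.5f};
    bfloat16 r1 = sexp::fma(a, b, c); // fused multiply-add: a * b + c
    bfloat16 r2 = sexp::fmin(a, b);   // minimum
    bfloat16 r3 = sexp::fmax(a, b);   // maximum
    bfloat16 r4 = sexp::fabs(-a);     // absolute value (also exercises unary minus)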