[SYCL][CUDA] tf32 matrix MAD impl using uint32_t #5709

Closed
JackAKirk wants to merge 6 commits

JackAKirk commented Mar 2, 2022

CUDA backend implementation of the tf32 matrix multiply-add (MAD) using the underlying 32-bit type (uint32_t), fully consistent with the existing matrix extension.

Integration test added here: intel/llvm-test-suite#881

buffer<uint32_t, 1> bufB(B, range<1>(K * N));
buffer<float, 1> bufC(C, range<1>(M * N));
buffer<float, 1> bufD(D, range<1>(M * N));

Contributor

Can you add a complete example in test/matrix that shows the necessary "manual" conversion functions: from float to fp19 (stored as uint32) during initialization, and from fp19 back to float during accumulation and verification?

Contributor Author

Yes, it's here: intel/llvm-test-suite#881
For the float to fp19 conversion:

uint32_t make_tf32(float const &x);

For the fp19 to float conversion:

float tf32_to_fp32(uint32_t x);

(I'll rename both to e.g. make_fp19)
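
For readers without the test suite at hand, here is a minimal sketch of what such a conversion pair could look like. It is not the code from intel/llvm-test-suite#881; it only reuses the two signatures quoted above. It assumes tf32 keeps the sign bit, the 8 exponent bits, and the top 10 mantissa bits of an IEEE float, so the low 13 bits of the 32-bit word are cleared. Rounding to nearest with ties away from zero is an assumption of this sketch, as is leaving Inf/NaN bit patterns untouched.

```cpp
#include <cstdint>
#include <cstring>

// Sketch only: convert a float to a tf32 bit pattern stored in a uint32_t.
// Assumed layout: 1 sign bit, 8 exponent bits, 10 mantissa bits,
// with the low 13 bits of the word set to zero.
inline uint32_t make_tf32(float const &x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits)); // well-defined bit reinterpretation

  // Inf and NaN (exponent field all ones) are passed through unchanged so
  // that masking cannot turn a NaN payload into Inf (assumption of the sketch).
  if ((bits & 0x7F800000u) == 0x7F800000u)
    return bits;

  bits += 0x1000u;           // add half of the lowest kept mantissa bit
  return bits & 0xFFFFE000u; // drop the 13 low mantissa bits
}

// Sketch only: a tf32 bit pattern is already a valid float bit pattern,
// so converting back is a plain reinterpretation.
inline float tf32_to_fp32(uint32_t x) {
  float f;
  std::memcpy(&f, &x, sizeof(f));
  return f;
}
```

A host array would then be initialized with something like `A[i] = make_tf32(value)`, and `tf32_to_fp32(A[i])` would be used wherever the reference result is computed.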

// number of rows of a.
constexpr int K = 8; // number of cols of a/number of rows of b.

uint32_t A[M * K];
Contributor

Add a comment that uint32 is used here as storage for fp19.

Contributor Author

OK
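
To illustrate the storage convention under discussion, here is a hedged sketch of a host-side reference multiply-add in which the A and B operands are `uint32_t` arrays holding tf32 bit patterns and are converted back to float during accumulation. The names `tf32_bits_to_float` and `reference_mad` are hypothetical and do not come from the PR or the test suite; the matrix sizes are passed as parameters rather than taken from the test.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a tf32 bit pattern (held in a uint32_t)
// as a float. The uint32_t is only storage; the bits are float-compatible.
inline float tf32_bits_to_float(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Hypothetical reference: D = A * B + C with row-major layout.
// A is m x k and B is k x n, both stored as tf32 bit patterns in uint32_t.
// C and D are m x n arrays of float.
inline void reference_mad(const uint32_t *A, const uint32_t *B,
                          const float *C, float *D,
                          std::size_t m, std::size_t n, std::size_t k) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = C[i * n + j]; // start from the accumulator input
      for (std::size_t l = 0; l < k; ++l) {
        // Convert the stored bit patterns back to float before multiplying.
        acc += tf32_bits_to_float(A[i * k + l]) *
               tf32_bits_to_float(B[l * n + j]);
      }
      D[i * n + j] = acc;
    }
  }
}
```

The device result could then be compared element by element against `D`, with a tolerance chosen by the test author, since the accumulation order on the device may differ from this loop.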

Contributor Author

Thanks for the comments, I've updated both tests now.

JackAKirk requested a review from dkhaldi on March 7, 2022, 13:43
Contributor

dkhaldi commented Mar 10, 2022

LGTM, but we need to start adopting the name tf32 instead of fp19.
As soon as you make the change, I will approve.

JackAKirk changed the title from "[SYCL][CUDA] fp19 matrix MAD impl using uint32_t" to "[SYCL][CUDA] tf32 matrix MAD impl using uint32_t" on Mar 11, 2022
Contributor Author

This PR is no longer necessary: the complete tf32 implementation, which can replace it, is now ready in #5870.

JackAKirk closed this on Mar 31, 2022