[SYCL][CUDA] tf32 matrix MAD impl using uint32_t #5709

JackAKirk · 2022-03-02T13:10:18Z

CUDA backend implementation of tf32 MAD using the underlying 32-bit storage type (uint32_t), fully consistent with the existing matrix extension.

Integration test added here: intel/llvm-test-suite#881

Signed-off-by: jack.kirk <[email protected]>

dkhaldi · 2022-03-03T15:17:11Z

sycl/test/check_device_code/matrix/matrix-nvptx-fp19-test.cpp

+  buffer<uint32_t, 1> bufB(B, range<1>(K * N));
+  buffer<float, 1> bufC(C, range<1>(M * N));
+  buffer<float, 1> bufD(D, range<1>(M * N));
+


Can you add a complete example in test/matrix where you show the necessary "manual" conversion function from float to fp19(uint32) during initialization and then from fp19 to float during accumulation and verification?

Yeah, it's here: intel/llvm-test-suite#881
For the float to fp19 conversion:

uint32_t make_tf32(float const &x);

For the fp19 to float:

float tf32_to_fp32(uint32_t x);

(I'll rename both to e.g. make_fp19)
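For readers without the test-suite PR open, here is a minimal sketch of what such conversion helpers could look like. The bodies of `make_tf32` and `tf32_to_fp32` are not shown in this thread, so everything below is an assumption: the names `make_tf32_sketch`, `tf32_to_fp32_sketch`, and `reference_mad` are hypothetical, and the sketch truncates the low 13 mantissa bits rather than rounding, which may differ from the real helpers. It relies only on the fact that tf32 keeps the sign bit, the 8 exponent bits, and the top 10 mantissa bits of a 32-bit float.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");

// Hypothetical stand-in for make_tf32: keep the sign, the 8 exponent bits and
// the top 10 mantissa bits of a binary32 value, and zero the low 13 bits.
// Assumption: plain truncation; the helper in the test suite may round.
inline uint32_t make_tf32_sketch(float const &x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits)); // well-defined way to read the bits
  return bits & 0xFFFFE000u;
}

// Hypothetical stand-in for tf32_to_fp32: the stored word is already a valid
// binary32 bit pattern, so it only needs to be reinterpreted as float.
inline float tf32_to_fp32_sketch(uint32_t x) {
  float f;
  std::memcpy(&f, &x, sizeof(f));
  return f;
}

// Host-side reference for D = A * B + C with row-major storage.
// A is M x K and B is K x N, both held as uint32_t; C and D are float.
inline void reference_mad(const uint32_t *A, const uint32_t *B, const float *C,
                          float *D, std::size_t M, std::size_t N,
                          std::size_t K) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = C[i * N + j];
      for (std::size_t k = 0; k < K; ++k) {
        acc += tf32_to_fp32_sketch(A[i * K + k]) *
               tf32_to_fp32_sketch(B[k * N + j]);
      }
      D[i * N + j] = acc;
    }
  }
}
```

In a test, the first helper would be applied when filling the `uint32_t` input arrays, and the reference routine would produce the values to compare against the device result.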

dkhaldi · 2022-03-03T15:17:47Z

sycl/test/check_device_code/matrix/matrix-nvptx-fp19-test.cpp

+                      // number of rows of a.
+constexpr int K = 8;  // number of cols of a/number of rows of b.
+
+uint32_t A[M * K];


Add a comment that uint32_t is used here as storage for fp19.

Thanks for the comments, I've updated both tests now.

dkhaldi · 2022-03-10T20:27:50Z

LGTM but we need to start adopting the name tf32 instead of fp19.
As soon as you make the change, I will approve.

JackAKirk · 2022-03-31T18:27:38Z

This PR is no longer necessary: the complete tf32 implementation is now ready and replaces this PR: #5870

JackAKirk added 2 commits March 2, 2022 11:45

Implemented fp19 mma using the natural storage type uint32_t.

f6cf7b8

Signed-off-by: jack.kirk <[email protected]>

format

35302b5

JackAKirk requested a review from a team as a code owner March 2, 2022 13:10

JackAKirk requested a review from v-klochkov March 2, 2022 13:10

JackAKirk mentioned this pull request Mar 2, 2022

[SYCL][CUDA] fp19 matrix mad test update intel/llvm-test-suite#881

Closed

JackAKirk requested a review from dkhaldi March 2, 2022 13:12

JackAKirk added 2 commits March 2, 2022 13:15

format

712af98

format

3530643

dkhaldi reviewed Mar 3, 2022

added comment relating uint32_t to fp19

fa67ff9

JackAKirk requested a review from dkhaldi March 7, 2022 13:43

fp19 comments ->tf32

bfc68d2

dkhaldi approved these changes Mar 10, 2022

JackAKirk changed the title ~~[SYCL][CUDA] fp19 matrix MAD impl using uint32_t~~ [SYCL][CUDA] tf32 matrix MAD impl using uint32_t Mar 11, 2022

JackAKirk closed this Mar 31, 2022
