[SYCL] Add test cases for SYCL2020 reductions #194

v-klochkov · 2021-03-25T06:03:52Z

The corresponding changes in SYCL RT:
intel/llvm#3410
intel/llvm#3481

This patch:

- adds checks for SYCL-2020 reductions with the initialize_to_identity property (intel/llvm#3410)
- enables the test reduction_nd_N_vars.cpp for level-zero (intel/llvm#3481)
- reduces some of the tests by eliminating unneeded/duplicated checks
- removes the test reduction_transparent.cpp as useless, because transparent
  operators are already checked in many other tests
- reduces the runtime of the reduction tests by about 200x on average by re-using
  a queue created once instead of creating one in each test case
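To illustrate the last item, here is a minimal sketch of the before/after shape of a test case. It is plain C++ rather than SYCL: `MockQueue` is a hypothetical stand-in for `sycl::queue` that only counts constructions, and `testSumOwnQueue` / `testSumSharedQueue` are made-up names, not functions from the test suite.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for sycl::queue; it only counts constructions.
struct MockQueue {
  static int &constructed() {
    static int Count = 0;
    return Count;
  }
  MockQueue() { ++constructed(); }
};

// Old shape: every test case creates its own queue.
int testSumOwnQueue(const std::vector<int> &In) {
  MockQueue Q; // constructed on every call
  (void)Q;
  return std::accumulate(In.begin(), In.end(), 0);
}

// New shape: the test case borrows a queue created once by the caller.
int testSumSharedQueue(MockQueue &Q, const std::vector<int> &In) {
  (void)Q;
  return std::accumulate(In.begin(), In.end(), 0);
}
```

With the second shape, `main` would create a single queue and pass it by reference to each test case, so the construction cost is paid once regardless of how many cases run.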

Signed-off-by: Vyacheslav N Klochkov [email protected]

[SYCL] Implement SYCL2020 reductions with set initialize_to_identity property (intel/llvm#3410)

The corresponding LIT tests PR: intel/llvm-test-suite#194

SYCL2020 reductions use dynamic information (instead of the static info used previously) to request a discard-write to the user's reduction variable or USM memory. That caused big changes, removing or restructuring the code that previously relied on the accessor mode (read-write vs discard-write). With this patch the SYCL-2020 reductions are on par with ONEAPI::reduction.

Signed-off-by: Vyacheslav N Klochkov <[email protected]>

Using sycl::detail::tuple is a temporary work-around for various problems caused by using std::tuple:
a) a reduction using std::tuple cannot be compiled on Windows because std::tuple cannot be copied to DEVICE;
b) an internal error in the level_zero RT.

The new sycl::detail::tuple class is a very limited version of oneDPL's implementation of tuple. It includes the following functionality:
- conversion from std::tuple and to std::tuple
- tie(), get<I>(), tuple_element, make_tuple

This change enables parallel_for() with more than one reduction for level_zero and for Windows.

The corresponding changes in LIT tests: intel/llvm-test-suite#194

Signed-off-by: Vyacheslav N Klochkov <[email protected]>
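For readers unfamiliar with the idea, the following is a rough sketch of what a very limited tuple with part of that interface could look like. It is not the actual sycl::detail::tuple or the oneDPL code: `MiniTuple`, `fromStd`, and `toStd` are hypothetical names, and the sketch assumes C++17. Because it is a plain nested aggregate, it is trivially copyable whenever its elements are; whether that is exactly the property the runtime needs is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical minimal tuple, NOT the real sycl::detail::tuple.
template <typename... Ts> struct MiniTuple;
template <> struct MiniTuple<> {};
template <typename Head, typename... Tail> struct MiniTuple<Head, Tail...> {
  Head head;
  MiniTuple<Tail...> tail;
};

// get<I>(): walk I steps down the nested aggregate.
template <std::size_t I, typename Head, typename... Tail>
constexpr auto &get(MiniTuple<Head, Tail...> &T) {
  if constexpr (I == 0)
    return T.head;
  else
    return get<I - 1>(T.tail);
}

// Conversion from std::tuple, element by element.
template <typename... Ts, std::size_t... Is>
MiniTuple<Ts...> fromStdImpl(const std::tuple<Ts...> &S,
                             std::index_sequence<Is...>) {
  MiniTuple<Ts...> R{};
  ((get<Is>(R) = std::get<Is>(S)), ...);
  return R;
}
template <typename... Ts> MiniTuple<Ts...> fromStd(const std::tuple<Ts...> &S) {
  return fromStdImpl(S, std::index_sequence_for<Ts...>{});
}

// Conversion back to std::tuple.
template <typename... Ts, std::size_t... Is>
std::tuple<Ts...> toStdImpl(MiniTuple<Ts...> &T, std::index_sequence<Is...>) {
  return std::tuple<Ts...>(get<Is>(T)...);
}
template <typename... Ts> std::tuple<Ts...> toStd(MiniTuple<Ts...> &T) {
  return toStdImpl(T, std::index_sequence_for<Ts...>{});
}

// The aggregate layout keeps the type trivially copyable for trivial elements.
static_assert(std::is_trivially_copyable<MiniTuple<int, float, double>>::value,
              "MiniTuple of trivial types should be trivially copyable");
```

A caller could build a `MiniTuple<int, double>` with `fromStd(std::make_tuple(3, 2.5))`, read or modify elements through `get<0>(T)`, and convert back with `toStd(T)`.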

v-klochkov · 2021-04-08T01:21:51Z

This change should put an end to the annoying flaky reduction test failures in CI caused by timeouts.

rdeodhar · 2021-04-08T17:50:12Z

The change to use a queue created in "main" looks good to me.

Regarding flakiness on the CPU device, could this message hold a clue for difference in behavior across machines?
"The implementation handling parallel_for with reduction requires work group size not bigger than 1"

v-klochkov · 2021-04-08T17:56:42Z

> The change to use a queue created in "main" looks good to me.
>
> Regarding flakiness on the CPU device, could this message hold a clue for difference in behavior across machines?
> "The implementation handling parallel_for with reduction requires work group size not bigger than 1"

I see different (unexpected) output results. Combined with the diagnostic you showed, this may be some problem with pointers, which I am going to debug and fix. ONEAPI::detail::reduGetMaxWGSize() is not expected to return 1 for those simple cases in reduction_nd_N_vars.cpp.

againull

Also, as a side note: it looks like there are too many different changes for one review. As far as I remember, ideal PRs are < 300 lines. And in this case it even looks like merged changes from different PRs. I don't insist on splitting it, just a note.

againull · 2021-04-08T17:52:47Z

SYCL/Reduction/reduction_nd_N_vars.cpp

@@ -100,8 +97,9 @@ int testOne(T1 IdentityVal1, T1 InitVal1, BinaryOperation1 BOp1,
    CorrectOut2 = BOp2(CorrectOut2, InitVal2);
  if (Mode3 == access::mode::read_write)
    CorrectOut3 = BOp3(CorrectOut3, InitVal3);
-  // 4th reduction is USM and this is read_write.
-  CorrectOut4 = BOp4(CorrectOut4, InitVal4);
+  // discard_write mode for USM reductions is available only SYCL2020.


I cannot match the comment to the code. I see USM but I don't see discard_write. Am I missing something?

ONEAPI::reduction, when initialized with a USM pointer, assumes read-write access to that USM memory, i.e. it adds the original value of the element pointed to by the USM pointer into the final sum. Only SYCL-2020 reductions can ignore the original value of the USM memory (i.e. do discard_write).

The test may have 'Mode4' equal to access::mode::discard_write, which is treated as 'read_write' if 'IsSYCL2020 == false'.

The comment could be this:
// ONEAPI::reduction supports only read_write access to USM memory
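
The rule described in this reply can be sketched as a small helper that computes the expected output for one reduction variable. This is plain C++ and not the test's actual code: `Mode` is a hypothetical stand-in for access::mode, and `expectedResult` is a made-up name.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for access::mode, limited to the two modes discussed.
enum class Mode { read_write, discard_write };

// Reduced: the value obtained by combining all inputs.
// InitVal: the value the reduction variable held before the kernel ran.
template <typename T, typename BinaryOp>
T expectedResult(T Reduced, T InitVal, BinaryOp BOp, Mode M, bool IsSYCL2020) {
  // discard_write is honoured only for SYCL-2020 reductions; otherwise it is
  // treated as read_write and the initial value is folded into the result.
  bool KeepInit = (M == Mode::read_write) || !IsSYCL2020;
  return KeepInit ? BOp(Reduced, InitVal) : Reduced;
}
```

For example, with `std::plus<int>`, a reduced value of 10 and an initial value of 5, the helper yields 15 for every combination except discard_write with IsSYCL2020 set, where it yields 10.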

againull · 2021-04-08T18:02:54Z

SYCL/Reduction/reduction_nd_s0_dw.cpp

@@ -1,5 +1,4 @@
 // RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple %s -o %t.out
-// RUNx: %HOST_RUN_PLACEHOLDER %t.out


Is this feature device-only? Or are we going to add support for HOST in the future? If the latter, it probably makes sense to keep this line or add a TODO so that we don't forget to enable it.

Running on host would require adding support for barrier() and the group algorithm reduce(). That is a known issue that cannot be forgotten. Also, several other tests already mention it (e.g. reduction_placeholder.cpp, reduction_queue_parallel_for.cpp).

I can restore that line if you think it is useful to have it there.

v-klochkov · 2021-04-08T18:26:27Z

> Also as a side note: it looks like there are too much different changes for one review.

Yes, I know, sorry. This is not a typical situation.
The LIT change for intel/llvm#3481 is the removal of a few lines disabling reduction_nd_N_vars.cpp for Windows and for level_zero on Linux.

The original change for intel/llvm#3410 was smaller than that, but it failed testing several times without the additional changes that reduce the tests and re-use the queue. That is why it is so big now.

againull

Don't have any major concerns, LGTM

v-klochkov mentioned this pull request Mar 25, 2021

[SYCL] Implement SYCL2020 reductions with set initialize_to_identity … intel/llvm#3410

Merged

v-klochkov force-pushed the public_vklochkov_reduction_2020 branch 2 times, most recently from a025530 to c623e5c Compare March 25, 2021 17:26

v-klochkov force-pushed the public_vklochkov_reduction_2020 branch 2 times, most recently from 2b1d0de to 5110a10 Compare April 3, 2021 20:39

v-klochkov mentioned this pull request Apr 3, 2021

[SYCL] Pull oneDPL tuple to use in reduction implementation intel/llvm#3481

Merged

v-klochkov force-pushed the public_vklochkov_reduction_2020 branch from 5110a10 to 5a203cb Compare April 3, 2021 20:45

v-klochkov force-pushed the public_vklochkov_reduction_2020 branch from 5a203cb to 53f6067 Compare April 5, 2021 16:50

v-klochkov marked this pull request as ready for review April 5, 2021 17:17

v-klochkov requested a review from vladimirlaz April 5, 2021 17:17

v-klochkov force-pushed the public_vklochkov_reduction_2020 branch 3 times, most recently from 725ad13 to 0d95b4d Compare April 6, 2021 17:22

v-klochkov added 2 commits April 7, 2021 12:59

[SYCL] Add test cases for SYCL2020 reductions + initialize_to_identit…

f96a788

…y property The corresponding changes in SYCL RT: intel/llvm#3410 intel/llvm#3481 Signed-off-by: Vyacheslav N Klochkov <[email protected]>

[SYCL] Improve reduction tests time by about 200x by re-using queue

f53c4a4

Signed-off-by: Vyacheslav N Klochkov <[email protected]>

v-klochkov force-pushed the public_vklochkov_reduction_2020 branch from 0d95b4d to f53c4a4 Compare April 7, 2021 23:16

v-klochkov added 2 commits April 7, 2021 17:08

clang-format + disabled reduction_nd_N_vars.cpp on CPU as flaky

6ecb6aa

Signed-off-by: Vyacheslav N Klochkov <[email protected]>

Merge remote-tracking branch 'intel_llvm/intel' into public_vklochkov…

7207735

…_reduction_2020

v-klochkov requested review from rdeodhar and againull April 8, 2021 00:14

againull reviewed Apr 8, 2021

againull approved these changes Apr 8, 2021

v-klochkov merged commit c38e874 into intel:intel Apr 8, 2021

v-klochkov deleted the public_vklochkov_reduction_2020 branch April 8, 2021 18:51
