-
Notifications
You must be signed in to change notification settings - Fork 131
[SYCL] Extend sub-group load/store tests to cover 3-, 16-elements vectors #253
Conversation
vladimirlaz
commented
Apr 26, 2021
- added processing of vectors of 3 and 16 elements;
- increase local group size to fit 16-element vectors when sub-group size is 16;
- tune check to avoid overlapping data ranges inside and between local groups;
- increase threshold because cumulative error raises due to increase vector elements number.
SYCL/SubGroup/helper.hpp
Outdated
@@ -98,7 +109,7 @@ template <typename T2> struct utils<T2, 16> { | |||
|
|||
template <typename T> void exit_if_not_equal(T val, T ref, const char *name) { | |||
if (std::is_floating_point<T>::value) { | |||
if (std::fabs(val - ref) > 0.01) { | |||
if (std::fabs(val - ref) > 0.02) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason to increase the threshold?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason to increase the threshold?
During data verification all elements of vectors are added and this value is compared with reference one. If we increase number of added elements twice potential cumulative error is increased twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't it be better to switch to validating relative error instead of absolute one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test does not target specific accuracy goals. It checks that the return values are not something completely different from expected.
Does it make sense to invest in tuning accuracy of the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we use input data, so that results have no error at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bader, could you, please, describe a bit? I think I didn't get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During data verification all elements of vectors are added and this value is compared with reference one.
When you do FP addition, the computed result is rounded to fit into resulting data type. The error occurs only if add
result can't be exactly preserved and has to be rounded. E.g. T == float
, 2^{20} + 2^{-10}
can't be represented "exactly", but 2^{20} + 2^{21}
can. I suggest using input values, so that rounding error will be 0, so you can always use exact match.
BTW, using std::fabs
here reduces accuracy for T == double
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented bitwise comparison from floating point type except half.
Valid range for half type is too narrow and will require to rework data for several tests. This is out of scope for current PR. But aligning input data for current test allow to revert threshold increase.
The test will start to pass once intel/llvm#3617 submitted |
…f skipping the whole test when CPU device is available on host machine
SYCL/SubGroup/helper.hpp
Outdated
@@ -98,7 +109,7 @@ template <typename T2> struct utils<T2, 16> { | |||
|
|||
template <typename T> void exit_if_not_equal(T val, T ref, const char *name) { | |||
if (std::is_floating_point<T>::value) { | |||
if (std::fabs(val - ref) > 0.01) { | |||
if (std::fabs(val - ref) > 0.02) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't it be better to switch to validating relative error instead of absolute one?
- when 3-element vector is passed it is packed as single element and 2 element vector; - when 16-element vector is passed it is packed as 2 sequential 8-element vectors. The test has changed in scope of intel/llvm-test-suite#253
…tors (intel/llvm-test-suite#253) - Add processing of vectors of 3 and 16 elements. - Increase local group size to fit 16-element vectors when sub-group size is 16. - Tune check to avoid overlapping data ranges inside and between local groups. - Implemented bitwise comparison for floating point types except half. Valid range for half type is too narrow and will require to rework data for several tests. This is out of scope for current PR. .