checkValueRange is used to determine if an nd_range is compatible with
-fsycl-queries-fit-in-int, and is run as part of every kernel launch.
The previous implementation checked the size of each component of the
global range, local range, offset, and global range + offset, and also
checked the linearized version of each of these values.
The new implementation simplifies these checks, based on the following
logic:
- The linear global range size must be >= every component of the global
range.
If the linear global range fits in int, we don't need to check anything
else.
- Each value in the global range must be >= the corresponding value in
the local range.
If the global range fits in int, we don't need to check the local range.
- There is no need to check offset-related values if the offset is zero.
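The reasoning above can be sketched roughly as follows. This is an
illustrative sketch, not the actual checkValueRange code: the names
kIntMax, linearFitsInInt and fitsInInt are hypothetical, the range is
fixed at three dimensions, every global range component is assumed to be
non-zero, and the way the offset path is handled here is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Hypothetical names throughout; this is not the DPC++ implementation.
constexpr std::size_t kIntMax =
    static_cast<std::size_t>(std::numeric_limits<int>::max());

// Returns true if the product of all components fits in int.
// Assumes every component is non-zero. Comparing against kIntMax / v
// before multiplying keeps the running product from overflowing size_t.
inline bool linearFitsInInt(const std::array<std::size_t, 3> &r) {
  std::size_t linear = 1;
  for (std::size_t v : r) {
    if (v != 0 && linear > kIntMax / v)
      return false;
    linear *= v;
  }
  return true;
}

inline bool fitsInInt(const std::array<std::size_t, 3> &global,
                      const std::array<std::size_t, 3> &offset) {
  // The linear global size is >= every global component, and every
  // global component is >= the matching local component, so this single
  // check also covers the per-component and local range checks.
  if (!linearFitsInInt(global))
    return false;

  // A zero offset adds nothing to check.
  if (offset[0] == 0 && offset[1] == 0 && offset[2] == 0)
    return true;

  // Assumed handling of a non-zero offset: each offset component and the
  // linearized (global + offset) must fit in int.
  std::array<std::size_t, 3> sum{};
  for (std::size_t i = 0; i < 3; ++i) {
    if (offset[i] > kIntMax)
      return false;
    sum[i] = global[i] + offset[i]; // both operands are <= kIntMax here
  }
  return linearFitsInInt(sum);
}
```

With a 32-bit int, a 1024 x 1024 x 1024 global range with no offset
passes after a single linear check, while 2048 x 1024 x 1024 is rejected.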
The new implementation also makes use of __builtin_mul_overflow where
available. This shifts the burden of maintaining fast code for these
checks to the compiler, and allows us to benefit from aggressive
optimizations.
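As a rough illustration of the builtin, the sketch below checks whether
the product of two size_t values fits in int, with a portable fallback
for compilers that lack it. The helper name productFitsInInt is
hypothetical, and whether the real code uses this mixed-type form of
__builtin_mul_overflow is an assumption.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper; not taken from the DPC++ sources.
// Returns true if a * b is representable as int.
inline bool productFitsInInt(std::size_t a, std::size_t b) {
#if defined(__GNUC__) || defined(__clang__)
  // The builtin computes the product as if with unlimited precision and
  // reports overflow when it cannot be stored in the result type (int).
  int result;
  return !__builtin_mul_overflow(a, b, &result);
#else
  // Fallback: a * b <= limit is equivalent to b <= limit / a for a != 0.
  const std::size_t limit =
      static_cast<std::size_t>(std::numeric_limits<int>::max());
  return a == 0 || b <= limit / a;
#endif
}
```

Both branches give the same answer; the builtin simply lets the compiler
pick the cheapest overflow test for the target.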
The new implementation could be optimized further if there were a quick
way to check whether an nd_range has an offset.
---------
Signed-off-by: John Pennycook <[email protected]>
Co-authored-by: Udit Kumar Agarwal <[email protected]>