-
Notifications
You must be signed in to change notification settings - Fork 769
[ESIMD] Replace mask_type_t with simd_mask to represent Gen predicates. #4230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d97debd
to
0011fe1
Compare
9804242
to
73c79c6
Compare
73c79c6
to
a158e1d
Compare
Rebased and resolved conflicts. |
Jenkins/Precommit failure is because required test fixes intel/llvm-test-suite#388 are not committed yet. |
e60c397
to
e277180
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent some time reviewing it, but before going into more details, let me ask a few high-level questions:
- do we need
simd_view
fromsimd_mask
? I noticed a few places where you distinguish whether a simd_view originates fromsimd
or fromsimd_mask
- The patch is quite big. I think it may be split in the following parts:
- introduce simd_obj_impl class
- helper patch with fixes for
simd_view
class, changes intypes.hpp
, etc. - operators
- introduce simd_mask class
sycl/include/sycl/ext/intel/experimental/esimd/detail/simd_obj_impl.hpp
Outdated
Show resolved
Hide resolved
Yes, we do. The reason is that simd_view<simd_mask,...> and simd_view<simd,...> support different set of APIs.
Unfortunately, it is next to impossible to split this into separate patches where each can pass building and testing individually. |
@kbobrovs, please, resolve merge conflicts. |
@kbobrovs, please double check there is no code quality regression with this change set. Thanks. |
…dicates. Addresses the following review comment from simd.hpp: // TODO @rolandschulz, @mattkretz // Introduce simd_mask type and let user use this type instead of specific // type representation (simd<uint16_t, N>) to make it more portable // TODO @iburyl should be mask_type_t, which might become more abstracted in // the future revisions. // Gen predicates (masks) used to be represented with simd<unsigned short, N> types which made them indistinguishable from simd objects forcing the same same sets of available APIs, even though it did not always make sense. This patch introduces a new class simd_mask, which provides its own set of APIs, some of which are unique, other - repeat those from the simd class. A new class 'simd_obj_impl' is added, which encompasses functionality common between simd and simd_mask objects. Particularly, view creation APIs (select, bit_cast_view), object construction APIs - replicate, memory I/O (copy from/to) and others. simd and simd_mask both extend the simd_obj_impl class. They use the 'curiously recurring template' design pattern, when simd_obj_impl is templated by its derivative class (simd or simd_mask), to allow simd_obj_impl APIs return subclass objects rather than simd_obj_impl instances, those avoiding API duplication in subclasses. For example, 'select' functions defined in simd_obj_impl, will return a view of the derived class (subclass) rather than a view of simd_obj_impl: simd_view<Derived,...>, where Derived is the derived class. APIs unique to simd_mask: - unary logical negation '!' operator - implicit conversion to bool for single-element masks - logical binary operators '||' and '&&' APIs unique to simd: - binary arithmetic operators ('+', '-', '*', '/') and corresponding compound assignments - unary arithmetic operators ('+', '-', '++', '--') - relational operators ('<', '<=', '>', '>=') The same division of the APIs to unique sets applies to simd_view<simd<...>,...> and simd_view<simd_mask<...>,...> simd_view class specializations.
Signed-off-by: kbobrovs <[email protected]>
Fix test failures: - fix vertical stride in replicate - allow unary logical negation for simd with integer element type - use data() to access M_data in non-constructors
Signed-off-by: kbobrovs <[email protected]>
98dd06c
to
b01eb88
Compare
@kychendev, in the absence of the perf regression suite (TBD), I manually checked linear, kmeans, bitonic sort v2, mandelbrot and dgetrf - no regressions. |
Precommit failed due to old version of llvm-test-suite, fixed by intel/llvm-test-suite#388. @kychendev, @DenisBakhvalov - please approve |
…t ctor. Signed-off-by: kbobrovs <[email protected]>
@kychendev, @DenisBakhvalov - I manually checked CG (unfortunately, objcopy __CLANG_OFFLOAD_BUNDLE__sycl-spir64 somehow produces invalid .spv now - will file a separate JIRA), and caught one problem indeed - now simd<...>{val} invokes longer initializer list-based constructor as opposed to (val, stride) version. Fixed. Please take another look. |
Thanks for double checking code quality and catching the issue. Added few comments. |
Signed-off-by: kbobrovs <[email protected]>
Signed-off-by: kbobrovs <[email protected]>
…oved tests per review commetns. Signed-off-by: kbobrovs <[email protected]>
@kychendev, @DenisBakhvalov - review comments addressed. |
Jenkins failures are because the test suite is not updated yet. intel/llvm-test-suite#388 is the fix.
|
…388) * [ESIMD] Add simd_mask test, fix other tests to use simd_mask<N>. * [ESIMD] Fix unsupported simd_view += id<1> operation in test. * Fix simd mask usage in ESIMD tests, stride=1 for 1-element select. Complementary test change to intel/llvm#4230 Signed-off-by: kbobrovs <[email protected]>
* upstream/sycl: (2344 commits) [ESIMD] Rename slm_load4/slm_store4 to slm_load_rgba/slm_store_rgba (intel#4158) [SYCL] Avoid re-computing group_range in nd_item::get_group_range() (intel#4621) [clang-offload-extract] Ignore zero padding in .tgting section (intel#4622) [Driver][SYCL] Fix -fsycl-help output when redirected (intel#4619) [Driver][SYCL][FPGA] Do not unbundle aoco as an archive for hardware (intel#4477) [Driver][SYCL] Fix offload-bundler and offload-deps triples (intel#4616) [SYCL] Fix bit_cast for half type (intel#4603) [SYCL] Fix a typo in accessor::get_range method (intel#4556) [SYCL] Detach allocas from their dependencies regardless of linked alloca presence (intel#4573) [SYCL][L0] Make sure that we only query/sync host-visible events from the host. (intel#4613) Fix tests with wrong alias metadata [Driver][SYCL] Fixup arguments to llc call for PIC and code-model (intel#4614) [SYCL][L0] Add ownership control for LeveL-Zero kernel_bundle interop. (intel#4576) [SYCL][Driver] Expose llvm-foreach --jobs functionality through a driver option (intel#4543) [SYCL] Prevent stream buffer leak on constructor exception (intel#4594) [ESIMD] Replace mask_type_t with simd_mask to represent Gen predicates. (intel#4230) Fix for a bunch of fixed point integer SPIR-V instructions (intel#1213) Amend SingleElementVectorINTEL decoration use cases according to spec update (intel#1192) Enforce UserSemantic decoration if no FPGA decorations found [SYCL][CUDA] Fix context scope in kernel launch (intel#4606) ...
…ntel/llvm-test-suite#388) * [ESIMD] Add simd_mask test, fix other tests to use simd_mask<N>. * [ESIMD] Fix unsupported simd_view += id<1> operation in test. * Fix simd mask usage in ESIMD tests, stride=1 for 1-element select. Complementary test change to intel#4230 Signed-off-by: kbobrovs <[email protected]>
Addresses the following review comment from simd.hpp:
// TODO @rolandschulz, @mattkretz
// Introduce simd_mask type and let user use this type instead of specific
// type representation (simd<uint16_t, N>) to make it more portable
// TODO @iburyl should be mask_type_t, which might become more abstracted in
// the future revisions.
//
Gen predicates (masks) used to be represented with simd<unsigned short, N> types
which made them indistinguishable from simd objects forcing the same same sets
of available APIs, even though it did not always make sense. This patch
introduces a new class simd_mask, which provides its own set of APIs, some of
which are unique, other - repeat those from the simd class.
A new class 'simd_obj_impl' is added, which encompasses functionality common
between simd and simd_mask objects. Particularly, view creation APIs (select,
bit_cast_view), object construction APIs - replicate, memory I/O (copy from/to)
and others. simd and simd_mask both extend the simd_obj_impl class. They use the
'curiously recurring template' design pattern, when simd_obj_impl is templated
by its derivative class (simd or simd_mask), to allow simd_obj_impl APIs return
subclass objects rather than simd_obj_impl instances, those avoiding API
duplication in subclasses. For example, 'select' functions defined in
simd_obj_impl, will return a view of the derived class (subclass) rather than
a view of simd_obj_impl: simd_view<Derived,...>, where Derived is the derived
class.
APIs unique to simd_mask:
APIs unique to simd:
assignments
The same division of the APIs to unique sets applies to simd_view<simd<...>,...>
and simd_view<simd_mask<...>,...> simd_view class specializations.
Complementary test change: intel/llvm-test-suite#388
Signed-off-by: kbobrovs [email protected]