This repository was archived by the owner on Dec 22, 2021. It is now read-only.
Strong support, I'm adding this to Highway as well. It would be much harder for users to emulate this, especially if we do not add sign select nor i64 gt_s.
Adding a preliminary vote for the inclusion of i64x2.abs operation to the SIMD proposal below. Please vote with - 👍 For including i64x2.abs
I do have an issue with examples here - they seem to be all wrapper libraries. It isn't surprising that wrapper libraries would have all sorts of operations, but this isn't the same as an app somebody could run.
Fixed a bug in suggested lowering on SSE2 and ARM NEON (thanks @ngzhian for reporting).
ngzhian added commits to ngzhian/simd that referenced this pull request on Feb 9, 2021 and Feb 10, 2021.
This was merged in WebAssembly#413.
Introduction
This is a proposal to add a 64-bit variant of the existing abs instruction. ARM64 and x86 with AVX512 natively support this instruction, and on earlier instruction sets it can be emulated with 3-5 instructions.
Applications
Mapping to Common Instruction Sets
This section illustrates how the new WebAssembly instruction can be lowered on common instruction sets. These patterns are provided only for convenience; compliant WebAssembly implementations do not have to follow the same code generation patterns.
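As a reference point for the lowerings in this section, the lane-wise semantics can be sketched in portable C. This is only an illustration, not part of the proposal text: the i64x2 struct and the i64x2_abs_ref name are hypothetical, and the sketch assumes i64x2.abs wraps on overflow like the existing abs instructions, so the most negative value maps to itself.

```c
#include <stdint.h>

/* Hypothetical 2-lane vector type, used only for illustration. */
typedef struct { int64_t lane[2]; } i64x2;

/* Lane-wise absolute value. Assumption: wrapping semantics, so
 * INT64_MIN maps to INT64_MIN, as the native instructions do. */
static i64x2 i64x2_abs_ref(i64x2 x) {
    i64x2 y;
    for (int i = 0; i < 2; i++) {
        /* Negate in unsigned arithmetic to avoid signed-overflow UB.
         * The conversion back to int64_t assumes two's complement,
         * which holds on mainstream compilers. */
        uint64_t u = (uint64_t)x.lane[i];
        y.lane[i] = x.lane[i] < 0 ? (int64_t)(0 - u) : x.lane[i];
    }
    return y;
}
```

Each lowering listed in this section should agree with this reference on every input, including INT64_MIN.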
x86/x86-64 processors with AVX512F and AVX512VL instruction sets
y = i64x2.abs(x) is lowered to VPABSQ xmm_y, xmm_x
x86/x86-64 processors with AVX instruction set
y = i64x2.abs(x) (x is not y) is lowered to:
VPXOR xmm_y, xmm_y, xmm_y
VPSUBQ xmm_y, xmm_y, xmm_x
VBLENDVPD xmm_y, xmm_x, xmm_y, xmm_x
x86/x86-64 processors with SSE4.1 instruction set
y = i64x2.abs(x) (x is not y, and neither x nor y is in xmm0) is lowered to:
PXOR xmm0, xmm0
PSUBQ xmm0, xmm_x
MOVDQA xmm_y, xmm0
BLENDVPD xmm_y, xmm_x
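The two blend-based sequences (AVX and SSE4.1) use different masks: the AVX form selects on the sign of x, while the SSE4.1 form selects on the sign of the negated value held in xmm0. A minimal per-lane sketch in C, with hypothetical helper names and unsigned arithmetic standing in for the 64-bit lanes, suggests why both give the same result:

```c
#include <stdint.h>

/* Per-lane view of a variable blend: pick b when the sign bit of
 * mask is set, otherwise pick a. Hypothetical helper. */
static uint64_t blend_by_sign(uint64_t a, uint64_t b, uint64_t mask) {
    return (mask >> 63) ? b : a;
}

/* AVX-style: negate, then take the negated value where x is negative. */
static uint64_t abs_blend_avx(uint64_t x) {
    uint64_t neg = 0 - x;              /* VPXOR + VPSUBQ */
    return blend_by_sign(x, neg, x);   /* VBLENDVPD with mask x */
}

/* SSE4.1-style: the implicit mask register holds -x, so take x
 * where -x is negative and keep -x otherwise. */
static uint64_t abs_blend_sse41(uint64_t x) {
    uint64_t neg = 0 - x;               /* PXOR + PSUBQ into xmm0 */
    return blend_by_sign(neg, x, neg);  /* MOVDQA + BLENDVPD */
}
```

For the most negative input the negation wraps back to the same bit pattern, so either mask choice leaves it unchanged.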
x86/x86-64 processors with SSE2 instruction set
y = i64x2.abs(x) is lowered to:
PSHUFD xmm_tmp, xmm_x, 0xF5
MOVDQA xmm_y, xmm_x
PSRAD xmm_tmp, 31
PXOR xmm_y, xmm_tmp
PSUBQ xmm_y, xmm_tmp
x = i64x2.abs(x) is lowered to:
PSHUFD xmm_tmp, xmm_x, 0xF5
PSRAD xmm_tmp, 31
PXOR xmm_x, xmm_tmp
PSUBQ xmm_x, xmm_tmp
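SSE2 has no 64-bit arithmetic shift, so the sequence builds the per-lane sign mask from 32-bit pieces: the shuffle with 0xF5 copies the high dword of each 64-bit lane into both dword positions, and the 32-bit arithmetic shift by 31 turns each copy into all ones or all zeros. The result is then (x XOR m) - m. A per-lane sketch in C (the function name is hypothetical):

```c
#include <stdint.h>

/* One 64-bit lane of the SSE2 sequence, modelled with scalars. */
static uint64_t abs_sse2_lane(uint64_t x) {
    /* PSHUFD 0xF5: the high dword lands in both dwords of the lane. */
    uint32_t hi = (uint32_t)(x >> 32);
    /* PSRAD 31: each dword becomes 0xFFFFFFFF if negative, else 0. */
    uint32_t m32 = (hi >> 31) ? 0xFFFFFFFFu : 0u;
    uint64_t m = ((uint64_t)m32 << 32) | m32;
    /* PXOR then PSUBQ: (x ^ m) - m negates x exactly when m is all ones. */
    return (x ^ m) - m;
}
```

Taking the mask from the high dword matters: an input such as 0x0000000080000000 has the top bit of its low dword set but is positive as a 64-bit value, and must come back unchanged.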
ARM64 processors
y = i64x2.abs(x) is lowered to ABS Vy.2D, Vx.2D
ARMv7 processors with NEON instruction set
y = i64x2.abs(x) is lowered to:
VSHR.S64 Qtmp, Qx, #63
VEOR Qy, Qx, Qtmp
VSUB.I64 Qy, Qy, Qtmp