This repository was archived by the owner on Dec 22, 2021. It is now read-only.

i64x2.abs instruction #413

Merged
merged 1 commit into from
Feb 5, 2021

Conversation

Maratyszcza
Contributor

@Maratyszcza Maratyszcza commented Dec 23, 2020

Introduction

This is a proposal to add a 64-bit variant of the existing abs instruction. ARM64 and x86 with AVX512 support this instruction natively, and on earlier instruction sets it can be emulated with 3-5 instructions.
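
For reference, the lane-wise behavior can be sketched in portable C. This is a minimal model, not an implementation: the `i64x2_t` type and the function names are made up for illustration, and it assumes that `i64x2.abs` wraps on the most negative lane value in the same way as the existing narrower abs instructions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a v128 value viewed as two signed 64-bit lanes. */
typedef struct { int64_t lane[2]; } i64x2_t;

/* Wrapping absolute value of one lane. The negation is done in unsigned
   arithmetic so INT64_MIN does not hit signed-overflow undefined behavior;
   under the wrapping assumption it maps to itself. */
static int64_t abs_wrap_i64(int64_t x) {
    uint64_t ux, ur;
    int64_t r;
    memcpy(&ux, &x, sizeof ux);               /* reinterpret the bits as unsigned */
    ur = (x < 0) ? (uint64_t)0 - ux : ux;     /* negate modulo 2^64 if negative */
    memcpy(&r, &ur, sizeof r);                /* reinterpret the bits as signed */
    return r;
}

/* Apply the scalar model to both lanes independently. */
static i64x2_t i64x2_abs_model(i64x2_t x) {
    i64x2_t y;
    for (int i = 0; i < 2; i++) {
        y.lane[i] = abs_wrap_i64(x.lane[i]);
    }
    return y;
}
```

The lowerings in the sections below can be checked against a model like this one, including the `INT64_MIN` edge case.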

Applications

Mapping to Common Instruction Sets

This section illustrates how the new WebAssembly instruction can be lowered on common instruction sets. These patterns are provided only for convenience; compliant WebAssembly implementations do not have to follow the same code generation patterns.

x86/x86-64 processors with AVX512F and AVX512VL instruction sets

  • i64x2.abs
    • y = i64x2.abs(x) is lowered to VPABSQ xmm_y, xmm_x

x86/x86-64 processors with AVX instruction set

  • i64x2.abs
    • y = i64x2.abs(x) (x is not y) is lowered to:
      • VPXOR xmm_y, xmm_y, xmm_y
      • VPSUBQ xmm_y, xmm_y, xmm_x
      • VBLENDVPD xmm_y, xmm_x, xmm_y, xmm_x

x86/x86-64 processors with SSE4.1 instruction set

  • i64x2.abs
    • y = i64x2.abs(x) (x is not y and x/y is not in xmm0) is lowered to:
      • PXOR xmm0, xmm0
      • PSUBQ xmm0, xmm_x
      • MOVDQA xmm_y, xmm0
      • BLENDVPD xmm_y, xmm_x
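
The AVX and SSE4.1 lowerings above both compute 0 - x and then blend, but they key the blend on different sign bits: AVX uses the sign of x, while SSE4.1 uses the sign of 0 - x, because BLENDVPD takes its mask implicitly from xmm0. A rough scalar sketch of one 64-bit lane, with helper names invented for illustration, suggests that both choices give the same result:

```c
#include <stdint.h>

/* Model of a variable blend on one 64-bit lane: take `b` where the sign bit
   of `mask` is set, otherwise take `a`. */
static uint64_t blendv64(uint64_t a, uint64_t b, uint64_t mask) {
    return (mask >> 63) ? b : a;
}

/* AVX-style: choose between x and 0 - x, keyed on the sign of x. */
static uint64_t abs_blend_on_x(uint64_t x) {
    uint64_t neg = (uint64_t)0 - x;
    return blendv64(x, neg, x);
}

/* SSE4.1-style: the destination starts as 0 - x, and x is blended in
   wherever the sign of 0 - x is set (that is, where x was positive, or
   where x is the most negative value and 0 - x wraps back to x). */
static uint64_t abs_blend_on_neg(uint64_t x) {
    uint64_t neg = (uint64_t)0 - x;
    return blendv64(neg, x, neg);
}
```

For a zero lane, 0 - x has a clear sign bit, so the SSE4.1-style variant keeps the destination, which is also zero.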

x86/x86-64 processors with SSE2 instruction set

  • i64x2.abs
    • y = i64x2.abs(x) is lowered to:
      • PSHUFD xmm_tmp, xmm_x, 0xF5
      • MOVDQA xmm_y, xmm_x
      • PSRAD xmm_tmp, 31
      • PXOR xmm_y, xmm_tmp
      • PSUBQ xmm_y, xmm_tmp
    • x = i64x2.abs(x) is lowered to:
      • PSHUFD xmm_tmp, xmm_x, 0xF5
      • PSRAD xmm_tmp, 31
      • PXOR xmm_x, xmm_tmp
      • PSUBQ xmm_x, xmm_tmp
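
SSE2 has no 64-bit arithmetic shift, so the sequence above builds the per-lane sign mask from 32-bit pieces: PSHUFD with 0xF5 copies the high dword of each 64-bit lane over its low dword, and PSRAD by 31 then fills every dword with that sign bit. The following is a hedged scalar model of those steps. The `v128_t` type and the helper functions are illustrative stand-ins named after the instructions, not real intrinsics.

```c
#include <stdint.h>

/* Hypothetical 128-bit register as four dwords; d[0], d[1] form lane 0. */
typedef struct { uint32_t d[4]; } v128_t;

static v128_t from_lanes(uint64_t l0, uint64_t l1) {
    v128_t r = { { (uint32_t)l0, (uint32_t)(l0 >> 32),
                   (uint32_t)l1, (uint32_t)(l1 >> 32) } };
    return r;
}

static uint64_t lane64(v128_t v, int i) {
    return ((uint64_t)v.d[2 * i + 1] << 32) | v.d[2 * i];
}

/* PSHUFD model: each 2-bit field of imm selects a source dword. */
static v128_t pshufd(v128_t s, unsigned imm) {
    v128_t r;
    for (int i = 0; i < 4; i++) r.d[i] = s.d[(imm >> (2 * i)) & 3u];
    return r;
}

/* PSRAD by 31 model: every dword becomes all ones or all zeros by its sign. */
static v128_t psrad31(v128_t s) {
    v128_t r;
    for (int i = 0; i < 4; i++) r.d[i] = (s.d[i] >> 31) ? 0xFFFFFFFFu : 0u;
    return r;
}

static v128_t pxor(v128_t a, v128_t b) {
    v128_t r;
    for (int i = 0; i < 4; i++) r.d[i] = a.d[i] ^ b.d[i];
    return r;
}

/* PSUBQ model: wrapping subtraction on each 64-bit lane. */
static v128_t psubq(v128_t a, v128_t b) {
    return from_lanes(lane64(a, 0) - lane64(b, 0), lane64(a, 1) - lane64(b, 1));
}

/* The SSE2 sequence: tmp = sign mask, y = (x ^ tmp) - tmp. */
static v128_t i64x2_abs_sse2_model(v128_t x) {
    v128_t tmp = psrad31(pshufd(x, 0xF5));
    return psubq(pxor(x, tmp), tmp);
}
```

The immediate 0xF5 is 0b11110101, so the selected source dwords are 1, 1, 3, 3, which are the high halves of the two lanes.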

ARM64 processors

  • i64x2.abs
    • y = i64x2.abs(x) is lowered to ABS Vy.2D, Vx.2D

ARMv7 processors with NEON instruction set

  • i64x2.abs
    • y = i64x2.abs(x) is lowered to:
      • VSHR.S64 Qtmp, Qx, #63
      • VEOR Qy, Qx, Qtmp
      • VSUB.I64 Qy, Qy, Qtmp
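
The shift, exclusive-or, and subtract approach rests on a standard two's-complement identity. As a short sketch, write m for the sign mask held in Qtmp (the symbols m and x are notation introduced here, with all arithmetic taken modulo 2^64):

```latex
\[
m = x \gg_s 63 \in \{0, -1\}, \qquad
(x \oplus m) - m =
\begin{cases}
(x \oplus 0) - 0 = x, & x \ge 0 \ (m = 0),\\[2pt]
(x \oplus (-1)) - (-1) = \lnot x + 1 = -x \pmod{2^{64}}, & x < 0 \ (m = -1).
\end{cases}
\]
```

The exclusive-or therefore has to take x as its input, and the subtraction has to take the result of the exclusive-or.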

@jan-wassenberg

Strong support, I'm adding this to Highway as well. It would be much harder for users to emulate this, especially if we do not add sign select nor i64 gt_s.

@dtig
Member

dtig commented Jan 25, 2021

Adding a preliminary vote for the inclusion of i64x2.abs operation to the SIMD proposal below. Please vote with -

👍 For including i64x2.abs
👎 Against including i64x2.abs

@penzn
Contributor

penzn commented Jan 25, 2021

I do have an issue with the examples here - they all seem to be wrapper libraries. It isn't surprising that wrapper libraries would have all sorts of operations, but this isn't the same as an app somebody could run.

@ngzhian ngzhian added the 2021-01-29 Agenda for sync meeting 1/29/21 label Jan 26, 2021
@dtig dtig added needs discussion Proposal with an unclear resolution and removed 2021-01-29 Agenda for sync meeting 1/29/21 labels Feb 2, 2021
@Maratyszcza Maratyszcza force-pushed the abs-64bit branch 3 times, most recently from f89840d to c7b0168 Compare February 5, 2021 16:43
@tlively tlively merged commit 961edc4 into WebAssembly:master Feb 5, 2021
@Maratyszcza
Contributor Author

Fixed a bug in the suggested lowerings for SSE2 and ARM NEON (thanks @ngzhian for reporting).

ngzhian added a commit to ngzhian/simd that referenced this pull request Feb 9, 2021
@ngzhian ngzhian mentioned this pull request Feb 9, 2021
ngzhian added a commit to ngzhian/simd that referenced this pull request Feb 10, 2021
ngzhian added a commit that referenced this pull request Feb 10, 2021
ngzhian added a commit that referenced this pull request Feb 10, 2021
This was merged in #413.