This is a reland of 4985db4
...with a better implementation of map(). I don't understand
why we had to revert, but it had something to do with calling the
function pointer in map_(), so maybe this will help.
I've flattened the map_() / map() merge CL into this one,
and marked the resulting map() as no_sanitize("cfi"). I
don't see anything wrong, so I think it's a false positive.
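
For readers without the CL open, here is a minimal sketch of the
kind of map() described in this message. It is not the code in
this change: Vec, map_impl(), the sketch namespace, and the
SKETCH_NO_SANITIZE_CFI macro are hypothetical stand-ins, and
putting the attribute on the helper that calls fn is an
assumption. It only shows the std::index_sequence pack expansion
into { fn(x[0]), ..., fn(x[N-1]) }.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>

// Assumption: only apply the attribute under Clang, which is where CFI lives.
#if defined(__clang__)
    #define SKETCH_NO_SANITIZE_CFI __attribute__((no_sanitize("cfi")))
#else
    #define SKETCH_NO_SANITIZE_CFI
#endif

namespace sketch {

// Hypothetical stand-in for an skvx-style vector, not Skia's actual type.
template <std::size_t N, typename T>
struct Vec {
    T vals[N];
    T  operator[](std::size_t i) const { return vals[i]; }
    T& operator[](std::size_t i)       { return vals[i]; }
};

// The pack expansion writes out { fn(x[0]), fn(x[1]), ..., fn(x[N-1]) },
// i.e. the "long way" form, for any N.
template <std::size_t N, typename T, typename Fn, std::size_t... I>
SKETCH_NO_SANITIZE_CFI
auto map_impl(const Vec<N, T>& x, Fn&& fn, std::index_sequence<I...>)
        -> Vec<N, decltype(fn(x[0]))> {
    return { { fn(x[I])... } };
}

// map(V, fn) -> V': the element type of the result follows fn's return type.
template <std::size_t N, typename T, typename Fn>
auto map(const Vec<N, T>& x, Fn&& fn) {
    return map_impl(x, fn, std::make_index_sequence<N>{});
}

// Small usage check: square roots of four perfect squares.
inline void example() {
    Vec<4, float> x = {{1.0f, 4.0f, 9.0f, 16.0f}};
    Vec<4, float> y = map(x, [](float v) { return std::sqrt(v); });
    assert(y[0] == 1.0f && y[3] == 4.0f);
}

}  // namespace sketch
```

Whether a compiler turns a call like the one in example() into a
single vector instruction depends on the compiler, flags, and
target. The sketch shows only the expansion.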
Original change's description:
> update skvx scalar-fallback strategy
>
> Turns out Clang's a lot better at auto-vectorizing "obvious" scalar code
> into obvious vector code when it's written out the long way, e.g.
>
> F32x4 x = ...;
> x = { sqrtf(x[0]), sqrtf(x[1]), sqrtf(x[2]), sqrtf(x[3]) };
>
> vectorizes into sqrtps a lot more reliably than our recurse-onto-scalars
> strategy, and also better than the other naive approach,
>
> F32x4 x = ...;
> for (int i = 0; i < 4; i++) { x[i] = sqrtf(x[i]); }
>
> So here I've added a map(V, fn) -> V' using C++14 tricks to let the
> compiler handle the expansion of x = { fn(x[0]), fn(x[1]), ...
> fn(x[N-1]) } for any N, and implemented most skvx scalar fallback code
> using that.
>
> With these now vectorizing well at any N, we can remove any
> specializations we'd written for particular N, really tidying up.
>
> Over in the SkVM interpreter, this is a big improvement for ceil and
> floor, which were being done 2 floats at a time instead of 8. They're
> now slimmed way down to
>
> shlq $6, %r13
> vroundps $K, (%r12,%r13), %ymm0
> vroundps $K, 32(%r12,%r13), %ymm1
> jmp ...
>
> where K is 9 or 10 depending on the op.
>
> I haven't found a scalar function that Clang will vectorize to vcvtps2dq
> (the rounding one, not truncating vcvttps2dq), so I've kept lrint()
> written the long way, updated to the style I've been using lately with
> specializations inline.
>
> Change-Id: Ia97abe3c876008228bf62b1daacd6f6140408fc4
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/317375
> Reviewed-by: Herb Derby <[email protected]>
> Commit-Queue: Mike Klein <[email protected]>
Cq-Include-Trybots: luci.chromium.try:linux_chromium_cfi_rel_ng
Bug: chromium:1129408
Change-Id: Ia9c14074b9a14a67dd221f4925894d35a551f9d7
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/317551
Commit-Queue: Mike Klein <[email protected]>
Reviewed-by: Herb Derby <[email protected]>