Soft Freeze on new operation additions post Phase 3 #203
Comments
This would also make it easier to maintain a consistent set of opcode numbers going forward, as proposed in #129. |
@dtig makes sense. I see no pressing need for a deadline/extension. Hard/expensive to emulate without a spec change sounds like a reasonable bar to me. |
I would propose something like:
Obviously all of this is vague (what are important use cases? what's the threshold for emulation complexity? etc.) but I think the goal is to think very carefully about any new instructions and default to maintaining the current proposal since solidifying it is (arguably) more important than adding a new instruction. Agreed that the deadline doesn't seem necessary. |
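To make the "hard/expensive to emulate" bar a bit more concrete, here is a minimal sketch of the kind of lane-wise fallback an application is forced into when an operation is missing. It assumes Emscripten's `wasm_simd128.h` intrinsics, and the operation shown (unsigned 64-bit saturating add, which is not in the current proposal) is only a hypothetical stand-in for illustration:

```c
#include <stdint.h>
#include <wasm_simd128.h>

/* Without a dedicated instruction, a 64x2 operation has to be emulated by
 * extracting each lane, doing the work in scalar code, and rebuilding the
 * vector - the kind of cost that can justify adding a new instruction. */
static inline v128_t sat_add_u64x2(v128_t a, v128_t b) {
  uint64_t a0 = (uint64_t)wasm_i64x2_extract_lane(a, 0);
  uint64_t a1 = (uint64_t)wasm_i64x2_extract_lane(a, 1);
  uint64_t b0 = (uint64_t)wasm_i64x2_extract_lane(b, 0);
  uint64_t b1 = (uint64_t)wasm_i64x2_extract_lane(b, 1);
  uint64_t s0 = a0 + b0; if (s0 < a0) s0 = UINT64_MAX;  /* saturate on wrap */
  uint64_t s1 = a1 + b1; if (s1 < a1) s1 = UINT64_MAX;
  return wasm_i64x2_make((int64_t)s0, (int64_t)s1);
}
```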
I like @zeux's idea for the threshold because it shares its axes of criteria with the current threshold and is simply stricter. |
Thanks all, I also like @zeux's idea of introducing an explicit threshold, and we all seem to be in agreement that an explicit deadline isn't necessary. Are there other ideas or concerns about this? If not, my next steps will be to add some explicit wording about this into the README, or somewhere that's easy to find. |
Thanks @dtig for putting this together, and thanks @zeux for suggesting the explicit thresholds; general agreement on these will help ensure consistent evaluation of the current outstanding issues/proposals. The criteria and thresholds are reasonable imo; any disagreements? Agree that no further explicit deadline is necessary since we have this issue pinned and open for feedback. @abrown brought up a potential need to consider new instructions in rare cases, when current instruction semantics cannot be calibrated effectively to address the issues. Do we need a provision for these? @abrown, anything to add? |
I think there is a common consensus that we need to stick to some variant of the criteria @zeux proposed above (#203 (comment)):
However, I think we need a bit more clarity on those, especially on what we mean by important use cases. It is very easy to find examples of native instructions or matching intrinsics for most PRs (no harder than demonstrating that the proposed instructions would perform well in micro-benchmarks). I would like to propose the following as minimum criteria for a use case:
The first bullet might sound obvious, but we do have PRs that either cite no use cases or only refer to publications. With our current load of PRs approaching one per day, we need to think about how realistic it would be to evaluate each one on actual code. I don't think references to matching native intrinsics should be considered sufficient unless that code is in the process of being ported to Wasm. It is important that the project has a Wasm port, even an experimental one (even one done by the person proposing the PR), and that everybody can build it. I think we should actually go a step further and track the application in the proposal. This is nothing out of the ordinary; performance work is done like this in the "native compilation" world. |
If I understand your suggestion correctly, it would be advantageous to support every intrinsic in the Intel library that proves to be cost-effective to implement and doesn't have an effective scalar fallback. Intel, not ARM, because porting with Emscripten today only supports or emulates Intel intrinsics. |
@omnisip According to the Emscripten documentation, ARM NEON intrinsics are supported too. |
Huh? I like the idea of supporting Intel ops, and @Maratyszcza just nicely informed me that ARM ops were being supported too. I didn't know that. Both are definitely valuable use cases, and if it were up to me, I'd find a way to support all of them.
On Fri, Oct 9, 2020, 18:58 Petr Penzin wrote:
> > Intel not ARM because porting with emscripten only supports or emulates Intel intrinsics today.
>
> @omnisip please maintain professional tone. CC @dtig @arunetm
|
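As context on what "porting with Emscripten" looks like in practice, here is a minimal sketch of x86 intrinsics code that can be recompiled to Wasm SIMD largely unchanged. It assumes Emscripten's SSE intrinsic headers and the `-msimd128`/`-msse` flags described in its porting documentation (flag details may have changed since this discussion); ARM NEON code via `arm_neon.h` is reportedly handled in a similar way.

```c
/* add_floats.c - a sketch of x86 intrinsics code ported to Wasm SIMD.
 * Native build:  cc -O2 -msse -c add_floats.c
 * Wasm build:    emcc -O2 -msimd128 -msse -c add_floats.c
 * (flags per the Emscripten porting docs at the time; check current docs). */
#include <xmmintrin.h>

void add_floats(float *dst, const float *a, const float *b, int n) {
  int i = 0;
  for (; i + 4 <= n; i += 4) {
    __m128 va = _mm_loadu_ps(a + i);             /* expected to lower to v128.load */
    __m128 vb = _mm_loadu_ps(b + i);
    _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));  /* f32x4.add + v128.store */
  }
  for (; i < n; i++) dst[i] = a[i] + b[i];       /* scalar tail */
}
```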
There might have been some miscommunication here; inferring tone in comments can get tricky. I hope we can move past that and focus on the criteria for inclusion, which is a very timely topic. Thanks @penzn for adding more clarity to the criteria we had; these are very helpful. It will be great to see if we have general consensus around these clarifications as well. Performance figures need to be captured along with new proposals as we try to stabilize the MVP and move it forward.
@omnisip The SIMD instruction space is vast, and it is a non-goal for the MVP to support all of it. At a high level, for the minimum viable product, it's ideal to include operations with good architecture support that are necessary for real-world Wasm SIMD use cases (this is further clarified through the criteria above by @zeux and @penzn). It's great to see the new op proposals and associated use cases! Once standardized and in use, we cannot modify or drop instructions if they prove to be hidden perf cliffs for developers/tools, so I am in favor of being conservative about instruction inclusion for the MVP. As instruction sets and use cases will keep evolving even after this MVP, a process to propose and standardize instructions after the MVP is already in the works. So operations that have non-trivial implementation costs or use cases can be vetted as needed without holding the MVP back. We already have a sizable set of operations in this post-MVP bucket. If we can get to a general consensus on the criteria, we can use it as the guideline to triage the newer proposals. Great topic for the next SIMD sync. :) (few edits for clarity) |
I like the expanded criteria put forward by @penzn, but I think we should be even tougher:
This hasn't been a problem so far (thanks in large part to @Maratyszcza); quite the contrary, so let's make it explicit. That puts the burden on the implementers to implement, which means the V8 team for both the VM and the tools. I don't know how that's working out. To share the burden, we could take on the VM work in SpiderMonkey, even if that has a number of disadvantages: a new VM brings new sources of noise to the benchmarking; no 32-bit ARM support; emerging and still-immature 64-bit ARM support; and we're still working on the finer points of optimization on Intel.
Re making progress and iterating to a fixed point, would it be meaningful to adopt @omnisip's suggestion (from a different thread) to have a regular cadence of "releases", but for determining the MVP instruction set? Every month or so, look at the proposed instructions and discuss whether there is a set that could be made available for the next round of experimentation and possible inclusion? |
It's working out fairly well so far. @ngzhian and I have been working to catch up with all the new instructions. We should have plenty of new performance results, not by tomorrow, but for the next meeting. The recent hold-up is mostly on my end with the tools. I think we should continue evaluating instructions on a rolling basis at our syncs as they are implemented and benchmarked; formalizing that doesn't seem very useful. |
The point I am trying to argue is that we should probably prioritize PRs with a clear path to Wasm compilation and avoid PRs where we have to produce the kernels ourselves. However, I don't think we should consider anything without a clear Wasm use case completely invalid; we can just defer it to "post-MVP" or at least to the bottom of the review queue. Also see #379 (comment) |
Closing this issue as #389 supersedes this. |
As the SIMD proposal is now at Phase 3 and getting closer to standardization, I'd like to propose a soft freeze on operation additions so that implementations and toolchains have the opportunity to implement and optimize the existing operations without too much churn.
The freeze does not apply to existing issues/PRs, only to new proposals. The rationale is that, due to the nature of SIMD, it is always possible to keep adding new operations; given that there is reasonable consensus (barring open issues) on the current proposal, it will be hard to make progress on standardizing it without some sort of freeze.
This is a soft freeze because it is possible, as more implementations get up to speed and more applications use the current SIMD proposal, that must-have operations will surface or that semantics will need to be tweaked. The bar for these, however, would be higher.
Some open questions -
As suggested in a previous meeting, I'll be pinning this issue for visibility. Any thoughts/concerns about this approach?