Remove local sharing logic #2540

dcodeIO · 2022-10-21T18:52:23Z

New version of #1260, that drops all the logic to add and free temporary locals, instead using a new and unique local for each var, let, const respectively temporary. Has a couple advantages:

Compiler becomes much simpler
- No more need to keep track of temporaries to then free locals and scopes etc.
- No more weird side-effects from local reuse if compilation order is special
- Generally more reliably because less can go wrong
Named locals, even if scoped, maintain their name in debug info now, since there's no more sharing
- Can now simply use let everywhere, as var is no longer necessary to have debug names
Unique locals aid future implementation of local-dependent features, say closures

There is one drawback: The sharing logic we had before was something an optimizer would typically do (similar to optimize-locals). When it gets to the shadow stack, it helped to keep stack frames compact before Binaryen, since Binaryen can no longer reason about local spilling (which just becomes loads and stores) from just the instructions. Frame size about doubles it appears, while performance should remain about the same given other more significant costs.

Closes #1260

I've read the contributing guidelines
I've added my name and email to the NOTICE file

MaxGraey · 2022-10-21T18:59:03Z

tests/compiler/infer-array.release.wat

-   i64.const 0
-   i64.store $0
-   local.get $0
-   i64.const 0
-   i64.store $0 offset=8
+   i32.const 0
+   i32.const 108
+   memory.fill $0


Hmm, it looks like this signifintally affect to some cases like this one. When we got memory.fill instead of two stores

This could be a side-effect of the shadow stack now being less compact, so where previously it had two slots it now has more, producing a fill. I also had to increase the default stack size because of it. Would say the stack effects are in fact the most notable downside.

In a sense, the compiler was doing some sort of local optimization before, even when not optimizing, which now is gone, yet Binaryen cannot reason about the shadow stack (which is just memory accesses), leading to the effects.

After increased stacksize this downgrade change still present, hmm

Yes, there are more locals now, so shadow stack frames become larger. The cause is that locals are no longer shared, but there is one per named local now.

I'm not sure that's a good change. If binaryen could handle all this unique locals into the same number as before (at least for release) well maybe this simplification would make sense, but apparently this simplification will increase optimization time, increase size and degrade speed

The shadow stack is already quite a suffering on its own (that's why using the minimal runtime has its benefits), and it appears that frame size ultimately doesn't matter much except that the shadow stack becomes larger when there are many scoped locals. On the contrary, this change unblocks quite a few annoying things, like that we now can have actual debug names on all local, and ofc that all this complexity, that like one person on the planet has an idea how to approach, is gone from the compiler.

You're share perf data for bootstraping. But as far as I remember it doesn't use incremental GC and therefore shadow stack right?

It uses incremental GC:

assemblyscript/src/asconfig.json

Line 10 in 78b2d1a

"runtime": "incremental",

To give an outlook towards what this ultimately enables: Closures need to retroactively lift locals to a captured environment, and the obvious way to do this in an almost-single-pass compiler is to compile a function, find that it closed over a variable, and then amend any function also using that closed over variable. This is easy if each local has a unique index, but extremely hard with local sharing in place. In a sense, we previously performed optimizations that lost critical information, preventing us from creating passes that modify already generated code. No sane compiler would actually do this, I'd say, and would actually do the opposite, like utilize SSA to actually be able to reason about this stuff. Now we don't need SSA for closures, but compacting locals would make this task incredibly hard. Makes a lot of sense to remove this optimization, as it simply happens too early in the pipeline.

Now I see a reason. Thanks!

Of course you are right that the ideal opportunity to remove this is once we generate Wasm GC code, so Binaryen has all the information, optimizing before we'd actually downlevel to a bundled runtime with a shadow stack. Then, Binaryen does the compaction, and we only generate the shadow stack after, yielding about the same result as before. That's the ultimate goal of course, but still we'll have to start somewhen, and the data suggests that this somewhen can as well be now so we can move forward.

cli/options.json

MaxGraey · 2022-10-23T11:48:41Z

It looks like top-function var is not recommended anymore and let / const more preferable? Perhaps we should add according lint rule?

dcodeIO · 2022-10-23T12:01:18Z

Using let everywhere, now that all locals get a proper debug name, makes sense to aid refactoring, i.e. no more switching between var and let when moving code in or out of an inner scope. Makes sense now to stick to let throughout the compiler's codebase, though tests may differ of course so we test both let and var. Fine with a lint rule for this.

dcodeIO added 5 commits October 21, 2022 00:28

maintain local names in flows

5c4e538

simplify / compute non-parameter locals

5701d7e

remove temp local logic

3ded0fe

Merge branch 'main' into simplify-locals

7e2f1ba

refactor

c192bd5

MaxGraey reviewed Oct 21, 2022

View reviewed changes

dcodeIO added 2 commits October 21, 2022 20:59

increase default shadow stack size

3a0a2f3

update fixtures

a79ec5a

MaxGraey reviewed Oct 21, 2022

View reviewed changes

cli/options.json Outdated Show resolved Hide resolved

dcodeIO added 13 commits October 21, 2022 21:25

no more need for var

3a1838e

keep var in AST reserialization

208403f

half stack size

4a2674c

simplify inlining

27dad05

update cli help

9fa421a

lint

b573d52

simplify

aa23373

be careful with local flags

e59d754

sanity check

52a51a1

refactor

979e3aa

fix

b78c1b9

clarify that Flow#parentFunction is targetFunction

c32103d

clarify that Flow#actualFunction is sourceFunction

85a5ed8

dcodeIO marked this pull request as ready for review October 22, 2022 22:54

dcodeIO requested a review from MaxGraey October 23, 2022 11:24

MaxGraey approved these changes Oct 23, 2022

View reviewed changes

dcodeIO merged commit 8bb4fae into main Oct 23, 2022

HerrCai0907 deleted the simplify-locals branch October 17, 2023 08:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove local sharing logic #2540

Remove local sharing logic #2540

dcodeIO commented Oct 21, 2022 •

edited

Loading

MaxGraey Oct 21, 2022

dcodeIO Oct 21, 2022 •

edited

Loading

MaxGraey Oct 21, 2022

dcodeIO Oct 21, 2022

MaxGraey Oct 21, 2022

dcodeIO Oct 22, 2022

dcodeIO Oct 22, 2022

dcodeIO Oct 22, 2022

MaxGraey Oct 22, 2022

dcodeIO Oct 22, 2022

MaxGraey commented Oct 23, 2022

dcodeIO commented Oct 23, 2022

Remove local sharing logic #2540

Remove local sharing logic #2540

Conversation

dcodeIO commented Oct 21, 2022 • edited Loading

Choose a reason for hiding this comment

dcodeIO Oct 21, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MaxGraey commented Oct 23, 2022

dcodeIO commented Oct 23, 2022

dcodeIO commented Oct 21, 2022 •

edited

Loading

dcodeIO Oct 21, 2022 •

edited

Loading