Middle-end support for local allocs #491

Merged
mshinwell merged 5 commits into oxcaml:main on Feb 1, 2022

Conversation

stedolan
Contributor

This PR adds support for local allocations to the middle-end, following on from #478.

The main change extends Lambda with the following features, together with the corresponding middle-end changes needed to compile them:

  • Allocation modes (Alloc_heap / Alloc_local) on allocating primitives
  • Region boundaries (Lregion e) delimiting lifetime of local allocs
  • Tail call markers (Apply_tail) on applications

The last of these is the trickiest: tail calls from within a region should end the region after the arguments to the tail call have been evaluated, but before control is transferred.
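To make the ordering concrete, here is a minimal sketch in plain OCaml. It is not code from this PR: the types are a simplified, hypothetical model of the Lambda additions (`Region` stands in for `Lregion`, and `Apply_nontail`, `event`, and `call_in_region` are names chosen for illustration). It only records the order of events for a call made directly inside a region.

```ocaml
(* Hypothetical, simplified model; these are NOT the real Lambda definitions. *)
type alloc_mode = Alloc_heap | Alloc_local

type apply_position = Apply_tail | Apply_nontail

type expr =
  | Const of int
  | Var of string
  | Makeblock of alloc_mode * expr list           (* an allocating primitive *)
  | Apply of apply_position * string * expr list  (* application with a tail marker *)
  | Region of expr                                (* stands in for Lregion e *)

(* Observable steps when a call is evaluated directly inside a region. *)
type event =
  | Begin_region
  | Eval_arg of int   (* the int is only a label for the argument *)
  | End_region
  | Call of string

(* Order of events for a call to [fn] with [nargs] arguments in a region. *)
let call_in_region (pos : apply_position) (fn : string) (nargs : int)
  : event list =
  let args = List.init nargs (fun i -> Eval_arg i) in
  match pos with
  | Apply_tail ->
    (* Arguments are evaluated, the region ends, then control is transferred. *)
    [Begin_region] @ args @ [End_region; Call fn]
  | Apply_nontail ->
    (* The callee returns into the region, which ends only afterwards. *)
    [Begin_region] @ args @ [Call fn; End_region]
```

In this model the only difference between the two positions is whether `End_region` comes before or after `Call`, which is the property the description above asks the middle-end to preserve.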

I'm afraid that the middle-end changes are large and complicated. However, the complicated bit (handling overapplications and tail calls of local closures) is not large, and the large bit (plumbing alloc modes through every single allocating primitive) is not complicated. (There are detailed tests, but they are not in this PR because they also depend on frontend support.)

This PR only adds support for Closure and Flambda1. PR #490 has a draft by @mshinwell of Flambda2 support, but in this PR there is only stub support in Flambda2.

The remaining changes to ocaml/ are the minimal ones required for CI. Full support for local allocs in the ocaml/ subtree will appear later.

(As with #478, this code is extracted from the local allocs branch, whose dev history and previous PRs are here)

Contributor

@xclerc xclerc left a comment

(For the files I own.)

@mshinwell
Collaborator

Apart from the backend changes I'll review the whole of this PR.

Contributor

@gretay-js gretay-js left a comment

My review is for backend changes only.

Question: mode replaces the placeholder Alloc_heap throughout cmmgen.ml in this PR, except in a couple of places related to unboxed numbers: is_unboxed_number_cmm and transl_let. Is this intentional?

@mshinwell
Collaborator

I'll look at this, thanks for spotting that.

@stedolan
Contributor Author

Thanks for pointing this out, I'll have a proper look on Monday. The Cmm number unboxing is tricky, and there might well be something wrong here.

But as a quick note: it is always safe to allocate a float or boxed integer as Alloc_heap, although it is not always optimal. (The invariant that must be upheld is that there may never be heap -> local pointers. Floats and boxed integers never contain pointers, so making them heap can never break this invariant).
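The invariant can be illustrated with a small checker over a toy value model. This is a sketch under stated assumptions, not compiler or runtime code: `mode`, `value`, `mode_of`, and `no_heap_to_local` are hypothetical names, and boxed floats are modelled as blocks with no pointer fields.

```ocaml
(* Hypothetical toy model of allocated values; not the runtime representation. *)
type mode = Heap | Local

type value =
  | Immediate of int
  | Boxed_float of mode * float   (* contains no pointer fields *)
  | Block of mode * value list    (* fields may point to other values *)

(* Allocation mode of a value, if it is allocated at all. *)
let mode_of = function
  | Immediate _ -> None
  | Boxed_float (m, _) | Block (m, _) -> Some m

(* True when no heap block has a field pointing to a local allocation. *)
let rec no_heap_to_local (v : value) : bool =
  match v with
  | Immediate _ | Boxed_float _ -> true
  | Block (m, fields) ->
    List.for_all
      (fun f ->
         (not (m = Heap && mode_of f = Some Local)) && no_heap_to_local f)
      fields
```

In this model a `Boxed_float` has no fields to check, so marking it `Heap` cannot introduce a violation wherever it is stored, whereas a `Heap` block holding a `Local` box is rejected.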

There are a couple of places where Alloc_heap is unconditionally used, including:

  • C externals that return unboxed floats when used in a position requiring a boxed float cause an allocation, which is always Alloc_heap (even if the context would allow a local one)
  • 32-bit fallback implementations of 64-bit primitives use Alloc_heap for the boxed 64-bit values
  • Accesses to flat float arrays which box the result generate an Alloc_heap box

(All of these could be made to use local allocations by propagating alloc_mode further through the compiler; this is left for a future patch.)

@mshinwell
Collaborator

mshinwell commented Jan 28, 2022

I also think there might be something wrong, but expect to fix it today. (update: we need to discuss on Monday!)

@stedolan
Contributor Author

I've just had a look:

  • is_unboxed_number_cmm uses Alloc_heap when unboxing a constant (found via Cconst_symbol). This is correct: constants count as "heap" because there are no restrictions on where they may escape, as they are never deallocated.

  • transl_let uses Alloc_heap when unboxing a mutable float or boxed integer. This is correct but somewhat conservative: if the resulting value is used in a context where boxing is required, then the box will be Alloc_heap even if an Alloc_local is allowed. For instance:

    let f (g : local_ float -> int) =
      let x = local_ ref 0. in
      for i = 1 to 10 do x := !x +. 1. done;
      let r = g !x in
      r

    Here, x is unboxed, mutable and local, and a boxing operation is re-introduced in the call to g !x. Because of the conservative use of Alloc_heap, this box will be Alloc_heap even though Alloc_local would be OK. (It should be possible to improve this later by tracking more information about allocation mode, but this seems OK for now.)
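For readers without the local allocations extension, the same shape can be written in plain OCaml by dropping the `local_` annotations. This is only an illustrative sketch (the name `f_plain` is made up here): it shows where the float is read back out of the mutable variable and passed to `g`, which is the point at which a box would be re-introduced. It says nothing about which allocation mode a given compiler chooses.

```ocaml
(* Plain-OCaml analogue of the example above, without local_ annotations. *)
let f_plain (g : float -> int) : int =
  let x = ref 0. in
  (* The loop only reads and writes x, so x never needs to escape. *)
  for _i = 1 to 10 do x := !x +. 1. done;
  (* Passing !x to g requires a boxed float in a position where g may keep it. *)
  let r = g !x in
  r
```

For instance, `f_plain int_of_float` adds `1.` ten times starting from `0.` and then converts the result to an integer.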

stedolan and others added 5 commits February 1, 2022 10:53
Changes Lambda by introducing:

  - Allocation modes (Alloc_heap / Alloc_local) on allocating primitives
  - Region boundaries (Lregion e) delimiting lifetime of local allocs
  - Tail call markers (Apply_tail) on applications

The last of these is the trickiest: tail calls from within a region
should end the region after the arguments to the tail call have been
evaluated, but before control is transferred.

Two middle-ends (Closure and Flambda1) now support local allocs.
The trickiest part by far is correctly maintaining allocation modes
during inlining, particularly when the application being inlined is a
partial or over-application, or a tail call.

Flambda2 here has only stub support for local allocs.

The remaining changes to ocaml/ are the minimal ones required for CI.
Full support for local allocs in the ocaml/ subtree will appear later.
Co-authored-by: Xavier Clerc <[email protected]>
@mshinwell
Collaborator

This is ok to merge once CI passes.

stedolan added a commit to ocaml-flambda/ocaml-jst that referenced this pull request Feb 1, 2022
@mshinwell mshinwell merged commit 5f15f9d into oxcaml:main Feb 1, 2022
mshinwell pushed a commit that referenced this pull request Feb 1, 2022
stedolan added a commit that referenced this pull request Feb 1, 2022
86526aa flambda-backend: Middle-end support for local allocs (#491)
969b937 flambda-backend: Backend support for local allocations (#478)
2d1e6ef flambda-backend: Remove leading space from LINE. (#484)

git-subtree-dir: ocaml
git-subtree-split: 86526aa
stedolan added a commit that referenced this pull request Feb 1, 2022
173842c Merge flambda-backend changes
ed7eba2 Remove leading space from LINE. (#484)
bd61170 Bump magic numbers (#5)
c50c47d Add CI builds with local allocations enabled
1412792 Move local allocations support behind '-extension local'
6d8e42a Better tail call behaviour in caml_applyN
c7dac3d Typemod: toplevel bindings escape even if no variables are bound
82d6c3e Several fixes for partial application and currying
d05c70c Pprintast support for new local syntax
e0e62fc Typecheck x |> f y as (f y x), not ((f y) x)
d7e34ce Remove autogeneration of @ocaml.curry
b9a0593 Port #493
0a872d9 Code review fixes from #491
6c168bb Remove local allocation counting
3c6e7f0 Code review fixes from #478
bb97207 Rename Lambda.apply_position
a7cb650 Quieten Makefile when runtime dep files are not present
c656dc9 Merge flambda-backend changes
11b5424 Avoid printing double spaces in function argument lists
7751faa Restore locations to Typedtree.{pat,let}_bound_idents_full
e450b6c add build_ocaml_compiler.sexp
0403bb3 Revert PR 9895 to continue installing VERSION
b3447db Ensure new local attributes are namespaced properly
7f213fc Allow empty functions again
8f22ad8 Bugfix: ensure local domain state is initialised
80f54dd Bugfix for Selectgen with regions
e8133a1 Fix external-external signature inclusion
9840051 Bootstrap
d879f23 Merge remote-tracking branch 'jane/local-reviewed' into local-merge
94454f5 Use Local_store for the local allocations ref
54a164c Create fewer regions, according to typechecking (#59)
1c2479b Merge flambda-backend changes
ce34678 Fix printing of modes in return types
91f2281 Hook mode variable solving into Btype.snapshot/backtrack
54e4b09 Move Alloc_mode and Value_mode to Btype
ff4611e Merge flambda-backend changes
ce62e45 Ensure allocations are initialised, even dead ones
6b6ec5a Fix the alloc.ml test on 32-bit builds
81e9879 Merge flambda-backend changes
40a7f89 Update repo URL for ocaml-jst, and rename script.
0454ee7 Add some new locally-allocating primitives (#57)
8acdda1 Reset the local stack pointer in exception handlers (#56)
8dafa98 Improve typing for (||) and (&&) (#55)
8c64754 Fix make_check_all_arches (#54)
b50cd45 Allow arguments to primitives to be local even in tail position (#53)
cad125d Fix modes from or-patterns (#50)
4efdb72 Fix tailcalls tests with inlining (#52)
4a795cb Flambda support (#49)
74722cb Add [@ocaml.principal] and [@ocaml.noprincipal] attributes, and use in oo.mli
6d7d3b8 Ensure that functions are evaluated after their arguments (flambda-backend #353)
89bda6b Keep Sys.opaque_identity in Cmm and Mach (port upstream PR 9412)
a39126a Fix tailcalls within regions (#48)
4ac4cfd Fix stdlib manpages build
3a95f5e Merge flambda-backend changes
efe80c9 Add jane/pull-flambda-patches script
fca94c4 Register allocations for Omitted parameter closures (#47)
103b139 Remove various FIXMEs (#46)
62ba2c1 Bootstrap
a0062ad Allow local allocations for various primitives (#43)
7a2165e Allow primitives to be poly-moded (#43)
2af3f55 Fix a flaky test by refactoring TypePairs (ocaml/ocaml#10638)
58dd807 Bootstrap
ee3be10 Fix modes in build_apply for partial applications
fe73656 Tweak for evaluation order of labelled partial applications (#10653)
0527570 Fix caml_modify on local allocations (#40)
e657e99 Relax modes for `as` patterns (#42)
f815bf2 Add special mode handling for tuples in matches and let bindings (#38)
39f1211 Only take the upper bounds of modes associated with allocations (#37)
aec6fde Interpret arrow types in "local positions" differently
c4f3319 Bootstrap
ff6fdad Add some missing regions
40d586d Bootstrap
66d8110 Switch to a system with 3 modes for values
f2c5a85 Bugfix for Comballoc with local allocations. (#41)
83bcd09 Fix bug with root scanning during compaction (#39)
1b5ec83 Track modes in Lambda.lfunction and onwards (#33)
f1e2e97 Port ocaml/ocaml#10728
56703cd Port ocaml/ocaml#10081
eb66785 Support local allocations in i386 and fix amd64 bug (#31)
c936b19 Disallow local recursive non-functions (#30)
c7a193a GC support for local allocations (#29)
8dd7270 Nonlocal fields (#28)
e19a2f0 Bootstrap
694b9ac Add syntax to the parser for local allocations (#26)
f183008 Lower initial stack size
918226f Allow local closure allocations (#27)
2552e7d Introduce mode variables (#25)
bc41c99 Minor fixes for local allocations (#24)
a2a4e60 Runtime and compiler support for more local allocations (#23)
d030554 Typechecking for local allocations (#21)
9ee2332 Bugfix missing from #20
02c4cef Retain block-structured local regions until Mach.
86dbe1c amd64: Move stack realloc calls out-of-line
324d218 More typing modes and locking of environments
a4080b8 Initial version of local allocation (unsafe)

git-subtree-dir: ocaml
git-subtree-split: 173842c