Commit 81dd85e

Documentation for local allocations
1 parent b05519f commit 81dd85e

3 files changed

+874
-0
lines changed

jane/doc/local-intro.md

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
1+
# Introduction to Local Allocations
2+
3+
4+
Instead of allocating values normally on the GC heap, local
5+
allocations allow you to stack-allocate values using the new `local_`
6+
keyword:
7+
8+
let local_ x = { foo; bar } in
9+
...
10+
11+
or equivalently, by putting the keyword on the expression itself:
12+
13+
let x = local_ { foo; bar } in
14+
...
15+
16+
To enable this feature, you need to pass the `-extension local` flag
17+
to the compiler. Without this flag, `local_` is not recognized as a
18+
keyword, and no local allocations will be performed.
19+
20+
These values live on a separate stack, and are popped off at the end
21+
of the _region_. Generally, the region ends when the surrounding
22+
function returns; see [the reference](local-reference.md) for more
23+
details.
24+
25+
This helps performance in a couple of ways: first, the same few hot
26+
cache lines are constantly reused, so the cache footprint is lower than
27+
usual. More importantly, local allocations will never trigger a GC,
28+
and so they're safe to use in low-latency code that must currently be
29+
zero-alloc.
30+
31+
However, for this to be safe, local allocations must genuinely be
32+
local. Since the memory they occupy is reused quickly, we must ensure
33+
that no dangling references to them escape. This is checked by the
34+
typechecker, and you'll see new error messages if local values leak:
35+
36+
    # let local_ thing = { foo; bar } in
37+
      some_global := thing;;
38+
                     ^^^^^
39+
    Error: This value escapes its region
40+
41+
42+
Most of the types of allocation that OCaml does can be locally
43+
allocated: tuples, records, variants, closures, boxed numbers,
44+
etc. Local allocations are also possible from C stubs, although this
45+
requires code changes to use the new `caml_alloc_local` instead of
46+
`caml_alloc`. A few types of allocation cannot be locally allocated,
47+
though, including first-class modules, classes and objects, and
48+
exceptions. The contents of mutable fields (inside `ref`s, `array`s
49+
and mutable record fields) also cannot be locally allocated.
50+
51+
52+
## Local parameters
53+
54+
Generally, OCaml functions can do whatever they like with their
55+
arguments: use them, return them, capture them in closures or store
56+
them in globals, etc. This is a problem when trying to pass around
57+
locally-allocated values, since we need to guarantee they do not
58+
escape.
59+
60+
The remedy is that we allow the `local_` keyword to also appear on function parameters:
61+
62+
let f (local_ x) = ...
63+
64+
A local parameter is a promise by a function not to let a particular
65+
argument escape its region. In the body of `f`, you'll get a type error
66+
if `x` escapes, but when calling `f` you can freely pass local values as
67+
the argument. This promise is visible in the type of `f`:
68+
69+
val f : local_ 'a -> ...
70+
71+
The function `f` may equally be called with locally-allocated or
72+
GC-heap values: the `local_` annotation places obligations only on the
73+
definition of `f`, not its uses.
74+
75+
Even if you're not interested in performance benefits, local
76+
parameters are a useful new tool for structuring APIs. For instance,
77+
consider a function that accepts a callback, to which it passes some
78+
mutable value:
79+
80+
    let uses_callback ~f =
81+
      let tbl = Foo.Table.create () in
82+
      fill_table tbl;
83+
      let result = f tbl in
84+
      add_table_to_global_registry tbl;
85+
      result
86+
87+
Part of the contract of `uses_callback` is that it expects `f` not to
88+
capture its argument: unexpected results could ensue if `f` stored a
89+
reference to this table somewhere, and it was later used and modified
90+
after it was added to the global registry. Using `local_`
91+
annotations allows this constraint to be made explicit and checked at
92+
compile time, by giving `uses_callback` the signature:
93+
94+
val uses_callback : f:(local_ int Foo.Table.t -> 'a) -> 'a
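To make the hazard concrete, here is a minimal sketch in plain OCaml (no `local_`, so it should compile with a stock compiler). `Hashtbl` stands in for `Foo.Table`, and `registry`, `leaked`, `observed_length` and the table contents are hypothetical names invented for this illustration. It shows a callback that captures its argument and mutates the table after it has been registered, which is the behaviour a `local_` parameter is meant to reject at compile time.

```ocaml
(* Hypothetical stand-ins: a global registry and a place for the
   callback to stash the table it was given. *)
let registry : (string, int) Hashtbl.t list ref = ref []
let leaked : (string, int) Hashtbl.t option ref = ref None

(* Same shape as [uses_callback] above, but with no [local_] annotation,
   so the typechecker cannot stop [f] from capturing [tbl]. *)
let uses_callback ~f =
  let tbl = Hashtbl.create 16 in
  Hashtbl.replace tbl "a" 1;
  let result = f tbl in
  registry := tbl :: !registry;
  result

(* A badly behaved callback: it stores a reference to its argument. *)
let observed_length =
  uses_callback ~f:(fun tbl ->
    leaked := Some tbl;
    Hashtbl.length tbl)

let () =
  (* Mutating through the leaked reference silently changes the
     table that is already in the registry. *)
  (match !leaked with
   | Some tbl -> Hashtbl.replace tbl "b" 2
   | None -> ());
  assert (observed_length = 1);
  assert (Hashtbl.length (List.hd !registry) = 2)
```

With the `local_` signature above, the assignment to `leaked` inside the callback would be expected to fail to typecheck, since it lets the argument escape.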
95+
96+
97+
## Inference
98+
99+
The examples above use the `local_` keyword to mark local
100+
allocations. In fact, this is not necessary, and the compiler will
101+
use local allocations by default where possible, as long as the
102+
`-extension local` flag is enabled.
103+
104+
The only effect of the keyword on e.g. a let binding is to change the
105+
behavior for escaping values: if the bound value looks like it escapes
106+
and therefore cannot be locally allocated, then without the keyword
107+
the compiler will allocate this value on the GC heap as usual, while
108+
with the keyword it will instead report an error.
109+
110+
Inference can even determine whether parameters are local, which is
111+
useful for helper functions. It's less useful for toplevel functions,
112+
though, as whether their parameters are local is generally forced by
113+
their signature in the mli file, where no inference is performed.
114+
115+
Inference does not work across files: if you want e.g. to pass a local
116+
argument to a function in another module, you'll need to explicitly
117+
mark the local parameter in the other module's mli.
118+
119+
120+
121+
122+
## More control
123+
124+
There are a number of other features that allow more precise control
125+
over which values are locally allocated, including:
126+
127+
- **Local closures**
128+
129+
```
130+
let local_ f a b c = ...
131+
```
132+
133+
defines a function `f` whose closure is itself locally allocated.
134+
135+
- **Local-returning functions**
136+
137+
```
138+
let f a b c = local_
139+
...
140+
```
141+
142+
defines a function `f` which returns local allocations into its
143+
caller's region.
144+
145+
- **Global fields**
146+
147+
```
148+
type 'a t = { global_ g : 'a }
149+
```
150+
151+
defines a record type `t` whose `g` field is always known to be on
152+
the GC heap (and may therefore freely escape regions), even though
153+
the record itself may be locally allocated.
154+
155+
For more details, read [the reference](./local-reference.md).

jane/doc/local-pitfalls.md

Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,78 @@
1+
# Some Pitfalls of Local Allocations
2+
3+
This document outlines some common pitfalls that may come up when
4+
trying out local allocations in a new codebase, as well as some
5+
suggested workarounds. Over time, this list may grow (as experience
6+
discovers new things that go wrong) or shrink (as we deploy new
7+
compiler versions that ameliorate some issues).
8+
9+
10+
## Tail calls
11+
12+
Many OCaml functions just happen to end in a tail call, even those
13+
that are not intentionally tail-recursive. To preserve the
14+
constant-space property of tail calls, the compiler applies special
15+
rules around local allocations in tail calls (see [the
16+
reference](./local-reference.md)).
17+
18+
If this causes a problem for calls that just happen to be in tail
19+
position, the easiest workaround is to prevent them from being
20+
treated as tail calls by moving them out of tail position, replacing:
21+
22+
func arg1 arg2
23+
24+
with
25+
26+
let res = func arg1 arg2 in res
27+
28+
With this version, local values used in `func arg1 arg2` will be freed
29+
after `func` returns.
30+
31+
## Partial applications with local parameters
32+
33+
To enable the use of local allocations with higher-order functions, a
34+
necessary step is to add local annotations to function types,
35+
particularly those of higher-order functions. For instance, an `iter`
36+
function may become:
37+
38+
val iter : 'a list -> f:local_ ('a -> unit) -> unit
39+
40+
thus allowing locally-allocated closures `f` to be used.
41+
42+
However, this is unfortunately not an entirely backwards-compatible
43+
change. The problem is that partial applications of `iter` functions
44+
with the new type are themselves locally allocated, because they close
45+
over the possibly-local `f`. This means in particular that partial
46+
applications will no longer be accepted as module-level definitions:
47+
48+
let print_each_foo = iter ~f:(print_foo)
49+
50+
The fix in these cases is to expand the partial application to a full
51+
application by introducing extra arguments:
52+
53+
let print_each_foo x = iter ~f:(print_foo) x
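As a sanity check that the expansion preserves behaviour, the following plain-OCaml sketch compares the two definitions. It uses no `local_` annotations, so a stock compiler accepts both forms; the `iter` here is a hypothetical stand-in built on `List.iter`, and `total` and `add_foo` are invented for the illustration.

```ocaml
(* Hypothetical stand-in with the same argument order as the [iter] above. *)
let iter l ~f = List.iter f l

let total = ref 0
let add_foo n = total := !total + n

(* Partial application: the form that stops being accepted at module
   level once [f] is marked [local_]. *)
let add_each_partial = iter ~f:add_foo

(* Full application with an explicit extra argument: the suggested fix. *)
let add_each_full xs = iter ~f:add_foo xs

let () =
  add_each_partial [1; 2; 3];
  add_each_full [4; 5];
  (* Both definitions visit every element: 1+2+3+4+5 = 15. *)
  assert (!total = 15)
```

The only difference in the expanded form is that `iter ~f:add_foo` is applied afresh on each call rather than once when the module is initialised.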
54+
55+
## Typing of (@@) and (|>)
56+
57+
The typechecking of `(@@)` and `(|>)` changed slightly with the local
58+
allocations typechecker, in order to allow them to work with both
59+
local and nonlocal arguments. The major difference is that:
60+
61+
f x @@ y
62+
y |> f x
63+
f x y
64+
65+
are now all typechecked in exactly the same way. Previously, the
66+
first two were typechecked differently, as an application of an
67+
operator to the expressions `f x` and `y`, rather than a single
68+
application with two arguments.
69+
70+
This affects which expressions are in "argument position", which can
71+
have a subtle effect on when optional arguments are given their
72+
default values. If this affects you (which is extremely rare), you
73+
will see type errors involving optional parameters, and you can
74+
restore the old behaviour by removing the use of `(@@)` or `(|>)` and
75+
parenthesizing their subexpressions. That is, the old typing behaviour
76+
of `f x @@ y` is available as:
77+
78+
(f x) y
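For ordinary functions with no optional parameters, all of these spellings should agree, as the small plain-OCaml sketch below checks. The function `f` and the binding names are hypothetical, and the sketch deliberately does not exercise the optional-argument corner case described above.

```ocaml
(* A hypothetical two-argument function with no optional parameters. *)
let f x y = (x * 10) + y

let via_at = f 1 @@ 2        (* operator form with (@@) *)
let via_pipe = 2 |> f 1      (* operator form with (|>) *)
let direct = f 1 2           (* single two-argument application *)
let parenthesized = (f 1) 2  (* explicit application of [f 1] to [2] *)

let () =
  assert (via_at = 12 && via_pipe = 12 && direct = 12 && parenthesized = 12)
```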
