Commit ec1cbf0

Document -sSPLIT_MODULE and wasm-split (#16664)
The wasm-split integration has proved to be useful and multiple parties have shown interest in using it, so publishing the documentation will be helpful.
1 parent 3f94742 commit ec1cbf0

3 files changed

+389
-1
lines changed

site/source/docs/index.rst

+4-1
@@ -31,7 +31,9 @@ This comprehensive documentation set contains everything you need to know to use
3131

3232
- :ref:`api-reference-index` is a reference for the Emscripten toolchain.
3333
- :ref:`tools-reference` is a reference for the Emscripten integration APIs.
34-
- :ref:`Sanitizers` shows how to debug with sanitizers
34+
- :ref:`Sanitizers` shows how to debug with sanitizers.
35+
- :ref:`Module-Splitting` is a guide to splitting modules and deferring the
36+
loading of code to improve startup time.
3537

3638
The full hierarchy of articles, opened to the second level, is shown below.
3739

@@ -44,6 +46,7 @@ The full hierarchy of articles, opened to the second level, is shown below.
4446
optimizing/Optimizing-Code
4547
optimizing/Optimizing-WebGL
4648
optimizing/Profiling-Toolchain
49+
optimizing/Module-Splitting
4750
compiling/index
4851
building_from_source/index
4952
contributing/index
@@ -0,0 +1,380 @@
1+
.. _Module-Splitting:
2+
3+
================
4+
Module Splitting
5+
================
6+
7+
*wasm-split and the SPLIT_MODULE Emscripten integration are both in active
8+
development and may change and gain new features frequently. This page will be
9+
kept up-to-date with the latest changes.*
10+
11+
Large codebases often contain a lot of code that is very rarely used in practice
12+
or is never used early in the application's life cycle. Loading that unused code
13+
can noticeably delay application startup, so it would be good to defer loading
14+
that code until after the application has already started. One excellent
15+
solution for this is to use dynamic linking, but that requires refactoring an
16+
application into shared libraries and also comes with some performance overhead,
17+
so it is not always feasible. Module splitting is another approach where a
18+
module is split into separate pieces, the primary and secondary modules, after
19+
it is built normally. The primary module is loaded first and contains the code
20+
necessary to start the application, while the secondary module contains code
21+
that will be needed later or not at all. The secondary module will automatically
22+
be loaded on demand.
23+
24+
wasm-split is a Binaryen tool that performs module splitting. After running
25+
wasm-split, the primary module has all the same imports and exports as the
26+
original module and is meant to be a drop-in replacement for it. However, it
27+
also imports a placeholder function for each secondary function that was split
28+
out into the secondary module. Before the secondary module is loaded, calls of
29+
secondary functions will call the appropriate placeholder function instead. The
30+
placeholder functions are responsible for loading and instantiating the
31+
secondary module, which automatically replaces all the placeholder functions
32+
with the original secondary functions when it is instantiated. After the
33+
secondary module is loaded, the placeholder function that loaded it is also
34+
responsible for calling its corresponding newly-loaded secondary function and
35+
returning the result to its caller. The loading of the secondary module is
36+
therefore completely transparent to the primary module; it just looks like a
37+
function call took a long time to return.
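The placeholder mechanism described above can be modeled in plain JavaScript. The following is only a toy sketch under stated assumptions, not how wasm-split or Emscripten implement it: the ``table`` array stands in for the Wasm function table, and ``loadSecondary``, ``makePlaceholder``, ``callIndirect``, and the function bodies are hypothetical names invented for this illustration.

```javascript
// Toy model of placeholder patching. All names here are hypothetical.
// `table` stands in for the Wasm function table shared by both modules.
const table = [];
let loadCount = 0;

// Stand-in for instantiating the secondary module: on instantiation it
// overwrites every placeholder slot with the real secondary function.
function loadSecondary() {
  loadCount++;
  table[0] = (x) => x + 1;  // real secondary function for slot 0
  table[1] = (x) => x * 2;  // real secondary function for slot 1
}

// A placeholder loads the secondary module, then forwards the call to the
// function that now occupies its slot and returns the result.
function makePlaceholder(index) {
  return (...args) => {
    loadSecondary();
    return table[index](...args);
  };
}

table[0] = makePlaceholder(0);
table[1] = makePlaceholder(1);

// The "primary module" always calls through the table, so it cannot tell
// whether it reached a placeholder or the real function.
function callIndirect(index, ...args) {
  return table[index](...args);
}

const first = callIndirect(1, 21);   // triggers loading, then calls slot 1
const second = callIndirect(0, 41);  // slot 0 was already patched
```

In this model the secondary "module" is loaded exactly once: the first call pays the loading cost, and every later call reaches the real function directly.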
38+
39+
Currently the only workflow for splitting modules involves instrumenting the
40+
original module to collect a profile of what functions are run, running that
41+
instrumented module with a number of interesting workloads, and using the
42+
resulting profiles to determine how to split the module. wasm-split will leave
43+
any function that was run during any of the profiled workloads in the primary
44+
module and will split all other functions out into the secondary module.
45+
46+
Emscripten has a prototype integration with wasm-split enabled by the
47+
``-sSPLIT_MODULE`` option. This option will emit the original module with the
48+
wasm-split instrumentation applied so it is ready to collect profiles. It will
49+
also insert the placeholder functions responsible for loading a secondary module
50+
into the emitted JS. The developer is then responsible for running appropriate
51+
workloads, collecting the profiles, and using the wasm-split tool to perform the
52+
splitting. After the module is split, everything will work correctly with no
53+
further changes to the JS produced by the initial compilation.
54+
55+
Basic Example
56+
-------------
57+
58+
Let’s run through a basic example of using SPLIT_MODULE with Node. Later we will
59+
adapt the example to run on the Web as well.
60+
61+
Here’s our application code::
62+
63+
// application.c
64+
65+
#include <stdio.h>
66+
#include <emscripten.h>
67+
68+
void foo() {
69+
printf("foo\n");
70+
}
71+
72+
void bar() {
73+
printf("bar\n");
74+
}
75+
76+
void unsupported(int i) {
77+
printf("%d is not supported!\n", i);
78+
}
79+
80+
EM_JS(int, get_number, (), {
81+
if (typeof prompt === 'undefined') {
82+
prompt = require('prompt-sync')();
83+
}
84+
return parseInt(prompt('Give me 0 or 1: '), 10);
85+
});
86+
87+
int main() {
88+
int i = get_number();
89+
if (i == 0) {
90+
foo();
91+
} else if (i == 1) {
92+
bar();
93+
} else {
94+
unsupported(i);
95+
}
96+
}
97+
98+
This application prompts the user for some input and executes different
99+
functions depending on what the user provides. It uses the prompt-sync npm
100+
module to make the prompting behavior portable between Node and the Web. We will
101+
see that the input we provide during profiling will determine how our functions
102+
are split between the primary and secondary modules.
103+
104+
We can compile our application with ``-sSPLIT_MODULE``::
105+
106+
$ emcc application.c -o application.js -sSPLIT_MODULE
107+
108+
In addition to the typical application.wasm and application.js files, this also
109+
produces an application.wasm.orig file. application.wasm.orig is the original,
110+
unmodified module that a normal Emscripten build would produce, while
111+
application.wasm has been instrumented by wasm-split to collect profiles.
112+
113+
The instrumented module has an additional exported function,
114+
``__write_profile``, that takes as arguments a pointer and length for an
115+
in-memory buffer to which it will write the profile. ``__write_profile`` returns
116+
the length of the profile, and only writes the data if the supplied buffer is
117+
large enough. ``__write_profile`` can be called externally from JS or
118+
internally, from the application itself. For simplicity, we will just call it at
119+
the end of our main function here, but note that this will mean that any
120+
functions called after main, such as destructors for global objects, will not be
121+
included in the profile.
122+
123+
Here’s the function to write the profile and our new main function::
124+
125+
EM_JS(void, write_profile, (), {
126+
var __write_profile = Module['asm']['__write_profile'];
127+
if (__write_profile) {
128+
129+
// Get the size of the profile and allocate a buffer for it.
130+
var len = __write_profile(0, 0);
131+
var ptr = _malloc(len);
132+
133+
// Write the profile data to the buffer.
134+
__write_profile(ptr, len);
135+
136+
// Write the profile file.
137+
var profile_data = new Uint8Array(buffer, ptr, len);
138+
nodeFS.writeFileSync('profile.data', profile_data);
139+
140+
// Free the buffer.
141+
_free(ptr);
142+
}
143+
});
144+
145+
int main() {
146+
int i = get_number();
147+
if (i == 0) {
148+
foo();
149+
} else if (i == 1) {
150+
bar();
151+
} else {
152+
unsupported(i);
153+
}
154+
write_profile();
155+
}
156+
157+
Note that we only try to write the profile if the ``__write_profile`` export
158+
exists. This is important because only the instrumented, unsplit module exports
159+
``__write_profile``. The split modules will not include the profiling
160+
instrumentation or this export.
161+
162+
Our new write_profile function depends on malloc and free being available to JS,
163+
so we need to explicitly export them on the command line::
164+
165+
$ emcc application.c -o application.js -sSPLIT_MODULE -sEXPORTED_FUNCTIONS=_malloc,_free,_main
166+
167+
Now we can run our application, which produces a profile.data file. The next
168+
step is to use wasm-split and the profile to split the original module,
169+
application.wasm.orig::
170+
171+
$ wasm-split --enable-mutable-globals --export-prefix=% application.wasm.orig -o1 application.wasm -o2 application.deferred.wasm --profile=profile.data
172+
173+
Let’s break down what all those options are for.
174+
175+
``--enable-mutable-globals``
176+
This option enables the mutable-global target feature, which allows mutable
177+
Wasm globals (as opposed to C/C++ globals) to be imported and exported.
178+
wasm-split has to share mutable globals between the primary and secondary
179+
modules, so it requires this feature to be enabled.
180+
181+
``--export-prefix=%``
182+
This is a prefix added to all the new exports wasm-split creates to share
183+
module elements from the primary module to the secondary module. The prefix
184+
can be used to differentiate "true" exports from those that only exist to be
185+
consumed by the secondary module. Emscripten’s wasm-split integration expects
186+
“%” in particular to be used as the prefix.
187+
188+
``-o1 application.wasm``
189+
Write the primary module to application.wasm. Note that this will overwrite
190+
the instrumented module previously produced by Emscripten, so the application
191+
will now use the split modules rather than the instrumented module.
192+
193+
``-o2 application.deferred.wasm``
194+
Write the secondary module to application.deferred.wasm. Emscripten expects
195+
the name of the secondary module to be the same as the name of the primary
196+
module with “.wasm” replaced with “.deferred.wasm”.
197+
198+
``--profile=profile.data``
199+
Directs wasm-split to use the profile in profile.data to guide the splitting.
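
The naming convention for the secondary module noted under ``-o2`` can be written down as a small helper. This is a sketch for illustration only: ``deferredName`` is a hypothetical name and not part of Emscripten or wasm-split.

```javascript
// Hypothetical helper: derive the secondary module name that Emscripten
// expects from the primary module name.
function deferredName(primaryName) {
  if (!primaryName.endsWith('.wasm')) {
    throw new Error('expected a name ending in .wasm: ' + primaryName);
  }
  return primaryName.slice(0, -'.wasm'.length) + '.deferred.wasm';
}

const name = deferredName('application.wasm');
```

A build script could use such a helper to compute the ``-o2`` argument from the ``-o1`` argument instead of spelling out both names by hand.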
200+
201+
Running application.js in Node again, we can see that the application works just
202+
as it did before, but if we execute any code path besides the one used in the
203+
profiled workload, the application will print a console message about a
204+
placeholder function being called and the deferred module being loaded.
205+
206+
Profiling Multiple Workloads
207+
----------------------------
208+
209+
wasm-split supports merging profiles from multiple profiling workloads into a
210+
single profile to guide splitting. Any function that was run in any of the
211+
workloads will be kept in the primary module and all other functions will be
212+
split out into the secondary module.
213+
214+
This command will merge any number of profiles (here just profile1.data and
215+
profile2.data) into a single profile::
216+
217+
$ wasm-split --merge-profiles profile1.data profile2.data -o profile.data
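
The rule that merged profiles follow can be illustrated with sets. This sketch is an assumption-laden model and not the wasm-split implementation: real profiles are binary files, while here each profile is modeled as a set of function names, library functions are ignored, and ``mergeProfiles`` and ``split`` are hypothetical names.

```javascript
// Illustration only: each profile is modeled as the set of functions that
// ran during one workload.
function mergeProfiles(profiles) {
  const merged = new Set();
  for (const profile of profiles) {
    for (const fn of profile) merged.add(fn);
  }
  return merged;
}

// Functions seen in the merged profile stay in the primary module; all
// others are split out into the secondary module.
function split(allFunctions, mergedProfile) {
  const primary = [];
  const secondary = [];
  for (const fn of allFunctions) {
    (mergedProfile.has(fn) ? primary : secondary).push(fn);
  }
  return {primary, secondary};
}

const allFunctions = ['main', 'foo', 'bar', 'unsupported'];
const profile1 = new Set(['main', 'foo']);  // workload where the user entered 0
const profile2 = new Set(['main', 'bar']);  // workload where the user entered 1
const result = split(allFunctions, mergeProfiles([profile1, profile2]));
```

With these two workloads, only ``unsupported`` ends up in the secondary module, because neither profiled run called it.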
218+
219+
Multithreaded Programs
220+
----------------------
221+
222+
By default, the data gathered by the wasm-split instrumentation is stored in
223+
Wasm globals, so it is thread local. But in a multithreaded program, it is
224+
important to collect profile information from all threads. To do so, you can
225+
tell wasm-split to collect shared profile information in shared memory using the
226+
``--in-memory`` wasm-split flag. This will use memory starting at address zero
227+
to store the profile information, so you must also pass ``-sGLOBAL_BASE=N`` to
228+
Emscripten, where ``N`` is at least the number of functions in the module, to
229+
prevent the program from clobbering that memory region.
230+
231+
After splitting, multithreaded applications will currently fetch and compile the
232+
secondary module separately on each thread. The compiled secondary module is not
233+
postmessaged to each thread the way Emscripten postmessages the primary
234+
module to the threads. This is not as bad as it sounds because downloads of the
235+
secondary module from workers will be serviced from the cache if the appropriate
236+
Cache-Control headers are set, but improving this is an area for future work.
237+
238+
Running on the Web
239+
------------------
240+
241+
One complication to keep in mind when using SPLIT_MODULE for Web applications is
242+
that the secondary module cannot be loaded both lazily and asynchronously, which
243+
means it cannot be loaded lazily on the main browser thread. The reason is that
244+
the placeholder functions need to be completely transparent to the functions in
245+
the primary module, so they can’t return until they have synchronously loaded
246+
and called the correct secondary function.
247+
248+
One workaround for this limitation would be to eagerly load and instantiate the
249+
secondary module and ensure that no secondary functions can possibly be called
250+
before it has been instantiated on the main browser thread. This may be
251+
difficult to ensure, though. Another fix would be to run the Asyncify
252+
transformation on the primary module to allow placeholder functions to return to
253+
the JS event loop while waiting for the secondary module to load asynchronously.
254+
This is on the wasm-split roadmap, although we do not yet know what the size and
255+
performance overhead of this solution will be.
256+
257+
This limitation on lazy loading means that the best way to run applications with
258+
SPLIT_MODULE is in a worker thread, for example using ``-sPROXY_TO_PTHREAD``. In
259+
PROXY_TO_PTHREAD mode, it is important to collect a profile for the browser main
260+
thread in addition to the application main thread because the browser main
261+
thread runs some functions not run in the application main thread, such as the
262+
shim that wraps the proxied main function and the functions involved in handling
263+
calls proxied back to the main browser thread. See the previous section for how
264+
to collect profiles from multiple threads.
265+
266+
Another minor complication is that the profile data cannot be immediately
267+
written to a file from inside the browser. The data must instead be transmitted
268+
to developer machines some other way, such as posting it to the dev server or
269+
copying a base64 encoding of it from the console.
270+
271+
Here’s code implementing the base64 solution::
272+
273+
var profile_data = new Uint8Array(buffer, ptr, len);
274+
var binary = '';
275+
for (var i = 0; i < profile_data.length; i++) {
276+
binary += String.fromCharCode(profile_data[i]);
277+
}
278+
console.log("===BEGIN===");
279+
console.log(window.btoa(binary));
280+
console.log("===END===");
281+
282+
Then the profile file can be created by running::
283+
284+
$ echo [pasted base64] | base64 --decode > profile.data
285+
286+
or::
287+
288+
$ base64 --decode [base64 file] > profile.data
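
As an alternative to the shell commands, the decoding could be done with a small Node script. This is a minimal sketch that assumes the console output was saved as text containing the ``===BEGIN===`` and ``===END===`` markers and nothing but base64 between them; ``decodeProfile`` is a hypothetical helper name, and the sample payload is made up.

```javascript
// Extract the base64 text between the ===BEGIN=== and ===END=== markers
// and decode it to bytes.
function decodeProfile(logText) {
  const match = /===BEGIN===\s*([\s\S]*?)\s*===END===/.exec(logText);
  if (!match) {
    throw new Error('profile markers not found');
  }
  // Drop any line breaks introduced by copying from the console.
  const base64 = match[1].replace(/\s+/g, '');
  return Buffer.from(base64, 'base64');
}

// Example with a made-up three-byte payload (bytes 1, 2, 3).
const sampleLog = '===BEGIN===\nAQID\n===END===\n';
const bytes = decodeProfile(sampleLog);

// To produce the file wasm-split expects:
// require('fs').writeFileSync('profile.data', bytes);
```

Browser consoles may add source locations or timestamps to copied output, so the saved text may need cleaning before it matches the assumption above.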
289+
290+
Usage with Dynamic Linking
291+
--------------------------
292+
293+
Module splitting can be used in conjunction with dynamic linking, but making the
294+
two features work correctly together requires some developer intervention.
295+
wasm-split often needs to grow the table to make space for placeholder
296+
functions, but that means that the instrumented and split modules would have
297+
different table sizes. Normally this is not a problem, but
298+
MAIN_MODULE/SIDE_MODULE dynamic linking support currently requires the table
299+
size to be baked into the JS Emscripten emits, so the table size needs to be
300+
stable.
301+
302+
To ensure that the table size is the same between the instrumented module and
303+
the split modules, use the ``-sINITIAL_TABLE=N`` Emscripten setting, where ``N``
304+
is the desired table size. Then, when using wasm-split to perform the splitting,
305+
pass ``--initial-table=N`` to wasm-split to ensure that the split modules have
306+
the correct table size as well.
307+
308+
If the specified table size is too small, you will get an error message when the
309+
primary module is loaded after splitting. Adjust the table size you specify
310+
until it is large enough. Besides taking up extra space at runtime, there is no
311+
downside to specifying a table size that is larger than necessary.
312+
313+
Custom Loading of the Secondary Module
314+
--------------------------------------
315+
316+
The default logic for lazily loading the secondary module can be overridden by
317+
implementing the "loadSplitModule" custom hook function. The hook is called from
318+
placeholder functions and is responsible for returning the [instance, module]
319+
pair for the secondary module. The hook takes as arguments the name of the file
320+
to load (e.g. “my_program.deferred.wasm”), the imports object to instantiate the
321+
module with, and the property corresponding to the called placeholder function.
322+
Here is an example implementation that does the same thing as the default
323+
implementation with some extra logging::
324+
325+
Module["loadSplitModule"] = function(deferred, imports, prop) {
326+
console.log('Custom handler for loading split module.');
327+
console.log('Called with placeholder ', prop);
328+
329+
return instantiateSync(deferred, imports);
330+
}
331+
332+
If the module was eagerly loaded, then this hook could simply instantiate the
333+
module rather than fetching and compiling it as well. However, if the eagerly
334+
loaded module is instantiated eagerly as well, the placeholder functions will be
335+
patched out and never called in the first place, so this custom hook will never
336+
be called either.
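
The first case above, a module that was fetched and compiled eagerly but is instantiated lazily, could look roughly like the following. This is a sketch under assumptions, not Emscripten's implementation: ``makeLoadSplitModule`` and ``precompiled`` are hypothetical names, and the fallback is expected to be supplied by the caller (for example a function that calls ``instantiateSync`` as in the example above).

```javascript
// Sketch only. Builds a hook that instantiates an already compiled
// secondary module, or defers to `fallback` if none is available yet.
function makeLoadSplitModule(precompiled, fallback) {
  return function (deferred, imports, prop) {
    if (precompiled) {
      // Already fetched and compiled: only instantiation remains.
      const instance = new WebAssembly.Instance(precompiled, imports);
      return [instance, precompiled];
    }
    return fallback(deferred, imports, prop);
  };
}

// Demonstration with the smallest valid Wasm binary (magic number and
// version only), standing in for an eagerly compiled secondary module.
const emptyModule = new WebAssembly.Module(
    new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]));
const hook = makeLoadSplitModule(emptyModule, null);
const pair = hook('application.deferred.wasm', {}, 'placeholder');
```

In an Emscripten build the result would be assigned to ``Module["loadSplitModule"]``, with the real compiled secondary module in place of the empty one.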
337+
338+
When eagerly instantiating the secondary module, the imports object should be::
339+
340+
{'primary': Module['asm']}
341+
342+
Debugging
343+
---------
344+
345+
wasm-split has several options to make debugging split modules easier.
346+
347+
``-v``
348+
When splitting, print the primary and secondary functions. When merging
349+
profiles, print profiles that do not contribute to the merged profile.
350+
351+
``-g``
352+
Preserve names in both the primary and secondary modules. Without this option,
353+
wasm-split will strip the names instead.
354+
355+
``--emit-module-names``
356+
Generate and emit module names to differentiate the primary and secondary
357+
module in stack traces, even if -g is not used.
358+
359+
``--symbolmap``
360+
Emit separate map files for the primary and secondary modules, mapping
361+
function indices to function names. When combined with --emit-module-names,
362+
these maps can be used to re-symbolify stack traces. To ensure that the
363+
function names are available for wasm-split to emit into the maps,
364+
pass --profiling-funcs to Emscripten.
365+
366+
``--placeholdermap``
367+
Emit a map file mapping placeholder function indices to their corresponding
368+
secondary functions. This can be useful for figuring out what function caused
369+
the secondary module to be loaded.
370+
371+
372+
Upcoming Changes
373+
----------------
374+
375+
*A list of changes and new features that have not yet been incorporated into
376+
this documentation.*
377+
378+
Work is planned on an integration with the Asyncify instrumentation that will
379+
allow the secondary module to be asynchronously loaded on the main browser
380+
thread.
