|
| 1 | +.. _Module-Splitting: |
| 2 | + |
| 3 | +================ |
| 4 | +Module Splitting |
| 5 | +================ |
| 6 | + |
| 7 | +*wasm-split and the SPLIT_MODULE Emscripten integration are both in active |
| 8 | +development and may change and gain new features frequently. This page will be |
| 9 | +kept up-to-date with the latest changes.* |
| 10 | + |
| 11 | +Large codebases often contain a lot of code that is very rarely used in practice |
| 12 | +or is never used early in the application's life cycle. Loading that unused code |
| 13 | +can noticeably delay application startup, so it would be good to defer loading |
| 14 | +that code until after the application has already started. One excellent |
| 15 | +solution for this is to use dynamic linking, but that requires refactoring an |
| 16 | +application into shared libraries and also comes with some performance overhead, |
| 17 | +so it is not always feasible. Module splitting is another approach where a |
| 18 | +module is split into separate pieces, the primary and secondary modules, after |
| 19 | +it is built normally. The primary module is loaded first and contains the code |
| 20 | +necessary to start the application, while the secondary module contains code |
| 21 | +that will be needed later or not at all. The secondary module will automatically |
| 22 | +be loaded on demand. |
| 23 | + |
| 24 | +wasm-split is a Binaryen tool that performs module splitting. After running |
| 25 | +wasm-split, the primary module has all the same imports and exports as the |
| 26 | +original module and is meant to be a drop-in replacement for it. However, it |
| 27 | +also imports a placeholder function for each secondary function that was split |
| 28 | +out into the secondary module. Before the secondary module is loaded, calls of |
| 29 | +secondary functions will call the appropriate placeholder function instead. The |
| 30 | +placeholder functions are responsible for loading and instantiating the |
| 31 | +secondary module, which automatically replaces all the placeholder functions |
| 32 | +with the original secondary functions when it is instantiated. After the |
| 33 | +secondary module is loaded, the placeholder function that loaded it is also |
| 34 | +responsible for calling its corresponding newly-loaded secondary function and |
| 35 | +returning the result to its caller. The loading of the secondary module is |
| 36 | +therefore completely transparent to the primary module; it just looks like a |
| 37 | +function call took a long time to return. |
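The placeholder mechanism described above can be modeled in a few lines of plain JavaScript. This is only a conceptual sketch: the array ``table``, the helpers ``makePlaceholder``, ``instantiateSecondary`` and ``callIndirect``, and the toy secondary functions are all hypothetical names chosen for illustration, and a JS array stands in for the Wasm table.

```javascript
// Conceptual model only: a plain JS array stands in for the Wasm table,
// and "loading" the secondary module is a synchronous function call.
const table = [];
const log = [];

// Hypothetical secondary module: instantiating it overwrites the table
// slots that previously held placeholders.
function instantiateSecondary() {
  log.push('secondary module loaded');
  table[0] = (x) => x * 2;   // real secondary function for slot 0
  table[1] = (x) => x + 100; // real secondary function for slot 1
}

// Each placeholder loads the secondary module, then forwards the call to
// the function that replaced it and returns that result to the caller.
function makePlaceholder(index) {
  return (...args) => {
    log.push(`placeholder ${index} called`);
    instantiateSecondary();
    return table[index](...args);
  };
}

table[0] = makePlaceholder(0);
table[1] = makePlaceholder(1);

// Primary-module code calls through the table and cannot tell whether the
// slot holds a placeholder or the real function.
function callIndirect(index, ...args) {
  return table[index](...args);
}

const first = callIndirect(1, 1);   // triggers the load, then runs the real function
const second = callIndirect(0, 21); // slot already patched, so no second load
```

Only the first call pays the loading cost; every later call reaches the real function directly because instantiation replaced all placeholders at once.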
| 38 | + |
| 39 | +Currently the only workflow for splitting modules involves instrumenting the |
| 40 | +original module to collect a profile of what functions are run, running that |
| 41 | +instrumented module with a number of interesting workloads, and using the |
| 42 | +resulting profiles to determine how to split the module. wasm-split will leave |
| 43 | +any function that was run during any of the profiled workloads in the primary |
| 44 | +module and will split all other functions out into the secondary module. |
| 45 | + |
| 46 | +Emscripten has a prototype integration with wasm-split enabled by the |
| 47 | +``-sSPLIT_MODULE`` option. This option will emit the original module with the |
| 48 | +wasm-split instrumentation applied so it is ready to collect profiles. It will |
| 49 | +also insert the placeholder functions responsible for loading a secondary module |
| 50 | +into the emitted JS. The developer is then responsible for running appropriate |
| 51 | +workloads, collecting the profiles, and using the wasm-split tool to perform the |
| 52 | +splitting. After the module is split, everything will work correctly with no |
| 53 | +further changes to the JS produced by the initial compilation. |
| 54 | + |
| 55 | +Basic Example |
| 56 | +------------- |
| 57 | + |
| 58 | +Let’s run through a basic example of using SPLIT_MODULE with Node. Later we will |
| 59 | +adapt the example to run on the Web as well. |
| 60 | + |
| 61 | +Here’s our application code:: |
| 62 | + |
| 63 | +  // application.c |
| 64 | + |
| 65 | +  #include <stdio.h> |
| 66 | +  #include <emscripten.h> |
| 67 | + |
| 68 | +  void foo() { |
| 69 | +    printf("foo\n"); |
| 70 | +  } |
| 71 | + |
| 72 | +  void bar() { |
| 73 | +    printf("bar\n"); |
| 74 | +  } |
| 75 | + |
| 76 | +  void unsupported(int i) { |
| 77 | +    printf("%d is not supported!\n", i); |
| 78 | +  } |
| 79 | + |
| 80 | +  EM_JS(int, get_number, (), { |
| 81 | +    if (typeof prompt === 'undefined') { |
| 82 | +      prompt = require('prompt-sync')(); |
| 83 | +    } |
| 84 | +    return parseInt(prompt('Give me 0 or 1: ')); |
| 85 | +  }); |
| 86 | + |
| 87 | +  int main() { |
| 88 | +    int i = get_number(); |
| 89 | +    if (i == 0) { |
| 90 | +      foo(); |
| 91 | +    } else if (i == 1) { |
| 92 | +      bar(); |
| 93 | +    } else { |
| 94 | +      unsupported(i); |
| 95 | +    } |
| 96 | +  } |
| 97 | + |
| 98 | +This application prompts the user for some input and executes different |
| 99 | +functions depending on what the user provides. It uses the prompt-sync npm |
| 100 | +module to make the prompting behavior portable between Node and the Web. We will |
| 101 | +see that the input we provide during profiling will determine how our functions |
| 102 | +are split between the primary and secondary modules. |
| 103 | + |
| 104 | +We can compile our application with ``-sSPLIT_MODULE``:: |
| 105 | + |
| 106 | + $ emcc application.c -o application.js -sSPLIT_MODULE |
| 107 | + |
| 108 | +In addition to the typical application.wasm and application.js files, this also |
| 109 | +produces an application.wasm.orig file. application.wasm.orig is the original, |
| 110 | +unmodified module that a normal Emscripten build would produce, while |
| 111 | +application.wasm has been instrumented by wasm-split to collect profiles. |
| 112 | + |
| 113 | +The instrumented module has an additional exported function, |
| 114 | +``__write_profile``, that takes as arguments a pointer and length for an |
| 115 | +in-memory buffer to which it will write the profile. ``__write_profile`` returns |
| 116 | +the length of the profile, and only writes the data if the supplied buffer is |
| 117 | +large enough. ``__write_profile`` can be called externally from JS or |
| 118 | +internally, from the application itself. For simplicity, we will just call it at |
| 119 | +the end of our main function here, but note that this will mean that any |
| 120 | +functions called after main, such as destructors for global objects, will not be |
| 121 | +included in the profile. |
| 122 | + |
| 123 | +Here’s the function to write the profile and our new main function:: |
| 124 | + |
| 125 | +  EM_JS(void, write_profile, (), { |
| 126 | +    var __write_profile = Module['asm']['__write_profile']; |
| 127 | +    if (__write_profile) { |
| 128 | + |
| 129 | +      // Get the size of the profile and allocate a buffer for it. |
| 130 | +      var len = __write_profile(0, 0); |
| 131 | +      var ptr = _malloc(len); |
| 132 | + |
| 133 | +      // Write the profile data to the buffer. |
| 134 | +      __write_profile(ptr, len); |
| 135 | + |
| 136 | +      // Write the profile file. |
| 137 | +      var profile_data = new Uint8Array(buffer, ptr, len); |
| 138 | +      nodeFS.writeFileSync('profile.data', profile_data); |
| 139 | + |
| 140 | +      // Free the buffer. |
| 141 | +      _free(ptr); |
| 142 | +    } |
| 143 | +  }); |
| 144 | + |
| 145 | +  int main() { |
| 146 | +    int i = get_number(); |
| 147 | +    if (i == 0) { |
| 148 | +      foo(); |
| 149 | +    } else if (i == 1) { |
| 150 | +      bar(); |
| 151 | +    } else { |
| 152 | +      unsupported(i); |
| 153 | +    } |
| 154 | +    write_profile(); |
| 155 | +  } |
| 156 | + |
| 157 | +Note that we only try to write the profile if the ``__write_profile`` export |
| 158 | +exists. This is important because only the instrumented, unsplit module exports |
| 159 | +``__write_profile``. The split modules will not include the profiling |
| 160 | +instrumentation or this export. |
| 161 | + |
| 162 | +Our new write_profile function depends on malloc and free being available to JS, |
| 163 | +so we need to explicitly export them on the command line:: |
| 164 | + |
| 165 | + $ emcc application.c -o application.js -sSPLIT_MODULE -sEXPORTED_FUNCTIONS=_malloc,_free,_main |
| 166 | + |
| 167 | +Now we can run our application, which produces a profile.data file. The next |
| 168 | +step is to use wasm-split and the profile to split the original module, |
| 169 | +application.wasm.orig:: |
| 170 | + |
| 171 | + $ wasm-split --enable-mutable-globals --export-prefix=% application.wasm.orig -o1 application.wasm -o2 application.deferred.wasm --profile=profile.data |
| 172 | + |
| 173 | +Let’s break down what all those options are for. |
| 174 | + |
| 175 | +``--enable-mutable-globals`` |
| 176 | + This option enables the mutable-global target feature, which allows mutable |
| 177 | + Wasm globals (as opposed to C/C++ globals) to be imported and exported. |
| 178 | + wasm-split has to share mutable globals between the primary and secondary |
| 179 | + modules, so it requires this feature to be enabled. |
| 180 | + |
| 181 | +``--export-prefix=%`` |
| 182 | + This is a prefix added to all the new exports wasm-split creates to share |
| 183 | + module elements from the primary module to the secondary module. The prefix |
| 184 | + can be used to differentiate "true" exports from those that only exist to be |
| 185 | + consumed by the secondary module. Emscripten’s wasm-split integration expects |
| 186 | + “%” in particular to be used as the prefix. |
| 187 | + |
| 188 | +``-o1 application.wasm`` |
| 189 | + Write the primary module to application.wasm. Note that this will overwrite |
| 190 | + the instrumented module previously produced by Emscripten, so the application |
| 191 | + will now use the split modules rather than the instrumented module. |
| 192 | + |
| 193 | +``-o2 application.deferred.wasm`` |
| 194 | + Write the secondary module to application.deferred.wasm. Emscripten expects |
| 195 | + the name of the secondary module to be the same as the name of the primary |
| 196 | + module with “.wasm” replaced with “.deferred.wasm”. |
| 197 | + |
| 198 | +``--profile=profile.data`` |
| 199 | + Directs wasm-split to use the profile in profile.data to guide the splitting. |
| 200 | + |
| 201 | +Running application.js in node again, we can see that the application works just |
| 202 | +as it did before, but if we execute any code path besides the one used in the |
| 203 | +profiled workload, the application will print a console message about a |
| 204 | +placeholder function being called and the deferred module being loaded. |
| 205 | + |
| 206 | +Profiling Multiple Workloads |
| 207 | +---------------------------- |
| 208 | + |
| 209 | +wasm-split supports merging profiles from multiple profiling workloads into a |
| 210 | +single profile to guide splitting. Any function that was run in any of the |
| 211 | +workloads will be kept in the primary module and all other functions will be |
| 212 | +split out into the secondary module. |
| 213 | + |
| 214 | +This command will merge any number of profiles (here just profile1.data and |
| 215 | +profile2.data) into a single profile:: |
| 216 | + |
| 217 | + $ wasm-split --merge-profiles profile1.data profile2.data -o profile.data |
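The merge-then-split rule amounts to a set union followed by a partition. The sketch below illustrates that rule under a simplifying assumption: each profile is modeled as a ``Set`` of function names, whereas the real profile.data file is a binary format written by the instrumentation. ``mergeProfiles`` and ``partition`` are hypothetical helper names, not part of wasm-split.

```javascript
// Hypothetical in-memory view of a profile: the set of function names that ran.
// The real profile.data file is a binary format, not a list of names.
function mergeProfiles(profiles) {
  const merged = new Set();
  for (const profile of profiles) {
    for (const name of profile) merged.add(name);
  }
  return merged;
}

// Functions seen in the merged profile stay in the primary module;
// everything else is split out into the secondary module.
function partition(allFunctions, mergedProfile) {
  const primary = [];
  const secondary = [];
  for (const name of allFunctions) {
    (mergedProfile.has(name) ? primary : secondary).push(name);
  }
  return { primary, secondary };
}

// Two workloads from the basic example: one entered 0, the other entered 1.
const profile1 = new Set(['main', 'foo']);
const profile2 = new Set(['main', 'bar']);
const split = partition(
  ['main', 'foo', 'bar', 'unsupported'],
  mergeProfiles([profile1, profile2])
);
```

With both workloads merged, only ``unsupported`` lands in the secondary module, which matches the rule that a function run in any workload is kept in the primary module.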
| 218 | + |
| 219 | +Multithreaded Programs |
| 220 | +---------------------- |
| 221 | + |
| 222 | +By default, the data gathered by the wasm-split instrumentation is stored in |
| 223 | +Wasm globals, so it is thread-local. But in a multithreaded program, it is |
| 224 | +important to collect profile information from all threads. To do so, you can |
| 225 | +tell wasm-split to collect shared profile information in shared memory using the |
| 226 | +``--in-memory`` wasm-split flag. This will use memory starting at address zero |
| 227 | +to store the profile information, so you must also pass ``-sGLOBAL_BASE=N`` to |
| 228 | +Emscripten, where ``N`` is at least the number of functions in the module, to |
| 229 | +prevent the program from clobbering that memory region. |
| 230 | + |
| 231 | +After splitting, multithreaded applications will currently fetch and compile the |
| 232 | +secondary module separately on each thread. The compiled secondary module is not |
| 233 | +postmessaged to each thread the way Emscripten postmessages the primary |
| 234 | +module to the threads. This is not as bad as it sounds because downloads of the |
| 235 | +secondary module from workers will be serviced from the cache if the appropriate |
| 236 | +Cache-Control headers are set, but improving this is an area for future work. |
| 237 | + |
| 238 | +Running on the Web |
| 239 | +------------------ |
| 240 | + |
| 241 | +One complication to keep in mind when using SPLIT_MODULE for Web applications is |
| 242 | +that the secondary module cannot be loaded both lazily and asynchronously, which |
| 243 | +means it cannot be loaded lazily on the main browser thread. The reason is that |
| 244 | +the placeholder functions need to be completely transparent to the functions in |
| 245 | +the primary module, so they can’t return until they have synchronously loaded |
| 246 | +and called the correct secondary function. |
| 247 | + |
| 248 | +One workaround for this limitation would be to eagerly load and instantiate the |
| 249 | +secondary module and ensure that no secondary functions can possibly be called |
| 250 | +before it has been instantiated on the main browser thread. This may be |
| 251 | +difficult to ensure, though. Another fix would be to run the Asyncify |
| 252 | +transformation on the primary module to allow placeholder functions to return to |
| 253 | +the JS event loop while waiting for the secondary module to load asynchronously. |
| 254 | +This is on the wasm-split roadmap, although we do not yet know what the size and |
| 255 | +performance overhead of this solution will be. |
| 256 | + |
| 257 | +This limitation on lazy loading means that the best way to run applications with |
| 258 | +SPLIT_MODULE is in a worker thread, for example using ``-sPROXY_TO_PTHREAD``. In |
| 259 | +PROXY_TO_PTHREAD mode, it is important to collect a profile for the browser main |
| 260 | +thread in addition to the application main thread because the browser main |
| 261 | +thread runs some functions not run in the application main thread, such as the |
| 262 | +shim that wraps the proxied main function and the functions involved in handling |
| 263 | +calls proxied back to the main browser thread. See the previous section for how |
| 264 | +to collect profiles from multiple threads. |
| 265 | + |
| 266 | +Another minor complication is that the profile data cannot be immediately |
| 267 | +written to a file from inside the browser. The data must instead be transmitted |
| 268 | +to developer machines some other way, such as posting it to the dev server or |
| 269 | +copying a base64 encoding of it from the console. |
| 270 | + |
| 271 | +Here’s code implementing the base64 solution:: |
| 272 | + |
| 273 | +  var profile_data = new Uint8Array(buffer, ptr, len); |
| 274 | +  var binary = ''; |
| 275 | +  for (var i = 0; i < profile_data.length; i++) { |
| 276 | +    binary += String.fromCharCode(profile_data[i]); |
| 277 | +  } |
| 278 | +  console.log("===BEGIN==="); |
| 279 | +  console.log(window.btoa(binary)); |
| 280 | +  console.log("===END==="); |
| 281 | + |
| 282 | +Then the profile file can be created by running:: |
| 283 | + |
| 284 | + $ echo [pasted base64] | base64 --decode > profile.data |
| 285 | + |
| 286 | +or:: |
| 287 | + |
| 288 | + $ base64 --decode [base64 file] > profile.data |
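Where the ``base64`` command-line tool is not available, the same decoding step can be done with a short Node script. This is a minimal sketch; ``decodeProfile`` and ``writeProfile`` are hypothetical helper names, and the output path is whatever you later pass to ``--profile``.

```javascript
const fs = require('fs');

// Decode base64 text (as copied from the browser console) into raw profile bytes.
function decodeProfile(base64Text) {
  // Strip whitespace and line breaks that copy-pasting may introduce.
  return Buffer.from(base64Text.replace(/\s+/g, ''), 'base64');
}

// Hypothetical helper: write the decoded bytes to a file wasm-split can read.
function writeProfile(base64Text, path) {
  fs.writeFileSync(path, decodeProfile(base64Text));
}
```

For example, ``writeProfile(pastedText, 'profile.data')`` would produce the file used in the wasm-split command from the basic example.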
| 289 | + |
| 290 | +Usage with Dynamic Linking |
| 291 | +-------------------------- |
| 292 | + |
| 293 | +Module splitting can be used in conjunction with dynamic linking, but making the |
| 294 | +two features work correctly together requires some developer intervention. |
| 295 | +wasm-split often needs to grow the table to make space for placeholder |
| 296 | +functions, but that means that the instrumented and split modules would have |
| 297 | +different table sizes. Normally this is not a problem, but |
| 298 | +MAIN_MODULE/SIDE_MODULE dynamic linking support currently requires the table |
| 299 | +size to be baked into the JS Emscripten emits, so the table size needs to be |
| 300 | +stable. |
| 301 | + |
| 302 | +To ensure that the table size is the same between the instrumented module and |
| 303 | +the split modules, use the ``-sINITIAL_TABLE=N`` Emscripten setting, where ``N`` |
| 304 | +is the desired table size. Then, when using wasm-split to perform the splitting, |
| 305 | +pass ``--initial-table=N`` to wasm-split to ensure that the split modules have |
| 306 | +the correct table size as well. |
| 307 | + |
| 308 | +If the specified table size is too small, you will get an error message when the |
| 309 | +primary module is loaded after splitting. Adjust the table size you specify |
| 310 | +until it is large enough. Besides taking up extra space at runtime, there is no |
| 311 | +downside to specifying a table size that is larger than necessary. |
| 312 | + |
| 313 | +Custom Loading of the Secondary Module |
| 314 | +-------------------------------------- |
| 315 | + |
| 316 | +The default logic for lazily loading the secondary module can be overridden by |
| 317 | +implementing the "loadSplitModule" custom hook function. The hook is called from |
| 318 | +placeholder functions and is responsible for returning the [instance, module] |
| 319 | +pair for the secondary module. The hook takes as arguments the name of the file |
| 320 | +to load (e.g. “my_program.deferred.wasm”), the imports object to instantiate the |
| 321 | +module with, and the property corresponding to the called placeholder function. |
| 322 | +Here is an example implementation that does the same thing as the default |
| 323 | +implementation with some extra logging:: |
| 324 | + |
| 325 | +  Module["loadSplitModule"] = function(deferred, imports, prop) { |
| 326 | +    console.log('Custom handler for loading split module.'); |
| 327 | +    console.log('Called with placeholder ', prop); |
| 328 | + |
| 329 | +    return instantiateSync(deferred, imports); |
| 330 | +  } |
| 331 | + |
| 332 | +If the module was eagerly loaded, then this hook could simply instantiate the |
| 333 | +module rather than fetching and compiling it as well. However, if the eagerly |
| 334 | +loaded module is instantiated eagerly as well, the placeholder functions will be |
| 335 | +patched out and never called in the first place, so this custom hook will never |
| 336 | +be called either. |
| 337 | + |
| 338 | +When eagerly instantiating the secondary module, the imports object should be:: |
| 339 | + |
| 340 | + {'primary': Module['asm']} |
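Eager instantiation with that imports object might look like the sketch below, which uses only the standard ``WebAssembly.Module`` and ``WebAssembly.Instance`` constructors. The function name ``instantiateSecondaryEagerly`` and its parameters are assumptions for illustration: ``secondaryBytes`` stands for the contents of the .deferred.wasm file, and ``primaryExports`` for the primary instance's exports (``Module['asm']`` above).

```javascript
// Sketch: synchronously compile and instantiate the secondary module ahead of time.
// Returns the same [instance, module] pair shape that the loadSplitModule hook returns.
function instantiateSecondaryEagerly(secondaryBytes, primaryExports) {
  const module = new WebAssembly.Module(secondaryBytes);
  // The secondary module imports everything it needs from the 'primary' namespace.
  const instance = new WebAssembly.Instance(module, { primary: primaryExports });
  return [instance, module];
}
```

Some browsers restrict synchronous compilation of large modules on the main thread, so this form is better suited to workers or Node.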
| 341 | + |
| 342 | +Debugging |
| 343 | +--------- |
| 344 | + |
| 345 | +wasm-split has several options to make debugging split modules easier. |
| 346 | + |
| 347 | +``-v`` |
| 348 | + When splitting, print the primary and secondary functions. When merging |
| 349 | + profiles, print profiles that do not contribute to the merged profile. |
| 350 | + |
| 351 | +``-g`` |
| 352 | + Preserve names in both the primary and secondary modules. Without this option, |
| 353 | + wasm-split will strip the names instead. |
| 354 | + |
| 355 | +``--emit-module-names`` |
| 356 | + Generate and emit module names to differentiate the primary and secondary |
| 357 | + module in stack traces, even if -g is not used. |
| 358 | + |
| 359 | +``--symbolmap`` |
| 360 | + Emit separate map files for the primary and secondary modules, mapping |
| 361 | + function indices to function names. When combined with --emit-module-names, |
| 362 | + these maps can be used to re-symbolify stack traces. To ensure that the |
| 363 | + function names are available for wasm-split to emit into the maps, |
| 364 | + pass --profiling-funcs to Emscripten. |
| 365 | + |
| 366 | +``--placeholdermap`` |
| 367 | + Emit a map file mapping placeholder function indices to their corresponding |
| 368 | + secondary functions. This can be useful for figuring out what function caused |
| 369 | + the secondary module to be loaded. |
| 370 | + |
| 371 | + |
| 372 | +Upcoming Changes |
| 373 | +---------------- |
| 374 | + |
| 375 | +*A list of changes and new features that have not yet been incorporated into |
| 376 | +this documentation.* |
| 377 | + |
| 378 | +Work is planned on an integration with the Asyncify instrumentation that will |
| 379 | +allow the secondary module to be asynchronously loaded on the main browser |
| 380 | +thread. |