-
Notifications
You must be signed in to change notification settings - Fork 89
Speed up linking by changing cmxa format #607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@xclerc and I are going to review this. |
This looks good. Two comments:
Is it worth trying to upstream this patch right away? @xclerc could you proceed with your review on this one please? |
Sorry, I keep forgetting the policy: should we update |
Good point, yes I think we should do that for the formats involved. |
There was indeed an issue there! (The reason for the
This was because of an issue with the build system, now resolved since #585 has been merged. The ocaml/asmcomp directory is no longer affected by this PR.
I bumped the relevant numbers. |
ce76e02 flambda-backend: Bugfix for type_application (#746) 44f3afb flambda-backend: PR580 for main branch (#743) b851eaa flambda-backend: Backport first part of ocaml/ocaml PR10498 (#737) fafb4bd flambda-backend: Fix return mode for eta-expanded function in type_argument (#735) c31f6c3 flambda-backend: Fix treatment of functions called [@nontail] (#725) 847781e flambda-backend: Fix build_upstream post-PR703 (#712) bfcbbf8 flambda-backend: Extend Pblock value kind to handle variants (#703) b2cab95 flambda-backend: Merge ocaml-jst a6d6e0e flambda-backend: Merge ocaml-jst 88a4f63 flambda-backend: Use Pmakearray for immutable arrays (#699) eeaa44b flambda-backend: Install an ocamldoc binary (#695) 48d322b flambda-backend: Ensure that GC is not invoked from bounds check failures (#681) 4370fa1 flambda-backend: Review changes of term directory (#602) 65a4566 flambda-backend: Add code coverage using bisect_ppx (#352) 63ab65f flambda-backend: Bugfix for primitive inclusion (#662) 7e3e0c8 flambda-backend: Fix inclusion checks for primitives (#661) 96c68f9 flambda-backend: Speed up linking by changing cmxa format (#607) 1829150 flambda-backend: Bugfix for Translmod.all_idents (#659) git-subtree-dir: ocaml git-subtree-split: ce76e02
b7ec31d47 Don't check the names of unprefixed compilation units 4da7b7519 Fix compilation f012913f3 Code review 204fdff1c Fix compilation on Power e7ebe5b11 Code review b3f462cae Temporarily allow `.cinaps` as the name of a compilation unit 01a9585d1 Fix `opttoploop.ml` compilation 2864c3463 Replace `Compilenv.make_symbol` in all archs 96d2760cd Simpler symbols 67b25c1 Merge flambda-backend changes b2cab95 flambda-backend: Merge ocaml-jst a6d6e0e flambda-backend: Merge ocaml-jst 88a4f63 flambda-backend: Use Pmakearray for immutable arrays (oxcaml#699) eeaa44b flambda-backend: Install an ocamldoc binary (oxcaml#695) 48d322b flambda-backend: Ensure that GC is not invoked from bounds check failures (oxcaml#681) 4370fa1 flambda-backend: Review changes of term directory (oxcaml#602) 65a4566 flambda-backend: Add code coverage using bisect_ppx (oxcaml#352) 63ab65f flambda-backend: Bugfix for primitive inclusion (oxcaml#662) 7e3e0c8 flambda-backend: Fix inclusion checks for primitives (oxcaml#661) 96c68f9 flambda-backend: Speed up linking by changing cmxa format (oxcaml#607) 1829150 flambda-backend: Bugfix for Translmod.all_idents (oxcaml#659) git-subtree-dir: ocaml git-subtree-split: b7ec31d4701dd0d3c63b9fa241dac6b983ef84f5
This patch improves linking time, by changing the cmxa format to be smaller and avoid redundant checking of module digests.
On large executables (~600 MB), this shortens linking time by about a third, makes the ocamlopt process take ~10x less memory, and reduces the size of cmxa files by anywhere from 5x to 70x.
(The patch grew larger than I'd like. If it's now unreviewably big, let me know and I'll split it up next week)
The main change is that the format of cmxa files goes from this:
to this:
Instead of keeping a set of imported cmi/cmx files and hashes for each module in a library, the cmxa file now contains one unified list of all of the cmi/cmx files and hashes, and each module contains a bitmap indicating which of these it depends on. Since the dependencies of individual modules in a library tend to be almost identical, this saves a lot of space, as well as saving time during linking (since we only need to check them for consistency once). Similarly, the set of generic functions (caml_curry and friends) is now maintained once per library rather than separately for each module.
The other change to linking is to read cmxas, check consistency and construct the list of objects to link in a single pass, rather than in three separate passes. This removes the need to hold all of the cmxa files in memory at once, greatly reducing the maximum memory usage (and speeding things up somewhat)