-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Counterintuitive effects of module import location on compilation performance #97239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@llvm/issue-subscribers-clang-modules Author: Cameron Angus (kamrann)
I recently experienced an extreme improvement and subsequent regression in compilation performance whilst migrating a library to modules. It turned out the key change had little to do with my gradually encapsulating more into modules, but was down to something that I did by accident. Here's a reduced and much simplified repro.
```
// lots_of_std.h
#include <chrono>
// pmi_x.ixx module; #include "lots_of_std.h" export module x;
// maybe_import_x.h #ifdef import_x_in_gmf
// ip_y_a.ixx module; #include "maybe_import_x.h" export module y:a; import x;
// ip_y_b.ixx module; #include "maybe_import_x.h" export module y:b; import x;
<ip_y_c.ixx, ip_y_d.ixx, ip_y_e.ixx, ip_y_f.ixx as above, each importing the preceding partition>
// pmi_y.ixx export module y; export import :a;
|
Thanks for looking into this. The key point here is that we should try to reduce redeclarations between module units: https://clang.llvm.org/docs/StandardCPlusPlusModules.html#performance-tips. In short, it is worse to include We can view this as a suggestion to get rid of things in GMF.
So let's look into this:
In this case, the compiler can see the GMF from module X, the GMF from
Since import that in the front of including makes the number of declarations in the GMF to go down. But the compiler may still need to put the effort to calculate something. So the best practice may be: https://clang.llvm.org/docs/StandardCPlusPlusModules.html#providing-a-header-to-skip-parsing-redundant-headers
Yeah, this is implementation details and I guess reduced BMI can help here. In the end, I think the suggestion is at the first: don't (or try to reduce) introduce redeclarations in different module units, or in another word, try to reduce the duplicated include in seperate module units into a standalone module and then try to import that one instead of including the headers all the time. |
Thanks for the response and insights. So first off, perhaps using If merely duplicating an import from the purview up to the top of the GMF can be such a big win, it would be nice if the compiler could do this automatically. But I guess an import preceding
Perhaps this is just a distinction between my understanding of modules from a user's perspective, and clang's implementation details (and I have zero knowledge of how all this is implemented), but I thought the idea with the GMF was that it wasn't propagated to downstream importers. I would have assumed any deduplication of declarations was done for a TU, and then the deduplicated results went into the BMI, such that when the compiler came to compile an importing TU, say Anyway I realize there are no doubt major technical difficulties implementing this and I don't have enough knowledge to say anything more specific. I just wanted to bring it up, as it's already a lot of work to migrate to modules and if people are going to encounter significant compilation time regressions while they're part way through the process, many will likely give up and revert, which would be a pity. |
Yeah, generally agreed. What I tried is to put this in the clang's document.
The reason why we don't suggest to include headers after import is about the potential compiler bugs. It is error prone. And as far as I know, MSVC has similar problems.
Also it is true that, include headers after import may change the semantics. Here are two examples: #96147 and #96423
The compiler have to store all the things and distinguish them. Maybe it can be cheaper but it can't be skipped. From the perspective of the specification, the entities in GMFs are reachable but not visible to consumers. So they have to live in the BMI.
Agreed. Life are never easy. From the perspective of the compiler, we can't do such things automatically since it is not technical correct. But I think it might be helpful to provide a script in the user level to detect such things. I think you can try it if you want. |
Sounds good. I'm trying my best to migrate in a way that maintains ability to compile the code without modules, and various combinations in between, so the aim is to eventually be able to run a series of builds with various configurations and generate comparisons of time traces. If I get there, I may have some more useful insights/numbers to provide. One last comment. Regarding
I was actually surprised when I came across that documented advice. I initially started the migration using MSVC, but promptly gave up when faced with that problem - since I wanted to migrate bit by bit and have non-modular third party dependencies, guaranteeing include before import was just too difficult. I switched to clang and have had no problem at all in that respect. So from my experience at least, clang has come pretty far along in dealing with that particular issue and maybe it's just some very specific corner cases that remain. |
It is a QoI problem. In clang, we have two different code paths to deal with include before import and import before include and we can't merge them. So here is the root of the problem and this is the reason why we said it is not suggested. And this disconverge makes me think it is problematic. But it is good if we can make it after all. I'll close this one as it looks all problem are addressed. And I suggest to open such disucssions in LLVM Discourse (https://discourse.llvm.org/) to have a wider visiblity. Also, if you have new conclusions or codes ready to be public, I'll suggest to contribute to https://arewemodulesyet.org/ to help the ecosystem. |
I recently experienced an extreme improvement and subsequent regression in compilation performance whilst migrating a library to modules. It turned out the key change had little to do with my gradually encapsulating more into modules, but was down to something that I did by accident. Here's a reduced and much simplified repro.
So a simple module including some heavy headers in its GMF, then a second module with a linear chain of interface partitions each including the same headers in their GMF, importing the first module, and importing the preceding partition. Each partition also conditionally (via preprocessor switch) imports the first module at the top of its GMF, on top of doing so in its purview.
Compilation times from
-ftime-trace
below. Note that in the default (fat BMI) mode, each TU generates two JSON files from the time trace; one appears largely insignificant time-wise, so below numbers are from the significant one only. By 'thin BMI', I'm referring to-fexperimental-modules-reduced-bmi
.In addition to the above, compilation time for
pmi_x.ixx
was constant at around 3.8s for all cases.BMI sizes were as follows (MB):
Observations:
import x;
statement at the top of the partition GMFs produces a huge improvement in compilation performance. If the import were moved there from the purview then perhaps I might expect some improvement since if I understand right, that would be a semantic change as the declarations would not be passed on to downstream importing partitions. But since it's just being added there as an addition to the import in the purview, this is very counterintuitive to me. When I noticed I'd done this accidentally I wasn't even sure it was legal to import in the GMF. My understanding was that as all code becomes more and more modularized, the goal would be for the GMF to contain fewer and fewer things, so it was super surprising to find that my compilation times got slashed (by 3x in my real use case) by just dumping an additional, seemingly superfluous import statement into the GMF.f
should take any longer thanb
. Note that from looking at the time traces, the increased time is entirely confined withinWriteAST
; the preceding frontend times barely change. Also, though in the fat BMI case there is a small variance in partition BMI sizes in the default (no import in GMF) case, it appeared somewhat arbitrary - the size was not increasing froma
tof
.One other question is why the BMI contains much of anything at all in the above repro since nothing is exported or even declared in the module purviews. Though I'm guessing this must be known and is perhaps just a pending implementation issue.
The text was updated successfully, but these errors were encountered: