-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[clang][modules] Guard against bad -fmodule-file mappings (#132059) #133462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-modules Author: Naveen Seth Hanig (naveen-seth) ChangesWhen using -fmodule-file=<name>=<path/to/bmi> with incorrect inputs, the compiler crashes in two scenarios:
The crash is caused during serialization, when trying to resolve declaration IDs in the AST body after having imported the wrong module. Because the 2nd scenario can only be detected after reading the BMI's module name, checking for duplicate values while parsing command-line options is not enough to fix the crash. This commit fixes the issue by validating module identity after having read the AST's ControlBlock. Full diff: https://github.com/llvm/llvm-project/pull/133462.diff 7 Files Affected:
diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 3914d3930bec7..16cc946e3f3d9 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -98,6 +98,8 @@ def err_imported_module_relocated : Error<
def err_module_different_modmap : Error<
"module '%0' %select{uses|does not use}1 additional module map '%2'"
"%select{| not}1 used when the module was built">;
+def err_module_mismatch : Error<
+ "tried loading module '%0' from '%1' but found module '%2' instead">, DefaultFatal;
def err_ast_file_macro_def_undef : Error<
"macro '%0' was %select{defined|undef'd}1 in the AST file '%2' but "
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 2779b3d1cf2ea..57f2a08e09359 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -423,6 +423,9 @@ class ASTReader
/// configuration.
ConfigurationMismatch,
+ /// The AST file contains a different module than expected.
+ ModuleMismatch,
+
/// The AST file has errors.
HadErrors
};
@@ -1512,11 +1515,13 @@ class ASTReader
SourceLocation ImportLoc, ModuleFile *ImportedBy,
SmallVectorImpl<ImportedModule> &Loaded,
off_t ExpectedSize, time_t ExpectedModTime,
+ StringRef ExpectedModuleName,
ASTFileSignature ExpectedSignature,
unsigned ClientLoadCapabilities);
ASTReadResult ReadControlBlock(ModuleFile &F,
SmallVectorImpl<ImportedModule> &Loaded,
const ModuleFile *ImportedBy,
+ StringRef ExpectedModuleName,
unsigned ClientLoadCapabilities);
static ASTReadResult
ReadOptionsBlock(llvm::BitstreamCursor &Stream, StringRef Filename,
@@ -1819,6 +1824,18 @@ class ASTReader
unsigned ClientLoadCapabilities,
ModuleFile **NewLoadedModuleFile = nullptr);
+ /// \overload
+ ///
+ /// Calls the above function and checks if the AST file contains the expected
+ /// module. Returns ASTReadResult::Failure on mismatch.
+ ///
+ /// \param ExpectedModuleName The expected name of the new loaded module.
+ ASTReadResult ReadAST(StringRef FileName, ModuleKind Type,
+ SourceLocation ImportLoc,
+ unsigned ClientLoadCapabilities,
+ StringRef ExpectedModuleName,
+ ModuleFile **NewLoadedModuleFile = nullptr);
+
/// Make the entities in the given module and any of its (non-explicit)
/// submodules visible to name lookup.
///
diff --git a/clang/lib/Frontend/ASTUnit.cpp b/clang/lib/Frontend/ASTUnit.cpp
index 0a5f1cfd1a264..7500be81ea976 100644
--- a/clang/lib/Frontend/ASTUnit.cpp
+++ b/clang/lib/Frontend/ASTUnit.cpp
@@ -887,6 +887,7 @@ std::unique_ptr<ASTUnit> ASTUnit::LoadFromASTFile(
case ASTReader::OutOfDate:
case ASTReader::VersionMismatch:
case ASTReader::ConfigurationMismatch:
+ case ASTReader::ModuleMismatch:
case ASTReader::HadErrors:
AST->getDiagnostics().Report(diag::err_fe_unable_to_load_pch);
return nullptr;
diff --git a/clang/lib/Frontend/ChainedIncludesSource.cpp b/clang/lib/Frontend/ChainedIncludesSource.cpp
index a7096e27796a0..ee0363249124b 100644
--- a/clang/lib/Frontend/ChainedIncludesSource.cpp
+++ b/clang/lib/Frontend/ChainedIncludesSource.cpp
@@ -81,6 +81,7 @@ createASTReader(CompilerInstance &CI, StringRef pchFile,
case ASTReader::OutOfDate:
case ASTReader::VersionMismatch:
case ASTReader::ConfigurationMismatch:
+ case ASTReader::ModuleMismatch:
case ASTReader::HadErrors:
break;
}
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index 91093d3ccb84c..95b3ae34f36fe 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -676,6 +676,7 @@ IntrusiveRefCntPtr<ASTReader> CompilerInstance::createPCHExternalASTSource(
case ASTReader::OutOfDate:
case ASTReader::VersionMismatch:
case ASTReader::ConfigurationMismatch:
+ case ASTReader::ModuleMismatch:
case ASTReader::HadErrors:
// No suitable PCH file could be found. Return an error.
break;
@@ -1909,13 +1910,12 @@ ModuleLoadResult CompilerInstance::findOrCompileModuleAndReadAST(
: Source == MS_PrebuiltModulePath
? 0
: ASTReader::ARR_ConfigurationMismatch;
- switch (getASTReader()->ReadAST(ModuleFilename,
- Source == MS_PrebuiltModulePath
- ? serialization::MK_PrebuiltModule
- : Source == MS_ModuleBuildPragma
- ? serialization::MK_ExplicitModule
- : serialization::MK_ImplicitModule,
- ImportLoc, ARRFlags)) {
+ switch (getASTReader()->ReadAST(
+ ModuleFilename,
+ Source == MS_PrebuiltModulePath ? serialization::MK_PrebuiltModule
+ : Source == MS_ModuleBuildPragma ? serialization::MK_ExplicitModule
+ : serialization::MK_ImplicitModule,
+ ImportLoc, ARRFlags, ModuleName)) {
case ASTReader::Success: {
if (M)
return M;
@@ -1952,6 +1952,7 @@ ModuleLoadResult CompilerInstance::findOrCompileModuleAndReadAST(
// Fall through to error out.
[[fallthrough]];
case ASTReader::VersionMismatch:
+ case ASTReader::ModuleMismatch:
case ASTReader::HadErrors:
ModuleLoader::HadFatalFailure = true;
// FIXME: The ASTReader will already have complained, but can we shoehorn
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 0cd2cedb48dd9..afae2f3a0e217 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -2926,6 +2926,8 @@ static bool isDiagnosedResult(ASTReader::ASTReadResult ARR, unsigned Caps) {
case ASTReader::VersionMismatch: return !(Caps & ASTReader::ARR_VersionMismatch);
case ASTReader::ConfigurationMismatch:
return !(Caps & ASTReader::ARR_ConfigurationMismatch);
+ case ASTReader::ModuleMismatch:
+ return true;
case ASTReader::HadErrors: return true;
case ASTReader::Success: return false;
}
@@ -3020,11 +3022,10 @@ ASTReader::ASTReadResult ASTReader::ReadOptionsBlock(
}
}
-ASTReader::ASTReadResult
-ASTReader::ReadControlBlock(ModuleFile &F,
- SmallVectorImpl<ImportedModule> &Loaded,
- const ModuleFile *ImportedBy,
- unsigned ClientLoadCapabilities) {
+ASTReader::ASTReadResult ASTReader::ReadControlBlock(
+ ModuleFile &F, SmallVectorImpl<ImportedModule> &Loaded,
+ const ModuleFile *ImportedBy, StringRef ExpectedModuleName,
+ unsigned ClientLoadCapabilities) {
BitstreamCursor &Stream = F.Stream;
if (llvm::Error Err = Stream.EnterSubBlock(CONTROL_BLOCK_ID)) {
@@ -3315,7 +3316,7 @@ ASTReader::ReadControlBlock(ModuleFile &F,
// Load the AST file.
auto Result = ReadASTCore(ImportedFile, ImportedKind, ImportLoc, &F,
- Loaded, StoredSize, StoredModTime,
+ Loaded, StoredSize, StoredModTime, ImportedName,
StoredSignature, Capabilities);
// If we diagnosed a problem, produce a backtrace.
@@ -3338,7 +3339,10 @@ ASTReader::ReadControlBlock(ModuleFile &F,
case OutOfDate: return OutOfDate;
case VersionMismatch: return VersionMismatch;
case ConfigurationMismatch: return ConfigurationMismatch;
- case HadErrors: return HadErrors;
+ case ModuleMismatch:
+ return ModuleMismatch;
+ case HadErrors:
+ return HadErrors;
case Success: break;
}
break;
@@ -3363,6 +3367,14 @@ ASTReader::ReadControlBlock(ModuleFile &F,
if (Listener)
Listener->ReadModuleName(F.ModuleName);
+ // Return if the AST unexpectedly contains a different module
+ if (F.Kind == MK_PrebuiltModule && !ExpectedModuleName.empty() &&
+ F.ModuleName != ExpectedModuleName) {
+ Diag(diag::err_module_mismatch)
+ << ExpectedModuleName << F.FileName << F.ModuleName;
+ return ASTReadResult::ModuleMismatch;
+ }
+
// Validate the AST as soon as we have a name so we can exit early on
// failure.
if (ASTReadResult Result = readUnhashedControlBlockOnce())
@@ -4684,6 +4696,15 @@ ASTReader::ASTReadResult ASTReader::ReadAST(StringRef FileName, ModuleKind Type,
SourceLocation ImportLoc,
unsigned ClientLoadCapabilities,
ModuleFile **NewLoadedModuleFile) {
+ return ReadAST(FileName, Type, ImportLoc, ClientLoadCapabilities, "",
+ NewLoadedModuleFile);
+}
+
+ASTReader::ASTReadResult ASTReader::ReadAST(StringRef FileName, ModuleKind Type,
+ SourceLocation ImportLoc,
+ unsigned ClientLoadCapabilities,
+ StringRef ExpectedModuleName,
+ ModuleFile **NewLoadedModuleFile) {
llvm::TimeTraceScope scope("ReadAST", FileName);
llvm::SaveAndRestore SetCurImportLocRAII(CurrentImportLoc, ImportLoc);
@@ -4702,8 +4723,8 @@ ASTReader::ASTReadResult ASTReader::ReadAST(StringRef FileName, ModuleKind Type,
SmallVector<ImportedModule, 4> Loaded;
if (ASTReadResult ReadResult =
ReadASTCore(FileName, Type, ImportLoc,
- /*ImportedBy=*/nullptr, Loaded, 0, 0, ASTFileSignature(),
- ClientLoadCapabilities)) {
+ /*ImportedBy=*/nullptr, Loaded, 0, 0, ExpectedModuleName,
+ ASTFileSignature(), ClientLoadCapabilities)) {
ModuleMgr.removeModules(ModuleMgr.begin() + NumModules);
// If we find that any modules are unusable, the global index is going
@@ -4954,25 +4975,26 @@ static unsigned moduleKindForDiagnostic(ModuleKind Kind) {
llvm_unreachable("unknown module kind");
}
-ASTReader::ASTReadResult
-ASTReader::ReadASTCore(StringRef FileName,
- ModuleKind Type,
- SourceLocation ImportLoc,
- ModuleFile *ImportedBy,
- SmallVectorImpl<ImportedModule> &Loaded,
- off_t ExpectedSize, time_t ExpectedModTime,
- ASTFileSignature ExpectedSignature,
- unsigned ClientLoadCapabilities) {
+ASTReader::ASTReadResult ASTReader::ReadASTCore(
+ StringRef FileName, ModuleKind Type, SourceLocation ImportLoc,
+ ModuleFile *ImportedBy, SmallVectorImpl<ImportedModule> &Loaded,
+ off_t ExpectedSize, time_t ExpectedModTime, StringRef ExpectedModuleName,
+ ASTFileSignature ExpectedSignature, unsigned ClientLoadCapabilities) {
ModuleFile *M;
std::string ErrorStr;
- ModuleManager::AddModuleResult AddResult
- = ModuleMgr.addModule(FileName, Type, ImportLoc, ImportedBy,
- getGeneration(), ExpectedSize, ExpectedModTime,
- ExpectedSignature, readASTFileSignature,
- M, ErrorStr);
+ ModuleManager::AddModuleResult AddResult = ModuleMgr.addModule(
+ FileName, Type, ImportLoc, ImportedBy, getGeneration(), ExpectedSize,
+ ExpectedModTime, ExpectedSignature, readASTFileSignature, M, ErrorStr);
switch (AddResult) {
case ModuleManager::AlreadyLoaded:
+ // Return if the AST unexpectedly contains a different module
+ if (Type == MK_PrebuiltModule && !ExpectedModuleName.empty() &&
+ M->ModuleName != ExpectedModuleName) {
+ Diag(diag::err_module_mismatch)
+ << ExpectedModuleName << FileName << M->ModuleName;
+ return ASTReadResult::ModuleMismatch;
+ }
Diag(diag::remark_module_import)
<< M->ModuleName << M->FileName << (ImportedBy ? true : false)
<< (ImportedBy ? StringRef(ImportedBy->ModuleName) : StringRef());
@@ -5053,7 +5075,8 @@ ASTReader::ReadASTCore(StringRef FileName,
switch (Entry.ID) {
case CONTROL_BLOCK_ID:
HaveReadControlBlock = true;
- switch (ReadControlBlock(F, Loaded, ImportedBy, ClientLoadCapabilities)) {
+ switch (ReadControlBlock(F, Loaded, ImportedBy, ExpectedModuleName,
+ ClientLoadCapabilities)) {
case Success:
// Check that we didn't try to load a non-module AST file as a module.
//
@@ -5075,6 +5098,8 @@ ASTReader::ReadASTCore(StringRef FileName,
case OutOfDate: return OutOfDate;
case VersionMismatch: return VersionMismatch;
case ConfigurationMismatch: return ConfigurationMismatch;
+ case ModuleMismatch:
+ return ModuleMismatch;
case HadErrors: return HadErrors;
}
break;
diff --git a/clang/test/Modules/fmodule-file-bad-transitive-mapping.cpp b/clang/test/Modules/fmodule-file-bad-transitive-mapping.cpp
new file mode 100644
index 0000000000000..9bfe3c786605b
--- /dev/null
+++ b/clang/test/Modules/fmodule-file-bad-transitive-mapping.cpp
@@ -0,0 +1,52 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+
+// RUN: %clang_cc1 -std=c++20 -emit-module-interface a.cppm -o a.pcm
+// RUN: %clang_cc1 -std=c++20 -emit-module-interface b.cppm -o b.pcm \
+// RUN: -fmodule-file=A=a.pcm
+
+// This test addresses issue #132059:
+// Bad use of fmodule-file=<name>=<path/to/bmi> previously caused the compiler
+// to crash for the following cases:
+
+//--- a.cppm
+export module A;
+
+export int a() {
+ return 41;
+}
+
+//--- b.cppm
+export module B;
+import A;
+
+export int b() {
+ return a() + 1;
+}
+
+// Test that when -fmodule-file=<name>=<path/to/bmi> mistakenly maps a module to
+// a BMI which depends on the module, the compiler doesn't crash.
+
+// RUN: not %clang_cc1 -std=c++20 main1.cpp-fmodule-file=B=b.pcm \
+// RUN: -fmodule-file=A=b.pcm
+
+//--- main1.cpp
+import B;
+
+int main() {
+ return b();
+}
+
+// Test that when -fmodule-file=<name>=<path/to/bmi> mistakenly maps a module
+// to a BMI file, and that BMI exposes the parts of the specified module as
+// transitive imports, the compiler doesn't crash.
+
+// RUN: not %clang_cc1 -std=c++20 main2.cpp -fmodule-file=A=b.pcm
+
+//--- main2.cpp
+import A;
+
+int main() {
+ return a();
+}
|
804617c
to
74ace4f
Compare
Hi @ChuanqiXu9, @mpark, @shafik, For context, here is how this PR would change clang's behavior when replicating the original crash in #132059: The original crash, where the module clang++ -fmodule-file=std=std.pcm -fmodule-file=std=hello.pcm -stdlib=libc++ -std=c++23 main.ccp main.cpp:2:8: fatal error: module 'hello' not found
2 | import hello;
| ~~~~~~~^~~~~
1 error generated. When assigning the matching BMI file to clang++ -stdlib=libc++ -std=c++23 -fmodule-file=std=std.pcm -fmodule-file=std=hello.pcm -fmodule-file=hello=hello.pcm main.ccp main.ccp:2:1: fatal error: tried loading module 'std' from 'hello.pcm' but found module 'hello' instead
2 | import hello;
| ^
main.ccp:2:1: note: imported by module 'hello' in 'hello.pcm'
1 error generated. Would you be able to review the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size of the patch is slightly more than what I thought. I thought this may be a simple change. Could you describe the crash point and the crash reason (from code's perspective)
I received a mail but not see it in the web page. Is something going wrong? |
Hi, sorry! I left a comment earlier but noticed that a large part of the explanation was incorrect and deleted it. Here is the fixed comment: At first I thought that the issue can be fixed by just checking for duplicates while parsing the command-line arguments, but that doesn't seem to be the case. I will try to describe the crash point and the crash reason with this example: //--- a.cppm
export module A;
export int a() {
return 41;
} //--- b.cppm
export module B;
import A;
export int b() {
return a() + 1;
} //--- c.cppm
export module C;
export int c() {
return 42;
} //--- main.cpp
import B;
int main() {
return b();
}
After … When reading a modules control block in Currently (prior to this PR), there is no validation to ensure that the BMI file actually contains the module clang is trying to load after having read the module name in Because of the faulty The crash occurs because Clang, assuming it has the correct (Depending on the specific example, the exact location of the crash will be different, but it will always happen while trying to resolve the remappings.) While writing this comment, I noticed that my commit and PR message describe the cause of the crash incorrectly and have edited both. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. What I feel not good is passing ExpectedName
and ModuleMismatch
all around. I am wondering if we can check it by inserting a check in ASTReader::ReadControlBlock
in case IMPORT:
where we can check the ImportedName
and the name stored in loaded module.
Updated! That really does make it a lot simpler. I was a bit unsure regarding the following lines: llvm-project/clang/lib/Serialization/ASTReader.cpp Lines 3342 to 3346 in b98d66f
Is it acceptable to also return Thank you for the review and my apologies for the delay! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Your concern makes sense. But this is existing and sepearate logic, let's fix in other patches. |
// Check the AST we just read from ImportedFile contains a different | ||
// module than we expected (ImportedName). | ||
if (auto *Imported = getModuleManager().lookupByFileName(ImportedFile); | ||
Imported != nullptr && Imported->ModuleName != ImportedName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make it more controllable, let's add a check for IsImportingStdCXXModule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
Theoretically also only applies to MK_PrebuiltModule
.
Should I also add it to the check?
[clang][modules] Guard against invalid -fmodule-file mappings Fixes llvm#132059. Providing incorrect mappings via `-fmodule-file=<name>=<path/to/bmi>` can cause the compiler to crash when loading a module that imports an incorrectly mapped module. The crash occurs during AST body deserialization when the compiler attempts to resolve remappings using the `ModuleFile` from the incorrectly mapped module's BMI file. The cause is an invalid access into an incorrectly loaded `ModuleFile`. This commit fixes the issue by verifying the identity of the imported module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks.
@naveen-seth Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! |
Local branch origin/amd-gfx 1602708 Merged main:707367621679 into origin/amd-gfx:4cab0b6fa1ab Remote branch main ac42b08 [clang][modules] Guard against bad -fmodule-file mappings (llvm#132059) (llvm#133462)
Fix #132059.
Providing incorrect mappings via
-fmodule-file=<name>=<path/to/bmi>
can crash the compiler when loading a module that imports an
incorrectly mapped module.
The crash occurs during AST body deserialization, when the compiler
attempts to resolve remappings using the
ModuleFile
from theincorrectly mapped module's BMI file.
The cause is an invalid access into an incorrectly loaded
ModuleFile
.This commit fixes the issue by verifying the identity of the imported
module.