Skip to content

Commit 6134588

Browse files
committed
Auto merge of #138591 - Kobzol:git-ci, r=Mark-Simulacrum
Refactor git change detection in bootstrap While working on #138395, I finally found the courage to delve into the insides of git path change detection in bootstrap, which is used (amongst other things) to detect if we should rebuilt od download `[llvm|rustc|gcc]`. I found it a bit hard to understand, and given that this code was historically quite fragile, I thought that it would be better to rebuild it from scratch. The previous approach had a bunch of limitations: - It separated the computation of "are there local changes?" and "what upstream SHA should we use?" even though these two things are intertwined. - It used hacks to work around what happens on CI. - It had special cases for CI scattered throughout the codebase, rather than centralized in one place. - It wasn't documented enough and didn't have tests for the git behavior. The current approach should hopefully resolve all of that. I implemented a single entrypoint called `check_path_modifications` (naming bikeshed pending, half of the time I spend on this PR was thinking about names, as it's quite tricky here..) that explicitly receives a mode of operation (in CI or outside CI), and accordingly figures out that upstream SHA that we should use for downloading artifacts and it also figures out if there are any local changes. Users of this function can then use this unified output to implement `download-ci-X` and other functionality. Notably, this change detection no longer uses `git merge-base`, which makes it easier to use and doesn't require setting up remotes. I also added a bunch of integration tests that literally spawn a git repository on disk and then check that the function can deal with various situations (PR CI, auto/try CI, local builds). After I built this inner layer, I used it for downloading GCC, LLVM and rustc. The latter two (and especially rustc) were using the `last_modified_commit` function before, but in all cases but one this function was actually only used to check if there are any local changes, which was IMO confusing. The LLVM handling would deserve a bit of refactoring, but that's a larger change that can be done as a follow-up. I hope that the implementation is now clear and easy to understand, so that in combination with the tests we can have more confidence that it does what we want. I tried to include a lot of documentation in the code, so I won't be repeating the actual implementation details here, if there are any questions, I'll add the answers to the documentation too :) The new approach explicitly supports three scenarios: - Running on PR CI, where we have one upstream bors parent commit and one PR merge commit made by GitHub. - Running on try/auto CI, where we have one upstream bors parent commit and one PR merge commit made by bors. - Running locally, where we assume that we have at least one upstream bors parent commit in our git history. I removed the handling of upstreams on CI, as I think that it shouldn't be needed and I considered it to be a hack. However, it's possible that there are other use-cases that I haven't considered, so I want to ask around if people have other situations than the three use-cases described above. If there are other such use-cases, I would like to include them in the new centralized implementation and add them to the git test suite, rather than going back to the old ways :) In particular, the code before relied on `git merge-base`, but I don't see why we can't just lookup the most recent bors commit and assume that is a merge commit that is also upstream? I might be running into Chesterton's Fence here :) CC `@pietroalbini` To make sure that this won't break downstream users of Rust's CI. Best reviewed commit by commit. Companion PRs: - For testing beta: #138597 r? `@onur-ozkan` Fixes: #101907 try-job: x86_64-gnu-aux try-job: aarch64-gnu
2 parents b8005bf + fbca453 commit 6134588

File tree

22 files changed

+877
-345
lines changed

22 files changed

+877
-345
lines changed

Diff for: .github/workflows/ci.yml

-3
Original file line numberDiff line numberDiff line change
@@ -125,9 +125,6 @@ jobs:
125125
# which then uses log commands to actually set them.
126126
EXTRA_VARIABLES: ${{ toJson(matrix.env) }}
127127

128-
- name: setup upstream remote
129-
run: src/ci/scripts/setup-upstream-remote.sh
130-
131128
- name: ensure the channel matches the target branch
132129
run: src/ci/scripts/verify-channel.sh
133130

Diff for: src/bootstrap/Cargo.lock

+84-9
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,7 @@ dependencies = [
5656
"sha2",
5757
"sysinfo",
5858
"tar",
59+
"tempfile",
5960
"termcolor",
6061
"toml",
6162
"tracing",
@@ -225,22 +226,28 @@ dependencies = [
225226

226227
[[package]]
227228
name = "errno"
228-
version = "0.3.9"
229+
version = "0.3.10"
229230
source = "registry+https://github.com/rust-lang/crates.io-index"
230-
checksum = "534c5cf6194dfab3db3242765c03bbe257cf92f22b38f6bc0c58d59108a820ba"
231+
checksum = "33d852cb9b869c2a9b3df2f71a3074817f01e1844f839a144f5fcef059a4eb5d"
231232
dependencies = [
232233
"libc",
233-
"windows-sys 0.52.0",
234+
"windows-sys 0.59.0",
234235
]
235236

237+
[[package]]
238+
name = "fastrand"
239+
version = "2.3.0"
240+
source = "registry+https://github.com/rust-lang/crates.io-index"
241+
checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
242+
236243
[[package]]
237244
name = "fd-lock"
238245
version = "4.0.2"
239246
source = "registry+https://github.com/rust-lang/crates.io-index"
240247
checksum = "7e5768da2206272c81ef0b5e951a41862938a6070da63bcea197899942d3b947"
241248
dependencies = [
242249
"cfg-if",
243-
"rustix",
250+
"rustix 0.38.40",
244251
"windows-sys 0.52.0",
245252
]
246253

@@ -266,6 +273,18 @@ dependencies = [
266273
"version_check",
267274
]
268275

276+
[[package]]
277+
name = "getrandom"
278+
version = "0.3.2"
279+
source = "registry+https://github.com/rust-lang/crates.io-index"
280+
checksum = "73fea8450eea4bac3940448fb7ae50d91f034f941199fcd9d909a5a07aa455f0"
281+
dependencies = [
282+
"cfg-if",
283+
"libc",
284+
"r-efi",
285+
"wasi",
286+
]
287+
269288
[[package]]
270289
name = "globset"
271290
version = "0.4.15"
@@ -334,9 +353,9 @@ checksum = "bbd2bcb4c963f2ddae06a2efc7e9f3591312473c50c6685e1f298068316e66fe"
334353

335354
[[package]]
336355
name = "libc"
337-
version = "0.2.167"
356+
version = "0.2.171"
338357
source = "registry+https://github.com/rust-lang/crates.io-index"
339-
checksum = "09d6582e104315a817dff97f75133544b2e094ee22447d2acf4a74e189ba06fc"
358+
checksum = "c19937216e9d3aa9956d9bb8dfc0b0c8beb6058fc4f7a4dc4d850edf86a237d6"
340359

341360
[[package]]
342361
name = "libredox"
@@ -355,6 +374,12 @@ version = "0.4.14"
355374
source = "registry+https://github.com/rust-lang/crates.io-index"
356375
checksum = "78b3ae25bc7c8c38cec158d1f2757ee79e9b3740fbc7ccf0e59e4b08d793fa89"
357376

377+
[[package]]
378+
name = "linux-raw-sys"
379+
version = "0.9.3"
380+
source = "registry+https://github.com/rust-lang/crates.io-index"
381+
checksum = "fe7db12097d22ec582439daf8618b8fdd1a7bef6270e9af3b1ebcd30893cf413"
382+
358383
[[package]]
359384
name = "log"
360385
version = "0.4.22"
@@ -486,6 +511,12 @@ dependencies = [
486511
"proc-macro2",
487512
]
488513

514+
[[package]]
515+
name = "r-efi"
516+
version = "5.2.0"
517+
source = "registry+https://github.com/rust-lang/crates.io-index"
518+
checksum = "74765f6d916ee2faa39bc8e68e4f3ed8949b48cccdac59983d287a7cb71ce9c5"
519+
489520
[[package]]
490521
name = "redox_syscall"
491522
version = "0.5.7"
@@ -548,10 +579,23 @@ dependencies = [
548579
"bitflags",
549580
"errno",
550581
"libc",
551-
"linux-raw-sys",
582+
"linux-raw-sys 0.4.14",
552583
"windows-sys 0.52.0",
553584
]
554585

586+
[[package]]
587+
name = "rustix"
588+
version = "1.0.2"
589+
source = "registry+https://github.com/rust-lang/crates.io-index"
590+
checksum = "f7178faa4b75a30e269c71e61c353ce2748cf3d76f0c44c393f4e60abf49b825"
591+
dependencies = [
592+
"bitflags",
593+
"errno",
594+
"libc",
595+
"linux-raw-sys 0.9.3",
596+
"windows-sys 0.59.0",
597+
]
598+
555599
[[package]]
556600
name = "ryu"
557601
version = "1.0.18"
@@ -678,6 +722,19 @@ dependencies = [
678722
"xattr",
679723
]
680724

725+
[[package]]
726+
name = "tempfile"
727+
version = "3.19.0"
728+
source = "registry+https://github.com/rust-lang/crates.io-index"
729+
checksum = "488960f40a3fd53d72c2a29a58722561dee8afdd175bd88e3db4677d7b2ba600"
730+
dependencies = [
731+
"fastrand",
732+
"getrandom",
733+
"once_cell",
734+
"rustix 1.0.2",
735+
"windows-sys 0.59.0",
736+
]
737+
681738
[[package]]
682739
name = "termcolor"
683740
version = "1.4.1"
@@ -824,6 +881,15 @@ dependencies = [
824881
"winapi-util",
825882
]
826883

884+
[[package]]
885+
name = "wasi"
886+
version = "0.14.2+wasi-0.2.4"
887+
source = "registry+https://github.com/rust-lang/crates.io-index"
888+
checksum = "9683f9a5a998d873c0d21fcbe3c083009670149a8fab228644b8bd36b2c48cb3"
889+
dependencies = [
890+
"wit-bindgen-rt",
891+
]
892+
827893
[[package]]
828894
name = "winapi"
829895
version = "0.3.9"
@@ -990,15 +1056,24 @@ version = "0.52.6"
9901056
source = "registry+https://github.com/rust-lang/crates.io-index"
9911057
checksum = "589f6da84c646204747d1270a2a5661ea66ed1cced2631d546fdfb155959f9ec"
9921058

1059+
[[package]]
1060+
name = "wit-bindgen-rt"
1061+
version = "0.39.0"
1062+
source = "registry+https://github.com/rust-lang/crates.io-index"
1063+
checksum = "6f42320e61fe2cfd34354ecb597f86f413484a798ba44a8ca1165c58d42da6c1"
1064+
dependencies = [
1065+
"bitflags",
1066+
]
1067+
9931068
[[package]]
9941069
name = "xattr"
9951070
version = "1.3.1"
9961071
source = "registry+https://github.com/rust-lang/crates.io-index"
9971072
checksum = "8da84f1a25939b27f6820d92aed108f83ff920fdf11a7b19366c27c4cda81d4f"
9981073
dependencies = [
9991074
"libc",
1000-
"linux-raw-sys",
1001-
"rustix",
1075+
"linux-raw-sys 0.4.14",
1076+
"rustix 0.38.40",
10021077
]
10031078

10041079
[[package]]

Diff for: src/bootstrap/Cargo.toml

+1
Original file line numberDiff line numberDiff line change
@@ -83,6 +83,7 @@ features = [
8383

8484
[dev-dependencies]
8585
pretty_assertions = "1.4"
86+
tempfile = "3.15.0"
8687

8788
# We care a lot about bootstrap's compile times, so don't include debuginfo for
8889
# dependencies, only bootstrap itself.

Diff for: src/bootstrap/src/core/build_steps/compile.rs

+1-5
Original file line numberDiff line numberDiff line change
@@ -111,14 +111,10 @@ impl Step for Std {
111111
// the `rust.download-rustc=true` option.
112112
let force_recompile = builder.rust_info().is_managed_git_subrepository()
113113
&& builder.download_rustc()
114-
&& builder.config.last_modified_commit(&["library"], "download-rustc", true).is_none();
114+
&& builder.config.has_changes_from_upstream(&["library"]);
115115

116116
trace!("is managed git repo: {}", builder.rust_info().is_managed_git_subrepository());
117117
trace!("download_rustc: {}", builder.download_rustc());
118-
trace!(
119-
"last modified commit: {:?}",
120-
builder.config.last_modified_commit(&["library"], "download-rustc", true)
121-
);
122118
trace!(force_recompile);
123119

124120
run.builder.ensure(Std {

Diff for: src/bootstrap/src/core/build_steps/gcc.rs

+43-34
Original file line numberDiff line numberDiff line change
@@ -96,6 +96,8 @@ pub enum GccBuildStatus {
9696
/// Returns a path to the libgccjit.so file.
9797
#[cfg(not(test))]
9898
fn try_download_gcc(builder: &Builder<'_>, target: TargetSelection) -> Option<PathBuf> {
99+
use build_helper::git::PathFreshness;
100+
99101
// Try to download GCC from CI if configured and available
100102
if !matches!(builder.config.gcc_ci_mode, crate::core::config::GccCiMode::DownloadFromCi) {
101103
return None;
@@ -104,18 +106,40 @@ fn try_download_gcc(builder: &Builder<'_>, target: TargetSelection) -> Option<Pa
104106
eprintln!("GCC CI download is only available for the `x86_64-unknown-linux-gnu` target");
105107
return None;
106108
}
107-
let sha =
108-
detect_gcc_sha(&builder.config, builder.config.rust_info.is_managed_git_subrepository());
109-
let root = ci_gcc_root(&builder.config);
110-
let gcc_stamp = BuildStamp::new(&root).with_prefix("gcc").add_stamp(&sha);
111-
if !gcc_stamp.is_up_to_date() && !builder.config.dry_run() {
112-
builder.config.download_ci_gcc(&sha, &root);
113-
t!(gcc_stamp.write());
109+
let source = detect_gcc_freshness(
110+
&builder.config,
111+
builder.config.rust_info.is_managed_git_subrepository(),
112+
);
113+
builder.verbose(|| {
114+
eprintln!("GCC freshness: {source:?}");
115+
});
116+
match source {
117+
PathFreshness::LastModifiedUpstream { upstream } => {
118+
// Download from upstream CI
119+
let root = ci_gcc_root(&builder.config);
120+
let gcc_stamp = BuildStamp::new(&root).with_prefix("gcc").add_stamp(&upstream);
121+
if !gcc_stamp.is_up_to_date() && !builder.config.dry_run() {
122+
builder.config.download_ci_gcc(&upstream, &root);
123+
t!(gcc_stamp.write());
124+
}
125+
126+
let libgccjit = root.join("lib").join("libgccjit.so");
127+
create_lib_alias(builder, &libgccjit);
128+
Some(libgccjit)
129+
}
130+
PathFreshness::HasLocalModifications { .. } => {
131+
// We have local modifications, rebuild GCC.
132+
eprintln!("Found local GCC modifications, GCC will *not* be downloaded");
133+
None
134+
}
135+
PathFreshness::MissingUpstream => {
136+
eprintln!("error: could not find commit hash for downloading GCC");
137+
eprintln!("HELP: maybe your repository history is too shallow?");
138+
eprintln!("HELP: consider disabling `download-ci-gcc`");
139+
eprintln!("HELP: or fetch enough history to include one upstream commit");
140+
None
141+
}
114142
}
115-
116-
let libgccjit = root.join("lib").join("libgccjit.so");
117-
create_lib_alias(builder, &libgccjit);
118-
Some(libgccjit)
119143
}
120144

121145
#[cfg(test)]
@@ -264,31 +288,16 @@ fn ci_gcc_root(config: &crate::Config) -> PathBuf {
264288
config.out.join(config.build).join("ci-gcc")
265289
}
266290

267-
/// This retrieves the GCC sha we *want* to use, according to git history.
291+
/// Detect whether GCC sources have been modified locally or not.
268292
#[cfg(not(test))]
269-
fn detect_gcc_sha(config: &crate::Config, is_git: bool) -> String {
270-
use build_helper::git::get_closest_merge_commit;
271-
272-
let gcc_sha = if is_git {
273-
get_closest_merge_commit(
274-
Some(&config.src),
275-
&config.git_config(),
276-
&["src/gcc", "src/bootstrap/download-ci-gcc-stamp"],
277-
)
278-
.unwrap()
293+
fn detect_gcc_freshness(config: &crate::Config, is_git: bool) -> build_helper::git::PathFreshness {
294+
use build_helper::git::PathFreshness;
295+
296+
if is_git {
297+
config.check_path_modifications(&["src/gcc", "src/bootstrap/download-ci-gcc-stamp"])
279298
} else if let Some(info) = crate::utils::channel::read_commit_info_file(&config.src) {
280-
info.sha.trim().to_owned()
299+
PathFreshness::LastModifiedUpstream { upstream: info.sha.trim().to_owned() }
281300
} else {
282-
"".to_owned()
283-
};
284-
285-
if gcc_sha.is_empty() {
286-
eprintln!("error: could not find commit hash for downloading GCC");
287-
eprintln!("HELP: maybe your repository history is too shallow?");
288-
eprintln!("HELP: consider disabling `download-ci-gcc`");
289-
eprintln!("HELP: or fetch enough history to include one upstream commit");
290-
panic!();
301+
PathFreshness::MissingUpstream
291302
}
292-
293-
gcc_sha
294303
}

Diff for: src/bootstrap/src/core/build_steps/llvm.rs

+7-18
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ use std::path::{Path, PathBuf};
1414
use std::sync::OnceLock;
1515
use std::{env, fs};
1616

17-
use build_helper::git::get_closest_merge_commit;
17+
use build_helper::git::PathFreshness;
1818
#[cfg(feature = "tracing")]
1919
use tracing::instrument;
2020

@@ -181,26 +181,15 @@ pub const LLVM_INVALIDATION_PATHS: &[&str] = &[
181181
"src/version",
182182
];
183183

184-
/// This retrieves the LLVM sha we *want* to use, according to git history.
185-
pub(crate) fn detect_llvm_sha(config: &Config, is_git: bool) -> String {
186-
let llvm_sha = if is_git {
187-
get_closest_merge_commit(Some(&config.src), &config.git_config(), LLVM_INVALIDATION_PATHS)
188-
.unwrap()
184+
/// Detect whether LLVM sources have been modified locally or not.
185+
pub(crate) fn detect_llvm_freshness(config: &Config, is_git: bool) -> PathFreshness {
186+
if is_git {
187+
config.check_path_modifications(LLVM_INVALIDATION_PATHS)
189188
} else if let Some(info) = crate::utils::channel::read_commit_info_file(&config.src) {
190-
info.sha.trim().to_owned()
189+
PathFreshness::LastModifiedUpstream { upstream: info.sha.trim().to_owned() }
191190
} else {
192-
"".to_owned()
193-
};
194-
195-
if llvm_sha.is_empty() {
196-
eprintln!("error: could not find commit hash for downloading LLVM");
197-
eprintln!("HELP: maybe your repository history is too shallow?");
198-
eprintln!("HELP: consider disabling `download-ci-llvm`");
199-
eprintln!("HELP: or fetch enough history to include one upstream commit");
200-
panic!();
191+
PathFreshness::MissingUpstream
201192
}
202-
203-
llvm_sha
204193
}
205194

206195
/// Returns whether the CI-found LLVM is currently usable.

Diff for: src/bootstrap/src/core/build_steps/tool.rs

+1-2
Original file line numberDiff line numberDiff line change
@@ -696,8 +696,7 @@ impl Step for Rustdoc {
696696
let files_to_track = &["src/librustdoc", "src/tools/rustdoc"];
697697

698698
// Check if unchanged
699-
if builder.config.last_modified_commit(files_to_track, "download-rustc", true).is_some()
700-
{
699+
if !builder.config.has_changes_from_upstream(files_to_track) {
701700
let precompiled_rustdoc = builder
702701
.config
703702
.ci_rustc_dir()

0 commit comments

Comments
 (0)