Refactor git change detection in bootstrap #138591

Open: Kobzol wants to merge 16 commits into master

Kobzol (Contributor) commented Mar 17, 2025

While working on #138395, I finally found the courage to delve into the insides of git path change detection in bootstrap, which is used (amongst other things) to detect whether we should rebuild or download [llvm|rustc|gcc]. I found it a bit hard to understand, and given that this code was historically quite fragile, I thought that it would be better to rebuild it from scratch.

The previous approach had a bunch of limitations:

  • It separated the computation of "are there local changes?" and "what upstream SHA should we use?" even though these two things are intertwined.
  • It used hacks to work around what happens on CI.
  • It had special cases for CI scattered throughout the codebase, rather than centralized in one place.
  • It wasn't documented enough and didn't have tests for the git behavior.

The current approach should hopefully resolve all of that. I implemented a single entrypoint called check_path_modifications (naming bikeshed pending; half of the time I spent on this PR went into thinking about names, as it's quite tricky here) that explicitly receives a mode of operation (in CI or outside CI). Based on that mode, it figures out the upstream SHA that we should use for downloading artifacts, and also whether there are any local changes. Users of this function can then use this unified output to implement download-ci-X and other functionality. Notably, this change detection no longer uses git merge-base, which makes it easier to use and doesn't require setting up remotes.
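To make the idea of a unified output concrete, here is a minimal sketch of what such a result type and one consumer could look like. The names `PathFreshness` and `artifact_sha_to_download` are illustrative assumptions and are not taken from this description; the actual types in the PR may differ.

```rust
/// Hypothetical unified result of a path modification check.
/// It carries both answers at once: which upstream SHA applies,
/// and whether there are local changes on top of it.
#[derive(Debug, Clone, PartialEq, Eq)]
enum PathFreshness {
    /// The checked paths were last modified in the upstream commit `upstream`.
    LastModifiedUpstream { upstream: String },
    /// The checked paths have local changes on top of `upstream`.
    HasLocalModifications { upstream: String },
    /// No upstream commit could be found in the git history.
    MissingUpstream,
}

/// Hypothetical consumer in the style of a download-ci-X "if-unchanged" check:
/// prebuilt artifacts are only usable when there are no local modifications.
fn artifact_sha_to_download(freshness: &PathFreshness) -> Option<&str> {
    match freshness {
        PathFreshness::LastModifiedUpstream { upstream } => Some(upstream.as_str()),
        PathFreshness::HasLocalModifications { .. } | PathFreshness::MissingUpstream => None,
    }
}
```

Because both pieces of information travel in one value, a caller cannot ask "are there local changes?" without also learning which upstream commit the answer is relative to.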

I also added a bunch of integration tests that literally spawn a git repository on disk and then check that the function can deal with various situations (PR CI, auto/try CI, local builds).

After I built this inner layer, I used it for downloading GCC, LLVM and rustc. The latter two (and especially rustc) were using the last_modified_commit function before, but in all cases but one this function was actually only used to check if there are any local changes, which was IMO confusing. The LLVM handling would deserve a bit of refactoring, but that's a larger change that can be done as a follow-up.

I hope that the implementation is now clear and easy to understand, so that in combination with the tests we can have more confidence that it does what we want. I tried to include a lot of documentation in the code, so I won't be repeating the actual implementation details here. If there are any questions, I'll add the answers to the documentation too :)

The new approach explicitly supports three scenarios:

  • Running on PR CI, where we have one upstream bors parent commit and one PR merge commit made by GitHub.
  • Running on try/auto CI, where we have one upstream bors parent commit and one PR merge commit made by bors.
  • Running locally, where we assume that we have at least one upstream bors parent commit in our git history.
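The three scenarios above can be modelled without touching git at all. The sketch below assumes (this is not stated in the description) that on CI the lookup skips the HEAD merge commit, since on try/auto CI that merge commit is itself made by bors, while locally the search starts at HEAD. `GitMode`, `Commit` and `closest_upstream_sha` are hypothetical names.

```rust
/// Hypothetical mode of operation, passed in explicitly by the caller.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GitMode {
    Ci,
    Local,
}

/// Simplified stand-in for a git commit.
#[derive(Debug, Clone)]
struct Commit {
    sha: &'static str,
    author: &'static str,
}

/// `first_parent_history` is ordered newest first, starting at HEAD and
/// following first parents only.
fn closest_upstream_sha(
    first_parent_history: &[Commit],
    mode: GitMode,
    bors_author: &str,
) -> Option<&'static str> {
    // Assumption: on CI, HEAD is a merge commit (made by GitHub or bors),
    // so the upstream commit is searched for among its ancestors.
    let skip = match mode {
        GitMode::Ci => 1,
        GitMode::Local => 0,
    };
    first_parent_history
        .iter()
        .skip(skip)
        .find(|commit| commit.author == bors_author)
        .map(|commit| commit.sha)
}
```

With this model, a PR CI history (GitHub merge commit on top of a bors commit), a try/auto CI history (bors merge commit on top of a bors commit) and a local history (local commits on top of a bors commit) all resolve to the same upstream commit.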

I removed the handling of upstreams on CI, as I think that it shouldn't be needed and I considered it to be a hack. However, it's possible that there are other use-cases that I haven't considered, so I want to ask around if people have other situations than the three use-cases described above. If there are other such use-cases, I would like to include them in the new centralized implementation and add them to the git test suite, rather than going back to the old ways :)

In particular, the previous code relied on git merge-base, but I don't see why we can't just look up the most recent bors commit and assume that it is a merge commit that is also upstream? I might be running into Chesterton's Fence here :)
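The difference between the two lookups shows up in the git invocations themselves. The sketch below only builds the commands and does not run them. The exact arguments are assumptions for illustration, as are the function names, the `upstream/master` ref and the `bors@example.com` author pattern.

```rust
use std::process::Command;

/// Old-style lookup (assumed shape): needs a ref that points at the
/// upstream branch, which in turn requires a configured remote.
fn merge_base_command(upstream_ref: &str) -> Command {
    let mut cmd = Command::new("git");
    cmd.args(["merge-base", upstream_ref, "HEAD"]);
    cmd
}

/// New-style lookup (assumed shape): find the most recent commit in the
/// history of HEAD whose author matches bors. No remote is involved.
fn latest_bors_commit_command(bors_author: &str) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg("rev-list")
        .arg(format!("--author={}", bors_author))
        .args(["--max-count=1", "HEAD"]);
    cmd
}
```

The second command only depends on the local history, which is why it keeps working in checkouts that have no upstream remote set up.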

CC @pietroalbini To make sure that this won't break downstream users of Rust's CI.

Best reviewed commit by commit.

Companion PRs:

  • For testing beta: #138597

r? @onur-ozkan

Fixes: #101907

try-job: x86_64-gnu-aux
try-job: aarch64-gnu
try-job: dist-x86_64-apple

rustbot added the A-compiletest, A-testsuite, S-waiting-on-review, T-bootstrap, and T-infra labels Mar 17, 2025

rustbot (Collaborator) commented Mar 17, 2025

This PR changes how GCC is built. Consider updating src/bootstrap/download-ci-gcc-stamp.

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

This PR changes how LLVM is built. Consider updating src/bootstrap/download-ci-llvm-stamp.

Some changes occurred in src/tools/compiletest

cc @jieyouxu

This PR modifies src/bootstrap/src/core/config.

If appropriate, please update CONFIG_CHANGE_HISTORY in src/bootstrap/src/utils/change_tracker.rs.

rustbot added the T-release label Mar 17, 2025

pietroalbini (Member) commented

LGTM on the Ferrocene side. There is nothing here that would break our downstream usage.

On the Rust side, I recommend opening this PR against stable and beta too, and running a full bors try on it. We had issues in past releases where changes to this code would unexpectedly break stable or beta CI, and I'd love for those to be caught before merging.

Kobzol (Contributor, Author) commented Mar 17, 2025

Yes, I planned to do that, it's a good idea. Actually, I can try that right away.

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 17, 2025
[do not merge] beta test for git change detection (rust-lang#138591)

Opening to test CI/bootstrap changes.

r? `@ghost`

try-job: x86_64-gnu-stable
try-job: x86_64-gnu
try-job: x86_64-gnu-llvm-19-1
try-job: dist-x86_64-linux
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 17, 2025
[do not merge] beta test for git change detection (rust-lang#138591)

Opening to test CI/bootstrap changes from rust-lang#138591.

r? `@ghost`

try-job: x86_64-gnu-stable
try-job: x86_64-gnu
try-job: x86_64-gnu-llvm-19-1
try-job: dist-x86_64-linux
jieyouxu self-assigned this Mar 17, 2025

onur-ozkan (Member) commented

The changes look good, but I am not sure if they will break the if-unchanged tests and logic in the following cases:

  • PR that is supposed to use ci-rustc and ci-llvm
  • PR that is not supposed to use ci-rustc and ci-llvm
  • Testing the above cases on both stable and beta PRs

I think it's safer to make sure these won't be a problem before merging this.

jieyouxu removed their assignment Mar 18, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 18, 2025
[do not merge] beta test for git change detection (rust-lang#138591)

Opening to test CI/bootstrap changes from rust-lang#138591.

r? `@ghost`

try-job: x86_64-gnu-aux
Kobzol (Contributor, Author) commented Mar 18, 2025

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 18, 2025
Refactor git change detection in bootstrap

While working on rust-lang#138395, I finally found the courage to delve into the insides of git path change detection in bootstrap, which is used (amongst other things) to detect whether we should rebuild or download `[llvm|rustc|gcc]`. I found it a bit hard to understand, and given that this code was historically quite fragile, I thought that it would be better to rebuild it from scratch.

The previous approach had a bunch of limitations:
- It separated the computation of "are there local changes?" and "what upstream SHA should we use?" even though these two things are intertwined.
- It used hacks to work around what happens on CI.
- It had special cases for CI scattered throughout the codebase, rather than centralized in one place.
- It wasn't documented enough and didn't have tests for the git behavior.

The current approach should hopefully resolve all of that. I implemented a single entrypoint called `check_path_modifications` (naming bikeshed pending; half of the time I spent on this PR went into thinking about names, as it's quite tricky here) that explicitly receives a mode of operation (in CI or outside CI). Based on that mode, it figures out the upstream SHA that we should use for downloading artifacts, and also whether there are any local changes. Users of this function can then use this unified output to implement `download-ci-X` and other functionality.

I also added a bunch of integration tests that literally spawn a git repository on disk and then check that the function can deal with various situations (PR CI, auto/try CI, local builds). The tests are super fast and run in parallel, as they are currently in `build_helper` and not in `bootstrap`.

After I built this inner layer, I used it for downloading GCC, LLVM and rustc. The latter two (and especially rustc) were using the `last_modified_commit` function before, but in all cases but one this function was actually only used to check if there are any local changes, which was IMO confusing. The LLVM handling would deserve a bit of refactoring, but that's a larger change that can be done as a follow-up.

In the future we could cache the results of `check_path_modifications` to reduce the number of git invocations, but I don't think that it should be excessive even now.

I hope that the implementation is now clear and easy to understand, so that in combination with the tests we can have more confidence that it does what we want. I tried to include a lot of documentation in the code, so I won't be repeating the actual implementation details here. If there are any questions, I'll add the answers to the documentation too :)

The new approach explicitly supports three scenarios:
- Running on PR CI, where we have one upstream bors parent commit and one PR merge commit made by GitHub.
- Running on try/auto CI, where we have one upstream bors parent commit and one PR merge commit made by bors.
- Running locally, where we assume that we have at least one upstream bors parent commit in our git history.

I removed the handling of upstreams on CI, as I think that it shouldn't be needed and I considered it to be a hack. However, it's possible that there are other use-cases that I haven't considered, so I want to ask around if people have other situations than the three use-cases described above. If there are other such use-cases, I would like to include them in the new centralized implementation and add them to the git test suite, rather than going back to the old ways :)

In particular, the previous code relied on `git merge-base`, but I don't see why we can't just look up the most recent bors commit and assume that it is a merge commit that is also upstream? I might be running into Chesterton's Fence here :)

CC `@pietroalbini` To make sure that this won't break downstream users of Rust's CI.

Best reviewed commit by commit.

Companion PRs:
- For testing beta: rust-lang#138597

r? `@onur-ozkan`

try-job: x86_64-gnu-aux
bors (Collaborator) commented Mar 18, 2025

⌛ Trying commit afe1f99 with merge 27ee8fc...

Kobzol (Contributor, Author) commented Mar 18, 2025

Did a bunch of follow-up clean-ups. Let me know if you want me to split this into multiple PRs! :)

Kobzol force-pushed the git-ci branch 2 times, most recently from 63be8ba to e1fe7f2, on March 19, 2025 10:07

Kobzol (Contributor, Author) commented Apr 20, 2025

@bors r=Mark-Simulacrum

bors (Collaborator) commented Apr 20, 2025

📌 Commit fbca453 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Apr 20, 2025

bors (Collaborator) commented Apr 21, 2025

⌛ Testing commit fbca453 with merge 6134588...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2025
Refactor git change detection in bootstrap

rust-log-analyzer (Collaborator) commented

The job x86_64-msvc-ext1 failed!

Possible cause of the failure (guessed by this bot):
 Downloading crates ...
  Downloaded bar v0.2.0 (registry `dummy-registry`)
  Downloaded bar v0.1.0 (registry `dummy-registry`)
   Compiling bar v0.1.0
   Compiling bar v0.2.0
   Compiling foo v0.1.0 (D:\a\rust\rust\build\x86_64-pc-windows-msvc\stage1-tools\x86_64-pc-windows-msvc\tmp\cit\t1155\foo)
error: couldn't create a temp dir: Access is denied. (os error 5) at path "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\rustcPXPnzf"

error: could not compile `bar` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...


bors (Collaborator) commented Apr 21, 2025

💔 Test failed - checks-actions

bors added the S-waiting-on-review label and removed the S-waiting-on-bors label Apr 21, 2025

Kobzol (Contributor, Author) commented Apr 21, 2025

@bors retry spurious Windows error

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Apr 21, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2025
Refactor git change detection in bootstrap

bors (Collaborator) commented Apr 21, 2025

⌛ Testing commit fbca453 with merge 92ff1f9...

rust-log-analyzer (Collaborator) commented

The job dist-x86_64-apple failed!

Possible cause of the failure (guessed by this bot):
[TIMING] core::build_steps::doc::TheBook { compiler: Compiler { stage: 2, host: x86_64-apple-darwin, forced_compiler: false }, target: x86_64-apple-darwin } -- 2.904
##[group]Documenting stage2 standalone (x86_64-apple-darwin)
##[endgroup]
[TIMING] core::build_steps::doc::Standalone { compiler: Compiler { stage: 2, host: x86_64-apple-darwin, forced_compiler: false }, target: x86_64-apple-darwin } -- 1.091
dyld[45212]: Library not loaded: @rpath/librustc_driver-c5e90e1f1fb2a596.dylib
  Referenced from: <A607F385-88DE-30FB-ADD4-6116F18D7913> /Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage1-rustc/x86_64-apple-darwin/release/rustc-main
  Reason: tried: '/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage1-rustc/x86_64-apple-darwin/release/../lib/librustc_driver-c5e90e1f1fb2a596.dylib' (no such file), '/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage1-rustc/x86_64-apple-darwin/release/../lib/librustc_driver-c5e90e1f1fb2a596.dylib' (no such file)
Command "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage2/bin/rustc" "--target" "x86_64-apple-darwin" "--print=deployment-target" (failure_mode=Exit) has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 1:44:32
  local time: Mon Apr 21 09:52:57 UTC 2025
  network time: Mon, 21 Apr 2025 09:52:57 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

bors (Collaborator) commented Apr 21, 2025

💔 Test failed - checks-actions

bors added the S-waiting-on-review label and removed the S-waiting-on-bors label Apr 21, 2025

Kobzol (Contributor, Author) commented Apr 22, 2025

Hmm, weird, dist jobs definitely shouldn't try to use this logic to download anything...

@bors try

bors (Collaborator) commented Apr 22, 2025

⌛ Trying commit fbca453 with merge 45ecdb6...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 22, 2025
Refactor git change detection in bootstrap

bors (Collaborator) commented Apr 22, 2025

☀️ Try build successful - checks-actions
Build commit: 45ecdb6 (45ecdb6e0c486363390bed44b1f43fb269eab8e7)

Kobzol (Contributor, Author) commented Apr 22, 2025

Hmm, looks like it might have been spurious.

@bors r=Mark-Simulacrum

bors (Collaborator) commented Apr 22, 2025

💡 This pull request was already approved, no need to approve it again.

bors (Collaborator) commented Apr 22, 2025

📌 Commit fbca453 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Apr 22, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 23, 2025
Refactor git change detection in bootstrap

bors (Collaborator) commented Apr 23, 2025

⌛ Testing commit fbca453 with merge 645d0ad...

Labels: A-compiletest, A-testsuite, S-waiting-on-bors, T-bootstrap, T-infra, T-release

Successfully merging this pull request may close these issues: Merge commits break LLVM CI download

9 participants