From ee5f40e08fd53bfcb37f3a670b55855c6fb36167 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jakub=20Ber=C3=A1nek?=
Date: Mon, 4 Sep 2023 21:31:00 +0200
Subject: [PATCH 1/2] Add section about building an optimized version of `rustc`

---
 src/SUMMARY.md                  |   1 +
 src/building/optimized-build.md | 131 ++++++++++++++++++++++++++++++++
 2 files changed, 132 insertions(+)
 create mode 100644 src/building/optimized-build.md

diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index c8481567f..101fcf880 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -14,6 +14,7 @@
 - [Building Documentation](./building/compiler-documenting.md)
 - [Rustdoc overview](./rustdoc.md)
 - [Adding a new target](./building/new-target.md)
+ - [Optimized build](./building/optimized-build.md)
 - [Testing the compiler](./tests/intro.md)
 - [Running tests](./tests/running.md)
 - [Testing with Docker](./tests/docker.md)
diff --git a/src/building/optimized-build.md b/src/building/optimized-build.md
new file mode 100644
index 000000000..9731e60c7
--- /dev/null
+++ b/src/building/optimized-build.md
@@ -0,0 +1,131 @@
+# Optimized build of the compiler
+
+
+
+There are multiple additional build configuration options and techniques that can be used to compile a
+build of `rustc` that is as optimized as possible (for example when building `rustc` for a Linux
+distribution). The status of these configuration options for various Rust targets is tracked [here].
+This page describes how you can use these approaches when building `rustc` yourself.
+
+[here]: https://github.com/rust-lang/rust/issues/103595
+
+## Link-time optimization
+
+Link-time optimization is a powerful compiler technique that can increase program performance. To
+enable (Thin-)LTO when building `rustc`, set the `rust.lto` config option to `"thin"`
+in `config.toml`:
+
+```toml
+[rust]
+lto = "thin"
+```
+
+> Note that LTO for `rustc` is currently supported and tested only for
+> the `x86_64-unknown-linux-gnu` target. Other targets *may* work, but no guarantees are provided.
+> Notably, LTO optimized `rustc` currently produces [miscompilations] on Windows.
+
+[miscompilations]: https://github.com/rust-lang/rust/issues/109114
+
+Enabling LTO on Linux has [produced] speed-ups by up to 10%.
+
+[produced]: https://github.com/rust-lang/rust/pull/101403#issuecomment-1288190019
+
+## Memory allocator
+
+Using a different memory allocator for `rustc` can provide significant performance benefits. If you
+want to enable the `jemalloc` allocator, you can set the `rust.jemalloc` option to `true`
+in `config.toml`:
+
+```toml
+[rust]
+jemalloc = true
+```
+
+> Note that this option is currently only supported for Linux and macOS targets.
+
+## Codegen units
+
+Reducing the amount of codegen units per `rustc` crate can produce a faster build of the compiler.
+You can modify the number of codegen units for `rustc` and `libstd` in `config.toml` with the
+following options:
+
+```toml
+[rust]
+codegen-units = 1
+codegen-units-std = 1
+```
+
+## Instruction set
+
+By default, `rustc` is compiled for a generic (and conservative) instruction set architecture
+(depending on the selected target), to make it support as many CPUs as possible. If you want to
+compile `rustc` for a specific instruction set architecture, you can set the `target_cpu` compiler
+option in `RUSTFLAGS`:
+
+```bash
+$ RUSTFLAGS="-C target_cpu=x86-64-v3" x.py build ...
+```
+
+If you also want to compile LLVM for a specific instruction set, you can set `llvm` flags
+in `config.toml`:
+
+```toml
+[llvm]
+cxxflags = "-march=x86-64-v3"
+cflags = "-march=x86-64-v3"
+```
+
+## Profile-guided optimization
+
+Applying profile-guided optimizations (or more generally, feedback-directed optimizations) can
+produce a large increase to `rustc` performance, by up to 25%. However, these techniques are not
+simply enabled by a configuration option, but rather they require a complex build workflow that
+compiles `rustc` multiple times and profiles it on selected benchmarks.
+
+There is a tool called `opt-dist` that is used to optimize `rustc` with [PGO] (profile-guided
+optimizations) and [BOLT] (a post-link binary optimizer) for builds distributed to end users. You
+can examine the tool, which is located in `src/tools/opt-dist`, and build a custom PGO build
+workflow based on it, or try to use it directly. Note that the tool is currently quite hardcoded to
+the way we use it in Rust's continuous integration workflows, and it might require some custom
+changes to make it work in a different environment.
+
+[PGO]: https://doc.rust-lang.org/rustc/profile-guided-optimization.html
+
+[BOLT]: https://github.com/llvm/llvm-project/blob/main/bolt/README.md
+
+To use the tool, you will need to provide some external dependencies:
+
+- A Python3 interpreter (for executing `x.py`).
+- Compiled LLVM toolchain, with the `llvm-profdata` binary. Optionally, if you want to use BOLT,
+ the `llvm-bolt` and
+ `merge-fdata` binaries have to be available in the toolchain.
+- Downloaded [Rust benchmark suite].
+
+These dependencies are provided to `opt-dist` by an implementation of the [`Environment`] trait. You
+can either implement the trait for your custom environment, by providing paths to these dependencies
+in its methods, or reuse one of the existing implementations (currently, there is an implementation
+for Linux and Windows). If you want your environment to support BOLT, return `true` from
+the `supports_bolt` method.
+
+Here is an example of how `opt-dist` can be used with the default Linux environment (it assumes that
+you execute the following commands on a Linux system):
+
+1. Build the tool with the following command:
+ ```bash
+ $ python3 x.py build tools/opt-dist
+ ```
+2. Run the tool with the `PGO_HOST` environment variable set to the 64-bit Linux target:
+ ```bash
+ $ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
+ ```
+ Note that the default Linux environment expects several hardcoded paths to exist:
+ - `/checkout` should contain a checkout of the Rust compiler repository that will be compiled.
+ - `/rustroot` should contain the compiled LLVM toolchain (containing BOLT).
+ - A Python 3 interpreter should be available under the `python3` binary.
+ - `/tmp/rustc-perf` should contain a downloaded checkout of the Rust benchmark suite.
+
+You can modify `LinuxEnvironment` (or implement your own) to override these paths.
+
+[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L7
+
+[Rust benchmark suite]: https://github.com/rust-lang/rustc-perf

From 31006202cedcee69bc7186393576efc61c68ad3d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jakub=20Ber=C3=A1nek?=
Date: Tue, 5 Sep 2023 16:34:37 +0200
Subject: [PATCH 2/2] Review changes

---
 src/building/optimized-build.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/building/optimized-build.md b/src/building/optimized-build.md
index 9731e60c7..ee33b0592 100644
--- a/src/building/optimized-build.md
+++ b/src/building/optimized-build.md
@@ -22,7 +22,7 @@ lto = "thin"

 > Note that LTO for `rustc` is currently supported and tested only for
 > the `x86_64-unknown-linux-gnu` target. Other targets *may* work, but no guarantees are provided.
-> Notably, LTO optimized `rustc` currently produces [miscompilations] on Windows.
+> Notably, LTO-optimized `rustc` currently produces [miscompilations] on Windows.

 [miscompilations]: https://github.com/rust-lang/rust/issues/109114

@@ -63,7 +63,7 @@ compile `rustc` for a specific instruction set architecture, you can set the `ta
 option in `RUSTFLAGS`:

 ```bash
-$ RUSTFLAGS="-C target_cpu=x86-64-v3" x.py build ...
+RUSTFLAGS="-C target_cpu=x86-64-v3" ./x build ...
 ```

 If you also want to compile LLVM for a specific instruction set, you can set `llvm` flags
@@ -78,9 +78,9 @@ cflags = "-march=x86-64-v3"
 ## Profile-guided optimization

 Applying profile-guided optimizations (or more generally, feedback-directed optimizations) can
-produce a large increase to `rustc` performance, by up to 25%. However, these techniques are not
-simply enabled by a configuration option, but rather they require a complex build workflow that
-compiles `rustc` multiple times and profiles it on selected benchmarks.
+produce a large increase to `rustc` performance, by up to 15% ([1], [2]). However, these techniques
+are not simply enabled by a configuration option, but rather they require a complex build workflow
+that compiles `rustc` multiple times and profiles it on selected benchmarks.

 There is a tool called `opt-dist` that is used to optimize `rustc` with [PGO] (profile-guided
 optimizations) and [BOLT] (a post-link binary optimizer) for builds distributed to end users. You
@@ -89,6 +89,9 @@ workflow based on it, or try to use it directly. Note that the tool is currently
 the way we use it in Rust's continuous integration workflows, and it might require some custom
 changes to make it work in a different environment.

+[1]: https://blog.rust-lang.org/inside-rust/2020/11/11/exploring-pgo-for-the-rust-compiler.html#final-numbers-and-a-benchmarking-plot-twist
+[2]: https://github.com/rust-lang/rust/pull/96978
+
 [PGO]: https://doc.rust-lang.org/rustc/profile-guided-optimization.html

 [BOLT]: https://github.com/llvm/llvm-project/blob/main/bolt/README.md
@@ -112,11 +115,11 @@ you execute the following commands on a Linux system):

 1. Build the tool with the following command:
  ```bash
- $ python3 x.py build tools/opt-dist
+ ./x build tools/opt-dist
  ```
 2. Run the tool with the `PGO_HOST` environment variable set to the 64-bit Linux target:
  ```bash
- $ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
+ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
  ```
  Note that the default Linux environment expects several hardcoded paths to exist:
  - `/checkout` should contain a checkout of the Rust compiler repository that will be compiled.
@@ -126,6 +129,6 @@ you execute the following commands on a Linux system):

 You can modify `LinuxEnvironment` (or implement your own) to override these paths.

-[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L7
+[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L70

 [Rust benchmark suite]: https://github.com/rust-lang/rustc-perf
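
Note on step 2 of the `opt-dist` example added by this series: the following is a minimal sketch, assuming a Bash shell, of a preflight check for the paths that the default Linux environment is described as expecting (`/checkout`, `/rustroot`, `/tmp/rustc-perf`, and a `python3` binary). The function name `check_opt_dist_paths` and its optional prefix argument are hypothetical helpers introduced for illustration; they are not part of `opt-dist`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of opt-dist): report which of the paths
# expected by the default Linux environment are missing.
# An optional prefix argument redirects the directory checks, e.g. for a dry run.
check_opt_dist_paths() {
    local prefix="${1:-}"   # empty prefix checks the real absolute paths
    local missing=0
    local dir
    for dir in /checkout /rustroot /tmp/rustc-perf; do
        if [ ! -d "${prefix}${dir}" ]; then
            echo "missing directory: ${prefix}${dir}" >&2
            missing=1
        fi
    done
    # The interpreter is looked up on PATH under the name `python3`.
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found in PATH" >&2
        missing=1
    fi
    return "$missing"
}
```

Calling `check_opt_dist_paths` with no argument checks the real absolute paths; a non-zero exit status means at least one dependency was not found, so it may be worth fixing that before starting the `opt-dist` run.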