Skip to content

scripts: gen_handles: Sort the device handles #48195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 5, 2022

Conversation

keith-zephyr
Copy link
Collaborator

Sort the device handles stored in the __device_handles section. This
ensures Zephyr builds are reproducible.

Signed-off-by: Keith Short [email protected]

@keith-zephyr keith-zephyr force-pushed the gen_handles-sort branch 2 times, most recently from fee903f to b79cf07 Compare July 25, 2022 20:01
@keith-zephyr
Copy link
Collaborator Author

Note that the twister failure is unrelated to this change and is fixed by PR48280

@JordanYates
Copy link
Collaborator

Just bringing your attention to #47582, which completely reworks this script.
I suspect it will be easier to make this change on top of that (and make my life easier as well)

coreboot-bot pushed a commit to coreboot/chrome-ec that referenced this pull request Jul 27, 2022
Add a the option "--static" to zmake configure/build. This is passed
through to the generate_ec_version.py script to generate static version
information for reproducible builds.

Note that Zephyr builds require
zephyrproject-rtos/zephyr#48195 to compare
correctly.

BUG=none
BRANCH=none
TEST=zmake testall
TEST=zmake build herobrine -B /tmp/zephyr-rev1 --static
TEST=zmake build herobrine -B /tmp/zephyr-rev2 --static
TEST=cmp /tmp/zephyr-rev1/herobrine/output/zephyr.bin \
  /tmp/zephyr-rev2/herobrine/output/zephyr.bin

Signed-off-by: Keith Short <[email protected]>
Change-Id: I07361f21c865e2b233ebdb2dc730d25afe13a843
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/platform/ec/+/3786857
Reviewed-by: Abe Levkoy <[email protected]>
@keith-zephyr keith-zephyr force-pushed the gen_handles-sort branch 2 times, most recently from e0425ac to ae29f5f Compare August 1, 2022 22:03
@keith-zephyr keith-zephyr marked this pull request as ready for review August 1, 2022 22:08
@keith-zephyr
Copy link
Collaborator Author

Rebased on top of @JordanYates changes.

@keith-zephyr keith-zephyr dismissed aaronemassey’s stale review August 2, 2022 16:40

Comments no longer apply after rebase on latest gen_handles script.

yperess
yperess previously approved these changes Aug 2, 2022
Copy link
Collaborator

@JordanYates JordanYates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better overall, some minor comments

Replicate the devicetree dependencies into a sorted list. This ensures
that the structures added to the .__device_handles_pass2 section are
reproducible from build to build.

Tested with: west build -b native_posix tests/drivers/build_all/sensor

Without this change, two consecutive builds do not compare.

Signed-off-by: Keith Short <[email protected]>
@carlescufi carlescufi merged commit f896fc2 into zephyrproject-rtos:main Aug 5, 2022
@keith-zephyr keith-zephyr deleted the gen_handles-sort branch August 5, 2022 15:44
@marc-hb
Copy link
Collaborator

marc-hb commented Sep 14, 2022

@keith-zephyr how did you find this? While debugging something else, private CI, other?

@keith-zephyr
Copy link
Collaborator Author

@keith-zephyr how did you find this? While debugging something else, private CI, other?

I was manually comparing Chromebook EC builds in our downstream repo and found that just running 2 consecutive builds yielded binaries that didn't compare. I isolated the problem to the generated dev_handles.c file.

We don't have a CI test in our downstream repo, but rather developers can manually run a script to compare builds.

@marc-hb
Copy link
Collaborator

marc-hb commented Oct 27, 2022

marc-hb added a commit to marc-hb/zephyr that referenced this pull request Nov 4, 2022
Temporary bugs, corner cases and obsolete toolchains aside, the Zephyr
build is most of the time reproducible: zephyrproject-rtos#50205 and zephyrproject-rtos#14593.

This means two different build machines using the same toolchain will
always produce the same binary output. The one-line addition in this
commit makes it trivial to verify that binary outputs are indeed the
same by adding a single checksum line in the build logs:

```
[16/16] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
             RAM:       53280 B         3 MB      1.69%
        IDT_LIST:          0 GB         2 KB      0.00%

fdd2ddf2ad7d5da5bbd79b41cef...7b16ef549a8281111d8e205  zephyr.strip
```

This commit makes a non-measurable build time difference.

Build reproducibility matters for (at least) two important reasons:

- Security / supply chain attacks, see https://www.cisa.gov/sbom, zephyrproject-rtos#50205,
  https://reproducible-builds.org/ and many others.
- Making sure build configurations are strictly identical when trying to
  reproduce elusive issues or when issuing releases.

Displaying a reproducible checksum accelerates the investigation of
temporary reproducibility issues like zephyrproject-rtos#48195.

Signed-off-by: Marc Herbert <[email protected]>
marc-hb added a commit to marc-hb/zephyr that referenced this pull request Nov 4, 2022
Temporary bugs, corner cases and obsolete toolchains aside, the Zephyr
build is reproducible most of the time: zephyrproject-rtos#50205 and zephyrproject-rtos#14593

This means two different build machines using the same toolchain will
always produce the same binary output. The previous, one-line commit made
it trivial to verify that binary outputs are indeed the same by adding this
single line in the buid logs:

```
[16/16] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
             RAM:       53280 B         3 MB      1.69%
        IDT_LIST:          0 GB         2 KB      0.00%
fdd2ddf2ad7d5da5bbd79b41cef8d7...1a896b989a8281111d8e205  zephyr.strip
```

This commit enables that feature by default because build
reproducibility matters for (at least) two important reasons:

- Security / supply chain attacks, see https://www.cisa.gov/sbom, zephyrproject-rtos#50205,
  https://reproducible-builds.org/ and many others.
- Making sure build configurations are strictly identical when trying to
  reproduce elusive issues or when issuing releases.

It was of course already possible to _manually_ make this Kconfig change
and manually compute this checksum. However this can be impossible when
dealing with an automated build system that does not archive all
_intermediate_ (zephyrproject-rtos#5009) files like `zephyr.elf`. Tweaking the build
configuration can also be difficult and error-prone for people who are
not Zephyr developers.

Most automated CI systems preserve build logs by default.

Displaying the reproducible checksum by default accelerates the
discovery of reproducibility bugs like zephyrproject-rtos#48195.

When measured with `west build -p -b qemu_x86 samples/hello_world/`, the
additional `build/zephyr/zephyr.strip` disk space required is 43
kilobytes compared to a total of 11 Megabytes. Measuring a more
realistic SOF example, `zephyr.strip` weighed 690 kb which was about
0.1% of a total `build/` directory weighing 65M.

To measure the build time cost I ran `west build -p -b qemu_x86
samples/hello_world/` many times in a loop with and without this PR on
my Linux workstation. Stripping and checksumming made literally no time
difference compared to the "noise" observed when building the same
configuration. This is not surprising considering how small
`zephyr.strip`: so the extra cost is most likely dominated by process
creation and the total number of processes created during a Zephyr build
dwarfs the few extra processes required by this feature.

More surprisingly, I measured incremental builds by running `touch
kernel/timer.c; west build ...` in a loop and I could not observe any
visible time difference either.

Signed-off-by: Marc Herbert <[email protected]>
@marc-hb
Copy link
Collaborator

marc-hb commented Nov 4, 2022

@keith-zephyr would these two additional lines and checksum have sped things up?

@marc-hb
Copy link
Collaborator

marc-hb commented Dec 1, 2023

Just bringing your attention to #47582, which completely reworks this script.

Rebased on top of @JordanYates changes.

I find this timing ironic: while this PR was making the build reproducible, 47582 was adding a non-deterministic graph. Fix just submitted one year later:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants