Reproducible Package Builds
Summary
A post-build cleanup is integrated into the RPM build process so that common causes of build irreproducibility in packages are removed, making most Fedora packages reproducible.
Owner
- Name: Davide Cavalca
- Email: dcavalca@fedoraproject.org
- Name: Neil Hanlon
- Email: neil at shrug.pw
- Name: Miro Hrončok
- Email: mhroncok at redhat.com
- Name: Zbigniew Jędrzejewski-Szmek
- Email: zbyszek at in.waw.pl
Current status
- Targeted release: Fedora Linux 41
- Last updated: 2024-05-08
- Announced
- Discussion Thread
- FESCo issue: #3201
- Tracker bug: #2279765
- Release notes tracker: <will be assigned by the Wrangler>
Detailed Description
As of 2023 there is an active effort to implement Reproducible builds in Fedora. Reproducible builds will allow our users to independently verify that the RPMs have not been tampered with (either maliciously or via a hardware/software fault): someone can do an independent rebuild of a package and confirm that they get identical binaries when building with the same versions of the compiler and other tools. This Change allows us to move forward in this direction by removing the common sources of irreproducibility.
add-determinism is a Rust program which, as its name suggests, adds determinism to the files it is given as input: it attempts to standardize metadata contained in binary or source files to ensure consistency, clamping timestamps to $SOURCE_DATE_EPOCH in all instances. add-determinism is the "Fedora version" of strip-nondeterminism from the Debian project. Since strip-nondeterminism is written in Perl, it is undesirable for use in Fedora, as we don't want to pull Perl into the buildroot for every package.
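As an illustration of what "clamping to $SOURCE_DATE_EPOCH" means, here is a minimal Python sketch applied to a file's modification time. It is not taken from add-determinism (which is written in Rust and works on metadata embedded in files); the function name `clamp_mtime` is hypothetical. The idea is that a timestamp newer than the reference epoch is lowered to it, while an older timestamp is left alone.

```python
import os


def clamp_mtime(path, source_date_epoch):
    """Lower the mtime of `path` to `source_date_epoch` if it is newer.

    Returns True if the timestamp was changed, False otherwise.
    Hypothetical helper for illustration only.
    """
    st = os.stat(path)
    if st.st_mtime <= source_date_epoch:
        # Already at or before the reference time: clamping never
        # moves a timestamp forward.
        return False
    # Keep the access time, lower only the modification time.
    os.utime(path, (st.st_atime, source_date_epoch))
    return True
```

Because the result depends only on the reference epoch and not on when the build ran, two builds with the same $SOURCE_DATE_EPOCH end up with the same timestamp.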
It's worth noting that this Change does not intend to impose any specific reproducibility requirements on Fedora packages. Once this Change is implemented and we have been through a mass rebuild and can verify that the common causes of irreproducibility have indeed been removed, we can consider further steps. But that will be at least one release later.
This change does add a small amount of time to the processing of RPMs at the end of a build. Accordingly, packages containing many or very large files will take longer to build, but this effect is not expected to be noticeable. add-determinism takes steps to ensure it does not interfere with other buildroot post-processors like mangle-shebangs, python-hardlink, python-bytecompile. It defaults to not making any modifications if it doesn't understand the input file or encounters any other problem.
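The "leave the file alone when in doubt" default can be sketched as follows. This is a Python illustration, not the actual Rust implementation: the `TOY1` file format, the `toy` handler, the `HANDLERS` table and the `process` function are all hypothetical names made up for this example. A file is rewritten only when a handler recognizes it and succeeds; in every other case it is left untouched.

```python
import os


def toy_handler(data):
    """Zero the 4-byte timestamp of a made-up 'TOY1' file format."""
    if len(data) < 8 or not data.startswith(b"TOY1"):
        raise ValueError("not a TOY1 file")
    return data[:4] + b"\x00\x00\x00\x00" + data[8:]


# Hypothetical table: file extension -> (handler name, handler function).
HANDLERS = {".toy": ("toy", toy_handler)}


def process(path, disabled=frozenset()):
    """Return True if `path` was rewritten, False if it was left alone."""
    entry = HANDLERS.get(os.path.splitext(path)[1])
    if entry is None:
        return False  # unknown file type: do nothing
    name, handler = entry
    if name in disabled:
        return False  # handler switched off by the caller
    with open(path, "rb") as f:
        original = f.read()
    try:
        cleaned = handler(original)
    except Exception:
        return False  # handler did not understand the input: do nothing
    if cleaned == original:
        return False  # already deterministic, avoid a needless write
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

The `disabled` parameter stands in for the kind of per-handler opt-out described in this Change; how the real tool receives that setting is not shown here.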
add-determinism uses the Python marshalparser module for pyc files and links to libpython3.xx.so. This functionality will be made optional, so that the dependencies are only pulled in when python3 is already installed in the buildroot.
A mechanism to opt-out will be provided: to either completely disable the postprocessing step or to disable specific "handlers" (i.e. implementations of cleanup for specific file types, for example static archives). See macros.build-reproducibility.
Related Changes
- Clamp build mtimes to SOURCE_DATE_EPOCH
- RPM 4.20: this pulls in changes to %autosetup -S git which removed a source of irreproducibility.
Feedback
Benefit to Fedora
Adding determinism (i.e., removing non-determinism) enables the Fedora community to have confidence that, if given the same source code, build environment, build instructions, and metadata from the build artifacts, any party can recreate copies of the artifacts that are identical except for the signatures and some parts of metadata.
Reproducibility of builds leads to packages of higher quality. It turns out that quite often those irreproducible bits are caused by an error or sloppiness in the code. In particular, any dependence on architecture in noarch packages is almost always unwanted and/or a bug. Test builds that check reproducibility will expose such instances.
Reproducibility of builds makes it easier to develop packages: when a small change is made and a package is rebuilt (in the same environment), then with a reproducible package, the only difference is directly caused by the change. If the package is different every time it is rebuilt, making a comparison is much harder.
Build reproducibility for noarch subpackages solves the problem where package builds on different architectures are different, causing Koji to reject the whole build. In particular, this issue occurs for pyc files. This will now be solved without requiring opt-in from individual packages.
Scope
- Proposal Owners:
  - Integrate add-determinism as a BuildRoot Policy script
  - Add a dependency on marshalparser to python3 (probably conditionalized on rpm-build)
- Other Developers:
  - Test their packages with the additional phase, report problems
  - Potentially integrate changes to packages to enable reproducibility
- Release Engineering: Ideally we want this to happen before the mass rebuild, but that is not strictly required.
- Policies and Guidelines: Fedora Packaging Guidelines should be updated to include information on the add-determinism BuildRoot Policy. User documentation should be amended to include instructions on how to verify reproducibility for a given package, which packages are known to be non-reproducible, and how to opt out.
- Trademark approval: N/A (not needed for this Change)
- Alignment with Community Initiatives: All software and requests are consistent with the decision process and similar across other groups in Fedora. The Fedora Reproducibility Working Group began at Flock 2023 in Cork.
Upgrade/compatibility impact
No impact is expected.
How To Test
To test on the level of individual files:
- install add-determinism
- call SOURCE_DATE_EPOCH=… add-determinism -v ./path/to/file
To test package builds:
- build a local copy of redhat-rpm-config with https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/293
- install add-determinism
- build packages ;)
(This can be done on a normal system or in a mock chroot.)
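Once a package has been built twice, one way to check the result is to compare the file contents of the two builds. The following Python sketch assumes the two sets of build outputs have already been unpacked into two directory trees (how that is done is outside this sketch), and the function names are hypothetical. It compares contents only, not ownership, modes or timestamps, and it does not look at RPM headers or signatures.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file path relative to `root` to a digest of its content."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if os.path.islink(full):
                # Compare symlinks by target rather than following them.
                digests[rel] = "link:" + os.readlink(full)
                continue
            h = hashlib.sha256()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def differing_paths(root_a, root_b):
    """Return the sorted paths that are missing or differ between the trees."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty list from `differing_paths` means the two trees contain the same files with the same contents; any path it returns is a starting point for finding the remaining source of irreproducibility.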
User Experience
No impact is expected.
Dependencies
Contingency Plan
- Contingency mechanism:
  - In case of major problems, disable the change in redhat-rpm-config.
  - In case of problems with specific packages, opt out by setting a macro.
- Contingency deadline: No limit really.
- Blocks release? No.
Documentation
Release Notes
Fedora package builds are now more deterministic, bringing the distribution closer to the goal of achieving fully reproducible builds for all of its packages.