Skip to content

Fix rust nondeterminism#170981

Merged
happysalada merged 3 commits intoNixOS:masterfrom
davidscherer:fix-rust-nondeterminism
May 1, 2022
Merged

Fix rust nondeterminism#170981
happysalada merged 3 commits intoNixOS:masterfrom
davidscherer:fix-rust-nondeterminism

Conversation

@davidscherer
Copy link
Copy Markdown
Contributor

Description of changes

Fix the nondeterminism of buildRustCrate reported in #130309

Essentially reverts #57936 which introduced the nondeterminism while also making codegen-units configurable as in #132721

The codegen-units option of rustc affects the output of the build (as is documented), so it's not safe to default it to NIX_BUILD_CORES which may differ between builds of the same derivation (especially on different machines). Default it to 1 since that supposedly produces the most efficient binaries but also make it configurable. As far as I know, any fixed value should be deterministic.

Things done
  • Built on platform(s)

    • x86_64-linux
  • Tested, as applicable:

Before (d731df5):

$ nix-build -A cargo-download --no-out-link --cores 1 --no-substitute
/nix/store/cknil3jb6h81f52rklmxsnpcyll786a2-rust_cargo-download-0.1.2
$ nix-build -A cargo-download --no-out-link --cores 1 --no-substitute --check
/nix/store/cknil3jb6h81f52rklmxsnpcyll786a2-rust_cargo-download-0.1.2
$ nix-build -A cargo-download --no-out-link --cores 3 --no-substitute --check
error: derivation '/nix/store/yb0gfvzip97d6h5cdva2bjjixnxpgs12-rust_cargo-download-0.1.2.drv' may not be deterministic: output '/nix/store/cknil3jb6h81f52rklmxsnpcyll786a2-rust_cargo-download-0.1.2' differs

After:

$ nix-build -A cargo-download --no-out-link --cores 1
/nix/store/ny86k3pp79y4biwzlaybh63rp207si20-rust_cargo-download-0.1.2
$ nix-build -A cargo-download --no-out-link --cores 1 --check
/nix/store/ny86k3pp79y4biwzlaybh63rp207si20-rust_cargo-download-0.1.2
$ nix-build -A cargo-download --no-out-link --cores 3 --check
/nix/store/ny86k3pp79y4biwzlaybh63rp207si20-rust_cargo-download-0.1.2
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.05 Release Notes (or backporting 21.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

Fuuzetsu and others added 3 commits October 12, 2021 09:25
This parameter is being set to `$NIX_BUILD_CORES` by default. This is a
standard practice but there's a suspicion that this can produce broken
builds. For some details see
cargo2nix/cargo2nix#184 . As a
work-around/test, it'd be good if codegen-units can be set to something
constant, such as `1`. This PR allows it.

Note that the default of `$NIX_BUILD_CORES` is preserved so this MR
causes no change in default behaviour and no rebuilds.
@davidscherer davidscherer requested a review from zowoq as a code owner April 30, 2022 02:41
@github-actions github-actions bot added the 6.topic: rust General-purpose programming language emphasizing performance, type safety, and concurrency. label Apr 30, 2022
@ofborg ofborg bot requested a review from adisbladis April 30, 2022 02:55
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Apr 30, 2022
@Fuuzetsu
Copy link
Copy Markdown
Member

I don't think rust guys are promising deterministic builds at any codegen level which is usually fine as long as the produced ABI is the same. AFAIU GHC has the same approach: you can get different binary by compiling 10 times with parallelism enabled but as long as it's producing same ABI, it's considered OK.

For example, I think for a while (maybe still) unstripped binaries contained local build-time paths.

So I'm actually unsure about this change: on one hand it does tend to make binary output more stable and in theory produces best runtime performance but on another hand, it makes compilations slow by default and goes against the precedent of using NIX_BUILD_CORES as a value to things like ghc -j or make -j, I think.

At the very least, I think we should at least not say it fixes a bug: there's no bug in principle though of course it's best if we can have compiler produce same output every time.

@davidscherer
Copy link
Copy Markdown
Contributor Author

"Reproducible builds" is literally the banner claim on nixos.org. This is well understood in the industry to mean bit-for-bit determinism.

rustc does attempt to provide reproducible builds and even has a label in their issue tracker for bugs that break them. Of course upstream may have bugs from time to time, and Rust's macro system may be powerful enough that you could make a nondeterministic crate even in the Nix sandbox.

codegen-units is not really very similar to make -j. Cargo, Rust's own build system, does not change codegen-units based on its -j parameter (and, accordingly, buildRustPackage does, probably safely, set the -j parameter to cargo to $NIX_BUILD_CORES). buildRustCrate-based builds normally get the type of parallelism that cargo gets by building each crate in a separate Nix build.

I think what you want to argue for, if you want faster build time at the expense of run-time performance, is defaulting codegen-units to 16 as rustc and cargo do (or equivalently omitting it from the command line when not specified). I don't know how LLVM actually decides how many threads to use for code generation, but it clearly is not using this parameter - it defaults to 256 in incremental builds! As far as I know this would be just as deterministic as setting it to 1 - the only problem is with setting it to a nondeterministic value - and it seems reasonable enough to me to use the default from upstream (who presumably have a pretty good idea of the tradeoffs).

For what it's worth, in the single case I benchmarked, the difference between codegen-units=1 and codegen-units=16 was a bit less than a factor of 2 of build time on my 24-core machine. I don't have a great test handy for executable performance.

@davidscherer
Copy link
Copy Markdown
Contributor Author

I don't personally have a very strong opinion on 1 vs 16; I went with 1 because that seemed to be a weak consensus in the discussion in #130309. @kevincox do you have an opinion?

@kevincox
Copy link
Copy Markdown
Contributor

1 has slightly better performance at what is usually a tiny build speed cost. I think 1 is fine for now as it seemed the most popular. We can always hear arguments to change the value later.

@happysalada happysalada merged commit 13a9006 into NixOS:master May 1, 2022
@davidscherer davidscherer deleted the fix-rust-nondeterminism branch May 1, 2022 16:24
@Fuuzetsu
Copy link
Copy Markdown
Member

Fuuzetsu commented May 2, 2022

rustc does attempt to provide reproducible builds

wasn't aware this was a bit-for-bit goal; then fair enough

Rust's macro system may be powerful enough that you could make a nondeterministic crate even in the Nix sandbox

Just as a bit of trivia, this is actually very simple to do: so simple some crates do it on accident which ends up being pretty unfun experience. For example JelteF/derive_more#177 but there are more examples.

@nixos-discourse
Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/crate2nix-setting-codegenunits-for-all-crates/53014/6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: rust General-purpose programming language emphasizing performance, type safety, and concurrency. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 12.first-time contribution This PR is the author's first one; please be gentle!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants