fix: Use BTreeMap for deterministic lockfile dependency ordering#12254
Merged
Conversation
Replace HashMap with BTreeMap throughout the lockfile dependency resolution pipeline to ensure hashOfExternalDependencies is deterministic across turbo invocations. HashMap uses per-process random hash seeds (RandomState), so each turbo invocation iterates lockfile dependency entries in a different order. This non-determinism propagated through the transitive closure computation and parallel DashMap caching, producing different hashes for the same unchanged lockfile. The fix changes the Lockfile trait's all_dependencies return type from HashMap to BTreeMap, and updates all 5 lockfile implementations (pnpm, npm, yarn1, berry, bun) plus callers. For pnpm specifically, the internal data was already stored as BTreeMap — the dependencies() method was needlessly converting it to HashMap. Closes #12252
Contributor
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
github-actions Bot
added a commit
that referenced
this pull request
Mar 12, 2026
## Release v2.8.17-canary.5 Versioned docs: https://v2-8-17-canary-5.turborepo.dev ### Changes - release(turborepo): 2.8.17-canary.4 (#12249) (`c0ddebe`) - fix: Use BTreeMap for deterministic lockfile dependency ordering (#12254) (`3fc7d49`) Co-authored-by: Turbobot <turbobot@vercel.com>
anthonyshew
added a commit
that referenced
this pull request
Mar 12, 2026
## Summary - Switches `bundled_deps` in `PackageIndex` from `HashMap` to `BTreeMap` for deterministic iteration order in `find_package()` ## Context Follow-up to #12254. `find_package()` iterates `bundled_deps` and returns the first match when a package isn't found via workspace-scoped or top-level lookups. With `HashMap`, the per-process random seed produces different iteration orders across `turbo` invocations, making the result non-deterministic when multiple parents bundle the same dependency name. This is unlikely to trigger in practice (bundled deps are rare, and having the same dep bundled by multiple parents is rarer), but it closes the last remaining `HashMap` iteration on a hash-affecting code path across all five lockfile implementations. No performance impact — `bundled_deps` is almost always empty. Closes #12252
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
hashOfExternalDependenciesacrossturboinvocations (Non-deterministic hashOfExternalDependencies with pnpm lockfile v9 #12252)HashMapwithBTreeMapthroughout the lockfile dependency resolution pipeline (trait + all 5 implementations + callers)Why
HashMapuses per-process random hash seeds (RandomState), so eachturboinvocation iterates lockfile dependency entries in a different order. This propagated through the transitive closure computation and parallelDashMapcaching inall_transitive_closures, producing different external dependency hashes for the same unchanged lockfile.The
PackageSnapshotV7::dependencies()method was particularly ironic: the internal data was already stored asBTreeMap(viatype Map<K, V> = BTreeMap<K, V>), but the method converted it into aHashMapbefore returning, needlessly discarding the deterministic ordering.Testing
The
test_dependency_index_is_deterministically_orderedtest parses the same lockfile 50 times and asserts thatall_dependencies()always returns entries in the same order. Before this change, it found 50 distinct iteration orders from 50 parses. This tests goes out of its way to force non-determinism into these codepaths.How did we not have coverage over this before?
We have many places where we ensure our hashes are deterministic, but this slipped through for a tricky reason. Tests are single-process, and HashMap happens to use seeds from the same thread-local RNG. The
DashMaps there were introduced for performance interleave non-determinstically, but the inputs were always deterministic because of the thread scheduling.Outside of the testing environment, there are, of course, more threads that vary across process invocations, so this bug shows up. We were testing that the inputs into hashes were deterministic, but not the ordering of those inputs.
Closes #12252