fix: stabilize chunk content hashes#9356
Closed
DaZuiZui wants to merge 2 commits into
Closed
Conversation
Merging this PR will not alter performance
Comparing Footnotes
|
6b0c778 to
4d9fd8a
Compare
4d9fd8a to
facde99
Compare
Member
hyf0
added a commit
that referenced
this pull request
May 19, 2026
…ded (#9444) ## Summary Fixes #9339: adding or removing an isolated entry currently changes the file names of every other chunk in the bundle, even when those chunks are byte-identical between builds. The reported example uses `entryFileNames` / `chunkFileNames` with `codeSplitting` and observes the rolldown runtime chunk's hash flip across builds that should produce identical output. ## Root cause Hash placeholders (`!~{000}~`, `!~{001}~`, …) are assigned in rendering order, so their numeric indices shift whenever the chunk graph changes. Those indices leak into the hash from two places in `finalize_chunks.rs`: 1. **Standalone content hash**: the chunk's raw content includes placeholders pointing at sibling chunks (e.g. `import "./shared-!~{001}~.js"`), so when a sibling's index moves, this chunk's content bytes — and therefore its content hash — change. 2. **Final hash**: `preliminary_filename` (which contains the chunk's own placeholder) is mixed in directly, so when the chunk's own index moves, its final hash moves too. ## Fix Mirrors Rollup's `generateFinalHashes`: 1. **Normalize generated placeholders before hashing content.** A new `visit_with_placeholders_defaulted` helper in `rolldown_utils::hash_placeholder` walks the chunk content and streams it into `Xxh3` byte-slice by byte-slice, replacing each rolldown-generated `!~{xxx}~` with a zero-filled placeholder of the same shape. Placeholders rolldown didn't generate (literals in user source code that just happen to match the shape) are emitted verbatim so changes to them still flow into the hash — same `placeholders.has(placeholder)` check Rollup performs in `replacePlaceholdersWithDefaultAndGetContainedPlaceholders`. Streaming avoids materializing a chunk-sized normalized `String` per chunk; for a typical bundle with ~MB chunks containing cross-chunk imports, that's ~bundle-size's worth of throwaway allocations per build avoided. 2. **Drop `preliminary_filename` from the final hash, deconflict at the file-name layer instead.** Two chunks whose resolved file names would collide (case-insensitively, for HFS+/NTFS safety) are now disambiguated by rehashing the colliding chunk's hash until its file name is unique — the same `do { … } while (collision)` loop Rollup uses. A design doc covering the three invariants (stability across builds, sensitivity to real content, uniqueness within a build), the two-phase parallel pipeline + sequential deconflict pass, and known limitations is added at `meta/design/chunk-hash.md`. ## Why a separate PR, not a follow-up to #9356 #9356 fixed the content-hash side of the bug but had two issues that made it not the right shape to land as-is: - It kept the buggy `preliminary_filename.hash(&mut hasher)` line under an `if options.sourcemap_debug_ids` branch, so users with `sourcemapDebugIds: true` would still hit the original non-determinism. - It removed `preliminary_filename` from the final hash in the default mode without adding any collision handling, regressing the case Rollup explicitly covers in its [`deconflict-hashes` test](https://github.com/rollup/rollup/tree/master/test/chunking-form/samples/hashing/deconflict-hashes) (two byte-identical entries + `entryFileNames: '[hash].js'` → Rollup produces two distinct file names via rehash; #9356 would silently produce two assets with the same name). This PR takes the same overall direction (normalize placeholders in content) but lands the rehash loop alongside it so both invariants hold simultaneously, and applies uniformly regardless of `sourcemapDebugIds`. Thanks to @DaZuiZui for the original investigation and for surfacing the issue clearly — they're credited as co-author on the commit. ## Tests - New `packages/rolldown/tests/behaviors/hash-stability.test.ts`: builds the same set of inputs twice, once with an extra unrelated entry, and asserts shared chunks (`rolldown-runtime`, `react`, `a`, `b`) keep the same file name and code. This is a multi-build invariant that single-build snapshot tests can't capture. - `cargo test -p rolldown_utils` covers `visit_with_placeholders_defaulted`, including the case where an unknown placeholder-shaped literal is emitted verbatim. - The `crates/rolldown/tests/rolldown/hash/content_include_placeholder` fixture (source `console.log('_shared-!~{003}~.js');`) doubles as a regression test for the user-literal carve-out — its snapshot would silently change if we ever started normalizing unknown markers again. - Existing hash-related snapshots have been refreshed; the changes are hash-only, content unchanged. ## Known CI failure: `Vite Test Ubuntu` The `Vite Test Ubuntu` check fails on two `js-sourcemap.spec.ts` cases (`sourcemap is correct when preload information is injected`, `sourcemap is correct when using object as "define" value`). These tests use `toMatchInlineSnapshot` with concrete chunk hash strings (e.g. `dynamic-foo-B4JkVMbo.js`, `after-preload-dynamic-Dxsmo7dM.js.map`) hardcoded in the Vite `rolldown-canary` branch — those values were produced by `main`'s hashing algorithm. This PR changes the algorithm, so the snapshots no longer match. This is purely a cross-repo snapshot-pinning artefact, not a regression in behavior: the sourcemaps themselves are correct, only the embedded chunk file-name hashes differ. The Vite-side inline snapshots will need to be refreshed once this PR lands and a new rolldown canary is published. Asking reviewers to accept this failure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: DaZuiZui <66861267+DaZuiZui@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
V1OL3TF0X
pushed a commit
to V1OL3TF0X/rolldown
that referenced
this pull request
May 25, 2026
…ded (rolldown#9444) Fixes rolldown#9339: adding or removing an isolated entry currently changes the file names of every other chunk in the bundle, even when those chunks are byte-identical between builds. The reported example uses `entryFileNames` / `chunkFileNames` with `codeSplitting` and observes the rolldown runtime chunk's hash flip across builds that should produce identical output. Hash placeholders (`!~{000}~`, `!~{001}~`, …) are assigned in rendering order, so their numeric indices shift whenever the chunk graph changes. Those indices leak into the hash from two places in `finalize_chunks.rs`: 1. **Standalone content hash**: the chunk's raw content includes placeholders pointing at sibling chunks (e.g. `import "./shared-!~{001}~.js"`), so when a sibling's index moves, this chunk's content bytes — and therefore its content hash — change. 2. **Final hash**: `preliminary_filename` (which contains the chunk's own placeholder) is mixed in directly, so when the chunk's own index moves, its final hash moves too. Mirrors Rollup's `generateFinalHashes`: 1. **Normalize generated placeholders before hashing content.** A new `visit_with_placeholders_defaulted` helper in `rolldown_utils::hash_placeholder` walks the chunk content and streams it into `Xxh3` byte-slice by byte-slice, replacing each rolldown-generated `!~{xxx}~` with a zero-filled placeholder of the same shape. Placeholders rolldown didn't generate (literals in user source code that just happen to match the shape) are emitted verbatim so changes to them still flow into the hash — same `placeholders.has(placeholder)` check Rollup performs in `replacePlaceholdersWithDefaultAndGetContainedPlaceholders`. Streaming avoids materializing a chunk-sized normalized `String` per chunk; for a typical bundle with ~MB chunks containing cross-chunk imports, that's ~bundle-size's worth of throwaway allocations per build avoided. 2. **Drop `preliminary_filename` from the final hash, deconflict at the file-name layer instead.** Two chunks whose resolved file names would collide (case-insensitively, for HFS+/NTFS safety) are now disambiguated by rehashing the colliding chunk's hash until its file name is unique — the same `do { … } while (collision)` loop Rollup uses. A design doc covering the three invariants (stability across builds, sensitivity to real content, uniqueness within a build), the two-phase parallel pipeline + sequential deconflict pass, and known limitations is added at `meta/design/chunk-hash.md`. made it not the right shape to land as-is: - It kept the buggy `preliminary_filename.hash(&mut hasher)` line under an `if options.sourcemap_debug_ids` branch, so users with `sourcemapDebugIds: true` would still hit the original non-determinism. - It removed `preliminary_filename` from the final hash in the default mode without adding any collision handling, regressing the case Rollup explicitly covers in its [`deconflict-hashes` test](https://github.com/rollup/rollup/tree/master/test/chunking-form/samples/hashing/deconflict-hashes) (two byte-identical entries + `entryFileNames: '[hash].js'` → Rollup produces two distinct file names via rehash; rolldown#9356 would silently produce two assets with the same name). This PR takes the same overall direction (normalize placeholders in content) but lands the rehash loop alongside it so both invariants hold simultaneously, and applies uniformly regardless of `sourcemapDebugIds`. Thanks to @DaZuiZui for the original investigation and for surfacing the issue clearly — they're credited as co-author on the commit. - New `packages/rolldown/tests/behaviors/hash-stability.test.ts`: builds the same set of inputs twice, once with an extra unrelated entry, and asserts shared chunks (`rolldown-runtime`, `react`, `a`, `b`) keep the same file name and code. This is a multi-build invariant that single-build snapshot tests can't capture. - `cargo test -p rolldown_utils` covers `visit_with_placeholders_defaulted`, including the case where an unknown placeholder-shaped literal is emitted verbatim. - The `crates/rolldown/tests/rolldown/hash/content_include_placeholder` fixture (source `console.log('_shared-!~{003}~.js');`) doubles as a regression test for the user-literal carve-out — its snapshot would silently change if we ever started normalizing unknown markers again. - Existing hash-related snapshots have been refreshed; the changes are hash-only, content unchanged. The `Vite Test Ubuntu` check fails on two `js-sourcemap.spec.ts` cases (`sourcemap is correct when preload information is injected`, `sourcemap is correct when using object as "define" value`). These tests use `toMatchInlineSnapshot` with concrete chunk hash strings (e.g. `dynamic-foo-B4JkVMbo.js`, `after-preload-dynamic-Dxsmo7dM.js.map`) hardcoded in the Vite `rolldown-canary` branch — those values were produced by `main`'s hashing algorithm. This PR changes the algorithm, so the snapshots no longer match. This is purely a cross-repo snapshot-pinning artefact, not a regression in behavior: the sourcemaps themselves are correct, only the embedded chunk file-name hashes differ. The Vite-side inline snapshots will need to be refreshed once this PR lands and a new rolldown canary is published. Asking reviewers to accept this failure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: DaZuiZui <66861267+DaZuiZui@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
augmentChunkHashcontributions in the hash calculationFixes #9339.
Tests
just testAI Usage
This PR was prepared with assistance from OpenAI Codex. I reviewed the issue, implementation, snapshots, and test results before submitting.