Turbopack: switch chunk/asset hashes from hex to base40 encoding#91137

Merged. sokra merged 10 commits into canary from sokra/base64-hashes on Mar 13, 2026.

@sokra sokra commented Mar 10, 2026

Member

What?

Switch Turbopack's hash encoding for chunk and asset output filenames from hexadecimal (base16) to base40, using the alphabet `0-9 a-z _ - ~ .`. Version hashes (used for HMR update comparison, not filenames) use base64 instead.

Why?

Base40 carries about 5.32 bits per character (log2 40) versus 4 bits for hex, so it encodes the same number of bits in fewer characters and produces shorter output filenames. All 40 characters are RFC 3986 unreserved (URL-safe) and safe on case-insensitive filesystems (macOS HFS+/APFS, Windows NTFS).

Hash truncation lengths are adjusted so that collision resistance is at least equivalent. Asset filename hashes grow to the full 13 characters to match chunk filenames:

| Context | Before (hex) | After (base40) | Entropy |
| --- | --- | --- | --- |
| Content hash in chunk filenames | 16 chars | 13 chars | ~69 bits |
| Content hash in asset filenames | 8 chars | 13 chars | ~69 bits |
| Ident disambiguator hash | 8 chars | 7 chars | ~37 bits |
| Long-path prefix hash | 5 chars | 4 chars | ~21 bits |
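
The entropy figures in the table can be reproduced with a one-line calculation: each character of a base-N encoding carries log2(N) bits. The following is a minimal sketch for checking the arithmetic; `entropy_bits` is a hypothetical helper name and is not part of the PR.

```rust
/// Approximate number of bits carried by `chars` characters of a
/// base-`base` encoding (hypothetical helper, for illustration only).
fn entropy_bits(base: u32, chars: u32) -> f64 {
    // Each character contributes log2(base) bits.
    f64::from(chars) * f64::from(base).log2()
}
```

With this helper, `entropy_bits(40, 13)` is about 69.2, `entropy_bits(40, 7)` about 37.3, and `entropy_bits(40, 4)` about 21.3, while 16, 8, and 5 hex characters give 64, 32, and 20 bits respectively.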

How?

New encoding module (`turbo-tasks-hash/src/base40.rs`):

  • Defines the base40 alphabet and length constants (`BASE40_LEN_64 = 13`, `BASE40_LEN_128 = 25`)
  • Implements a generic `encode_base40_fixed` helper to avoid duplication
  • Public API: `encode_base40(u64) -> String` and `encode_base40_128(u128) -> String`
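
As a rough illustration of what such a module might look like, here is a minimal sketch of a fixed-width base40 encoder. The alphabet order and the function names come from this PR; the body, the most-significant-digit-first order, and the left zero-padding are assumptions, and the actual `base40.rs` may differ.

```rust
/// The 40-character alphabet named in the PR: 0-9 a-z _ - ~ .
const BASE40_ALPHABET: &[u8; 40] = b"0123456789abcdefghijklmnopqrstuvwxyz_-~.";

/// Width of a fully encoded 64-bit / 128-bit hash, as listed in the PR.
const BASE40_LEN_64: usize = 13;
const BASE40_LEN_128: usize = 25;

/// Sketch of a fixed-width helper: always emits exactly N digits,
/// padded on the left with '0' (assumed digit order and padding).
fn encode_base40_fixed<const N: usize>(mut value: u128) -> String {
    let mut buf = [b'0'; N];
    // Fill from the least significant digit backwards.
    for slot in buf.iter_mut().rev() {
        *slot = BASE40_ALPHABET[(value % 40) as usize];
        value /= 40;
    }
    debug_assert!(value == 0, "value does not fit in N digits");
    // Every byte comes from the ASCII alphabet, so this is valid UTF-8.
    String::from_utf8(buf.to_vec()).expect("alphabet is ASCII")
}

fn encode_base40(hash: u64) -> String {
    encode_base40_fixed::<BASE40_LEN_64>(u128::from(hash))
}

fn encode_base40_128(hash: u128) -> String {
    encode_base40_fixed::<BASE40_LEN_128>(hash)
}
```

A fixed width keeps filenames the same length for every hash and makes truncation to a prefix straightforward.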

New base64 encoding (`turbo-tasks-hash/src/base64.rs`):

  • `encode_base64(u64) -> String` — 11-char base64 (no padding) for version hashes
  • Version hashes don't appear in URLs or filenames, so base64 is safe and shorter
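
The 11-character width follows from 64 bits divided into 6-bit groups (ceil(64 / 6) = 11). The sketch below shows one way to do this without an external crate. It assumes the standard base64 alphabet and big-endian bit order, neither of which is confirmed by the PR text.

```rust
/// Standard base64 alphabet (an assumption; the PR does not say which
/// alphabet `encode_base64` uses).
const BASE64_ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Sketch: encode a u64 as 11 base64 characters without `=` padding.
fn encode_base64(hash: u64) -> String {
    // 11 characters hold 66 bits. Left-align the 64 hash bits and pad with
    // two zero bits, as standard base64 does for a trailing partial group.
    let bits = u128::from(hash) << 2;
    let mut out = String::with_capacity(11);
    for i in (0..11u32).rev() {
        let index = ((bits >> (i * 6)) & 0x3f) as usize;
        out.push(BASE64_ALPHABET[index] as char);
    }
    out
}
```

Under these assumptions the output matches standard base64 of the eight big-endian bytes with the trailing `=` removed.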

New `HashAlgorithm` variants (`turbo-tasks-hash/src/lib.rs`):

  • `Xxh3Hash64Base40` and `Xxh3Hash128Base40` added alongside existing hex variants
  • Existing hex variants kept for internal manifests and identifiers

`ContentHashing` moved to `turbopack-core`:

  • Moved from `turbopack-browser` to `turbopack-core/src/chunk/mod.rs` so both `BrowserChunkingContext` and `NodeJsChunkingContext` can use it

Separate chunk vs asset content hashing:

  • `BrowserChunkingContext`: `content_hashing` renamed to `chunk_content_hashing` (optional), new `asset_content_hashing: ContentHashing` field (non-optional, defaults to 13 chars)
  • `NodeJsChunkingContext`: new `asset_content_hashing: ContentHashing` field (non-optional, defaults to 13 chars)
  • Builder methods: `use_content_hashing()` renamed to `chunk_content_hashing()`, new `asset_content_hashing()`
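
To make the role of `ContentHashing::Direct { length }` concrete, here is a simplified sketch of how a length setting could truncate an encoded hash when building an asset filename. The enum below is a stand-in with a single variant; the field type, the `hashed_asset_name` helper, and the `stem.hash.ext` layout are all hypothetical and not taken from the Turbopack source.

```rust
/// Simplified stand-in for the enum described in the PR (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ContentHashing {
    /// Use the first `length` characters of the encoded content hash.
    Direct { length: u8 },
}

/// The PR states that asset content hashing defaults to 13 characters.
const DEFAULT_ASSET_CONTENT_HASHING: ContentHashing = ContentHashing::Direct { length: 13 };

/// Hypothetical helper: build `stem.hash.ext` from an already encoded hash.
fn hashed_asset_name(stem: &str, ext: &str, full_hash: &str, hashing: ContentHashing) -> String {
    let ContentHashing::Direct { length } = hashing;
    // The base40 alphabet is ASCII, so slicing by byte index is safe here.
    let keep = full_hash.len().min(usize::from(length));
    let short_hash = &full_hash[..keep];
    format!("{stem}.{short_hash}.{ext}")
}
```

Making the asset setting non-optional means a call like this never has to decide what to do when no hashing is configured.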

Version hashes switched to base64:

  • `turbopack-nodejs/src/ecmascript/node/version.rs`
  • `turbopack-dev-server/src/html.rs`
  • `turbopack-browser/src/ecmascript/version.rs`, `merged/version.rs`, `list/version.rs`

Other callers updated (15 files across turbopack and next-core):

  • All chunk/asset content hashing switched from `Xxh3Hash128Hex` → `Xxh3Hash128Base40`
  • `ContentHashing::Direct { length }` reduced from 16 → 13
  • Asset path truncations use full 13-char base40 hash (matching chunk filenames)

Exception — `wasm_edge_var_name` (`turbopack-wasm/src/lib.rs`):

  • Kept as `Xxh3Hash128Hex` because the hash is used as part of a JavaScript variable name (`wasm_{hash}`), and base40 characters `-`, `~`, `.` are not valid JS identifier characters.
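
A small check illustrates why hex is safe here and base40 is not. The predicate below is a hypothetical helper, restricted to ASCII for simplicity, that accepts only characters allowed in a JavaScript identifier after its first character.

```rust
/// Hypothetical check: true if `s` is non-empty and every character may
/// appear after the first position of an ASCII-only JavaScript identifier
/// (letters, digits, `_`, `$`).
fn is_ascii_js_identifier_part(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$')
}
```

Any hex string passes this check, whereas a base40 string passes only if it happens to contain none of `-`, `~`, or `.`.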

Scope — NOT changed:

  • Webpack configuration (unchanged)
  • Internal manifests (`routes_hashes_manifest`, `project_asset_hashes_manifest`)
  • Internal identifiers (font naming, external module hashing, data URI sources, debug IDs)
  • SRI hashes (SHA-based Base64, different purpose)

@nextjs-bot added the `created-by: Turbopack team` and `Turbopack` labels Mar 10, 2026
@sokra sokra force-pushed the sokra/base64-hashes branch from 6aa3f4c to a8e5979 Compare March 10, 2026 05:14
Comment thread turbopack/crates/turbopack-wasm/src/lib.rs Outdated

nextjs-bot commented Mar 10, 2026

Contributor

Tests Passed


codspeed-hq Bot commented Mar 10, 2026


Merging this PR will not alter performance

✅ 17 untouched benchmarks
⏩ 3 skipped benchmarks (see footnote 1)


Comparing sokra/base64-hashes (f37a37f) with canary (236a76d)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.


nextjs-bot commented Mar 10, 2026

Contributor

Allow CI Workflow Run

  • approve CI run for commit: 9acc28a

Note: this should only be enabled once the PR is ready to go and can only be enabled by a maintainer

Comment thread turbopack/crates/turbopack-nodejs/src/ecmascript/node/version.rs Outdated
Comment thread turbopack/crates/turbopack-dev-server/src/html.rs
Comment thread turbopack/crates/turbopack-browser/src/chunking_context.rs Outdated
@sokra sokra force-pushed the sokra/base64-hashes branch from bced220 to 70a7b11 Compare March 12, 2026 10:33

nextjs-bot commented Mar 12, 2026

Contributor

Stats from current PR

✅ No significant changes detected

📊 All Metrics
📖 Metrics Glossary

Dev Server Metrics:

  • Listen = TCP port starts accepting connections
  • First Request = HTTP server returns successful response
  • Cold = Fresh build (no cache)
  • Warm = With cached build artifacts

Build Metrics:

  • Fresh = Clean build (no .next directory)
  • Cached = With existing .next directory

Change Thresholds:

  • Time: Changes < 50ms AND < 10%, OR < 2% are insignificant
  • Size: Changes < 1KB AND < 1% are insignificant
  • All other changes are flagged to catch regressions

⚡ Dev Server

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Cold (Listen) | 456ms | 455ms | | ▁▂▁▆▅ |
| Cold (Ready in log) | 439ms | 438ms | | ▁▁▁▄▃ |
| Cold (First Request) | 1.284s | 1.284s | | ▁▂▁▄▃ |
| Warm (Listen) | 456ms | 457ms | | ▁▂▁▅▅ |
| Warm (Ready in log) | 448ms | 443ms | | ▁▁▁▄▄ |
| Warm (First Request) | 349ms | 347ms | | ▁▁▁▂▃ |

📦 Dev Server (Webpack) (Legacy)

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Cold (Listen) | 456ms | 455ms | | ▁▁▃▃▁ |
| Cold (Ready in log) | 437ms | 435ms | | ▃▂▆▅▂ |
| Cold (First Request) | 1.855s | 1.873s | | ▂▁▅▄▁ |
| Warm (Listen) | 456ms | 456ms | | ▁▁▅▅▁ |
| Warm (Ready in log) | 435ms | 435ms | | ▃▃▆▅▃ |
| Warm (First Request) | 1.892s | 1.864s | | ▁▁▆▄▁ |

⚡ Production Builds

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Fresh Build | 3.895s | 3.838s | | ▁▂▁▄▃ |
| Cached Build | 3.906s | 3.874s | | ▁▂▁▄▃ |

📦 Production Builds (Webpack) (Legacy)

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Fresh Build | 14.172s | 14.169s | | ▁▁▆▄▁ |
| Cached Build | 14.381s | 14.403s | | ▁▁▆▄▁ |
| node_modules Size | 482 MB | 482 MB | | ▁▁▁▁▁ |

📦 Bundle Sizes

⚡ Turbopack

Client

Main Bundles: **408 kB** → **408 kB** ✅ -19 B

80 files with content-based hashes (individual files not comparable between builds)

Server

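Middleware

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| middleware-b..fest.js gzip | 762 B | 759 B | |
| Total | 762 B | 759 B | ✅ -3 B |

Build Details

Build Manifests

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| _buildManifest.js gzip | 450 B | 451 B | |
| Total | 450 B | 451 B | ⚠️ +1 B |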

📦 Webpack

Client

Main Bundles

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| 5528-HASH.js gzip | 5.54 kB | N/A | - |
| 6280-HASH.js gzip | 59.9 kB | N/A | - |
| 6335.HASH.js gzip | 169 B | N/A | - |
| 912-HASH.js gzip | 4.59 kB | N/A | - |
| e8aec2e4-HASH.js gzip | 62.7 kB | N/A | - |
| framework-HASH.js gzip | 59.7 kB | 59.7 kB | |
| main-app-HASH.js gzip | 256 B | 253 B | 🟢 3 B (-1%) |
| main-HASH.js gzip | 39.2 kB | 39.2 kB | |
| webpack-HASH.js gzip | 1.68 kB | 1.68 kB | |
| 262-HASH.js gzip | N/A | 4.59 kB | - |
| 2889.HASH.js gzip | N/A | 169 B | - |
| 5602-HASH.js gzip | N/A | 5.55 kB | - |
| 6948ada0-HASH.js gzip | N/A | 62.7 kB | - |
| 9544-HASH.js gzip | N/A | 60.6 kB | - |
| Total | 234 kB | 235 kB | ⚠️ +720 B |

Polyfills

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| polyfills-HASH.js gzip | 39.4 kB | 39.4 kB | |
| Total | 39.4 kB | 39.4 kB | |

Pages

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| _app-HASH.js gzip | 194 B | 194 B | |
| _error-HASH.js gzip | 183 B | 180 B | 🟢 3 B (-2%) |
| css-HASH.js gzip | 331 B | 330 B | |
| dynamic-HASH.js gzip | 1.81 kB | 1.81 kB | |
| edge-ssr-HASH.js gzip | 256 B | 256 B | |
| head-HASH.js gzip | 351 B | 352 B | |
| hooks-HASH.js gzip | 384 B | 383 B | |
| image-HASH.js gzip | 580 B | 581 B | |
| index-HASH.js gzip | 260 B | 260 B | |
| link-HASH.js gzip | 2.51 kB | 2.51 kB | |
| routerDirect..HASH.js gzip | 320 B | 319 B | |
| script-HASH.js gzip | 386 B | 386 B | |
| withRouter-HASH.js gzip | 315 B | 315 B | |
| 1afbb74e6ecf..834.css gzip | 106 B | 106 B | |
| Total | 7.98 kB | 7.98 kB | ✅ -1 B |

Server

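Edge SSR

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| edge-ssr.js gzip | 125 kB | 125 kB | |
| page.js gzip | 267 kB | 267 kB | |
| Total | 392 kB | 392 kB | ✅ -240 B |

Middleware

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| middleware-b..fest.js gzip | 615 B | 614 B | |
| middleware-r..fest.js gzip | 156 B | 155 B | |
| middleware.js gzip | 44 kB | 43.9 kB | |
| edge-runtime..pack.js gzip | 842 B | 842 B | |
| Total | 45.6 kB | 45.5 kB | ✅ -61 B |

Build Details

Build Manifests

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| _buildManifest.js gzip | 715 B | 718 B | |
| Total | 715 B | 718 B | ⚠️ +3 B |

Build Cache

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| 0.pack gzip | 4.24 MB | 4.23 MB | 🟢 8.01 kB (0%) |
| index.pack gzip | 108 kB | 109 kB | |
| index.pack.old gzip | 107 kB | 108 kB | |
| Total | 4.45 MB | 4.45 MB | ✅ -6.59 kB |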

🔄 Shared (bundler-independent)

Runtimes

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| app-page-exp...dev.js gzip | 332 kB | 332 kB | |
| app-page-exp..prod.js gzip | 180 kB | 180 kB | |
| app-page-tur...dev.js gzip | 332 kB | 332 kB | |
| app-page-tur..prod.js gzip | 180 kB | 180 kB | |
| app-page-tur...dev.js gzip | 328 kB | 328 kB | |
| app-page-tur..prod.js gzip | 178 kB | 178 kB | |
| app-page.run...dev.js gzip | 329 kB | 329 kB | |
| app-page.run..prod.js gzip | 178 kB | 178 kB | |
| app-route-ex...dev.js gzip | 76 kB | 76 kB | |
| app-route-ex..prod.js gzip | 51.7 kB | 51.7 kB | |
| app-route-tu...dev.js gzip | 76 kB | 76 kB | |
| app-route-tu..prod.js gzip | 51.7 kB | 51.7 kB | |
| app-route-tu...dev.js gzip | 75.6 kB | 75.6 kB | |
| app-route-tu..prod.js gzip | 51.5 kB | 51.5 kB | |
| app-route.ru...dev.js gzip | 75.5 kB | 75.5 kB | |
| app-route.ru..prod.js gzip | 51.4 kB | 51.4 kB | |
| dist_client_...dev.js gzip | 324 B | 324 B | |
| dist_client_...dev.js gzip | 326 B | 326 B | |
| dist_client_...dev.js gzip | 318 B | 318 B | |
| dist_client_...dev.js gzip | 317 B | 317 B | |
| pages-api-tu...dev.js gzip | 43.3 kB | 43.3 kB | |
| pages-api-tu..prod.js gzip | 33 kB | 33 kB | |
| pages-api.ru...dev.js gzip | 43.3 kB | 43.3 kB | |
| pages-api.ru..prod.js gzip | 33 kB | 33 kB | |
| pages-turbo....dev.js gzip | 52.7 kB | 52.7 kB | |
| pages-turbo...prod.js gzip | 38.6 kB | 38.6 kB | |
| pages.runtim...dev.js gzip | 52.7 kB | 52.7 kB | |
| pages.runtim..prod.js gzip | 38.6 kB | 38.6 kB | |
| server.runti..prod.js gzip | 62.4 kB | 62.4 kB | |
| Total | 2.95 MB | 2.95 MB | ✅ -5 B |

📝 Changed Files (2 files)

Files with changes:

  • pages-api.runtime.dev.js
  • pages.runtime.dev.js

📎 Tarball URL
https://vercel-packages.vercel.app/next/commits/f37a37f6aebba8bbe6043a34b3f484acf8279a80/next

@sokra sokra marked this pull request as ready for review March 12, 2026 11:30
@sokra sokra requested a review from mischnic March 12, 2026 11:35
Comment thread crates/next-core/src/next_client/context.rs Outdated
Comment thread turbopack/crates/turbopack-browser/src/chunking_context.rs Outdated
Comment thread turbopack/crates/turbo-tasks-hash/src/base40.rs Outdated
Comment thread turbopack/crates/turbo-tasks-hash/src/base40.rs Outdated
sokra and others added 9 commits March 13, 2026 19:19
Use a base40 alphabet (0-9 a-z _ - ~ .) for hash encoding in output
filenames instead of hexadecimal. This produces shorter hashes while
maintaining equivalent collision resistance:
- 16 hex chars → 13 base40 chars (~64 bits)
- 8 hex chars → 7 base40 chars (~32 bits)
- 5 hex chars → 4 base40 chars (~20 bits)

The alphabet is URL-safe (RFC 3986 unreserved) and filesystem-safe on
all OSes including case-insensitive filesystems.

Internal manifests and identifiers remain hex-encoded.

- Extract shared `encode_base40_fixed<N>` generic helper to eliminate
  duplication between `encode_base40` and `encode_base40_128`
- Export `BASE40_LEN_64` and `BASE40_LEN_128` constants for the full
  hash string widths
- Add comments at truncation sites documenting the approximate
  bit-strength of each truncated hash
- Extract `short_hash` variable in asset_path methods to avoid
  repeating the truncation slice in both format! arms
- Update module doc comment to mention base40 encoding

Match chunk filename hash length (13 base40 chars ≈ 69 bits) for asset
filenames instead of the previous 7-char truncation.

…variable names containing invalid identifier characters (`-`, `~`, `.`), causing syntax errors in generated WASM loader code.

This commit fixes the issue reported at turbopack/crates/turbopack-wasm/src/lib.rs:29

**Bug explanation:**

The `wasm_edge_var_name` function generates a JavaScript variable name of the form `wasm_{hash}`. This variable name is interpolated directly into JavaScript code in `loader.rs` at two call sites (lines 51 and 78), appearing in expressions like:

```js
const { exports } = await __turbopack_wasm__(wasmPath, () => wasm_abc123, imports);
```

and:

```js
const mod = await __turbopack_wasm_module__(wasmPath, () => wasm_abc123);
```

The hash algorithm was changed from `Xxh3Hash128Hex` to `Xxh3Hash128Base40`. The base40 alphabet is `0123456789abcdefghijklmnopqrstuvwxyz_-~.`, which includes `-`, `~`, and `.`, none of which are valid JavaScript identifier characters. Assuming uniformly distributed characters, a 25-character base40 hash has roughly an 86% chance (1 - (37/40)^25) of containing at least one of these three characters, so most generated variable names would fail to parse.

For example, a generated variable name like `wasm_abc-def~ghi.jkl` would produce invalid JavaScript:
```js
() => wasm_abc-def~ghi.jkl  // SyntaxError: `-` is subtraction, `~` is bitwise NOT, `.` is property access
```

**Fix explanation:**

Reverted the hash algorithm for `wasm_edge_var_name` back to `Xxh3Hash128Hex`, which only produces `0-9a-f` characters — all valid in JavaScript identifiers. Other base40 usages (for filenames, version strings, output asset paths) are correct and left unchanged, as the base40 alphabet is URL-safe and filesystem-safe for those contexts.

Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
Co-authored-by: sokra <tobias.koppers@googlemail.com>

…to base64

- Move ContentHashing enum from turbopack-browser to turbopack-core
  so both Browser and NodeJs chunking contexts can use it
- Rename content_hashing -> chunk_content_hashing in BrowserChunkingContext
- Add separate asset_content_hashing field to both BrowserChunkingContext
  and NodeJsChunkingContext
- Rename builder method use_content_hashing() -> chunk_content_hashing()
- Add builder method asset_content_hashing()
- Switch all version hashes from base40 to base64 encoding since
  version identifiers don't need URL/filesystem safety
- Add encode_base64 helper for u64 -> base64 encoding

asset_content_hashing is always needed for asset paths, so make it a
plain ContentHashing instead of Option<ContentHashing>. Defaults to
ContentHashing::Direct { length: 13 } (69 bits of collision resistance).

Regenerate all Turbopack snapshot test files to match new base40 hash
format. Update test helper regexes (stripTestHash, stripVercelPngHash)
and inline patterns from [0-9a-f] to [0-9a-z_.~-] to match the base40
character set.

…age hashes

Co-Authored-By: Claude <noreply@anthropic.com>

…oded comment

- Remove `.asset_content_hashing(ContentHashing::Direct { length: 13 })` from
  BrowserChunkingContext builder since 13 is already the default
- Remove hardcoded "13 base40 chars" comments from asset_path() in both
  BrowserChunkingContext and NodeJsChunkingContext since length is now dynamic

Replace hardcoded `BASE40_LEN_64 = 13` and `BASE40_LEN_128 = 25` with
a const fn `digits_for_bits()` that computes the number of base-N digits
needed to represent all values of a given bit width. Static assertions
verify the computed values match expectations.
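
The idea in this last commit can be sketched as follows. The function name `digits_for_bits` and the constant names come from the commit message; the parameter list and the body are assumptions, so the real implementation may take different arguments.

```rust
/// Number of base-`base` digits needed to represent every value that fits
/// in `bits` bits (assumed signature; only the name is from the commit).
const fn digits_for_bits(base: u128, bits: u32) -> usize {
    assert!(base >= 2 && bits >= 1 && bits <= 128);
    // Largest value of the given width, e.g. 2^64 - 1 for bits = 64.
    let mut max = u128::MAX >> (128 - bits);
    let mut digits = 0;
    // Count how many times the maximum can be divided by the base.
    while max > 0 {
        max /= base;
        digits += 1;
    }
    digits
}

const BASE40_LEN_64: usize = digits_for_bits(40, 64);
const BASE40_LEN_128: usize = digits_for_bits(40, 128);

// Static assertions: compilation fails if the computed widths ever drift.
const _: () = assert!(BASE40_LEN_64 == 13);
const _: () = assert!(BASE40_LEN_128 == 25);
```

Deriving the widths at compile time keeps the constants consistent with the alphabet size while the assertions still document the expected values.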
@sokra sokra force-pushed the sokra/base64-hashes branch from fe36984 to f37a37f Compare March 13, 2026 19:19
@sokra sokra enabled auto-merge (squash) March 13, 2026 20:11
@sokra sokra merged commit e22988e into canary Mar 13, 2026
283 of 287 checks passed
@sokra sokra deleted the sokra/base64-hashes branch March 13, 2026 20:38
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Mar 28, 2026
Labels

`created-by: Turbopack team`, `locked`, `tests`, `Turbopack`

4 participants