perf: remove the url crate (-226 KB)#9790
Closed
Boshen wants to merge 2 commits into
Closed
Conversation
`url` pulls in `idna` -> `idna_adapter` -> ICU4X purely for internationalized-domain-name handling that rolldown never exercises (it only parses `file://`, sourcemap and option URLs, all ASCII). Replace the three `Url` call sites with targeted parsing and drop the dependency: - file_url.rs: parse `file://` URLs by hand (host/path/query/fragment), keeping the existing fileURLToPath posix/windows logic and rejection rules. Adds unit tests. - normalize_binding_options.rs: validate `sourcemapBaseUrl` is an absolute URL and append a trailing `/`, without `Url::set_path`. Adds unit tests. - process_code_and_sourcemap.rs: the base is normalized to end with `/`, so joining the relative map filename is a plain append. Removes `url` + `idna` + `idna_adapter` + the ICU4X stack from the shipped binary (idna/ICU remain only for the jsonschema test-dependency). Measured (aarch64 LTO release): librolldown_binding.dylib 23,675,392 -> 23,443,904 bytes (-226 KB).
✅ Deploy Preview for rolldown-rs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
✅ Deploy Preview for rolldown-rs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
This was referenced Jun 16, 2026
Member
Author
graphite-app Bot
pushed a commit
that referenced
this pull request
Jun 17, 2026
…29 KB) (#9811) ## Disable `idna`'s ICU backend instead of removing `url` `url` is pulled in **only** by rolldown crates, and it drags the entire IDNA/Unicode stack along: ``` url -> idna -> idna_adapter -> ICU4X (icu_normalizer / icu_properties + *_data + zerovec / yoke / tinystr / ... — ~20 crates) ``` That ships **~129 KB** of Unicode tables in `librolldown_binding.node` purely for internationalized-domain-name handling (`münchen.de` → `xn--mnchen-3ya.de`) that rolldown never needs. `url` mandates `idna` for special-scheme hosts (incl. `file:`) and has **no feature** to turn it off. ### This PR — the least-invasive option [`idna_adapter`](https://docs.rs/crate/idna_adapter) is the indirection layer that lets `idna` pick a Unicode backend. Its **1.0.0** release is the official *no-ICU* backend (zero dependencies); **1.1+** switched to ICU4X. Pinning it to `=1.0.0` makes `idna` resolve to the ASCII-only implementation and drops the entire ICU4X stack from the build — while **`url` and its WHATWG behaviour stay exactly as-is (zero new code)**. - `Cargo.toml`: `idna_adapter = "=1.0.0"` in `[workspace.dependencies]`, referenced from `rolldown_plugin_vite_resolve` (the crate behind the `url` edge). - The exact `=1.0.0` requirement makes any ICU-backed version **unresolvable**, so a blanket `cargo update` can't silently pull it back in — verified: `cargo update -p idna_adapter --precise 1.2.2` is rejected with `failed to select a version for the requirement idna_adapter = "=1.0.0"`. - Excluded from `cargo-shear` (`[package.metadata.cargo-shear]`) since it's a build-only pin we never import. ### Trade-off Per the [`idna_adapter` docs](https://docs.rs/crate/idna_adapter/latest#:~:text=Turning%20off%20IDNA%20support), the no-Unicode backend rejects non-ASCII domain inputs and skips UTS-46 enforcement on Punycode labels. rolldown only ever feeds `url` ASCII (file paths / sourcemap URLs), so this is a no-op in practice. ### Impact (darwin-arm64, fat-LTO release, same-machine A/B) | build | `rolldown-binding.node` | saved | |---|---|---| | baseline (`url` + `idna` + ICU4X) | 23,409,856 | — | | **this PR (no-ICU `idna_adapter`)** | **23,277,696** | **−132,160 bytes (−129 KB)** | The `icu_*` / `zerovec` / `yoke` / `tinystr` / … crates are gone from the build (`idna` + ICU remain in `Cargo.lock` only behind the `jsonschema` **test** dependency). The −132,160-byte saving is byte-for-byte the same as the "keep `url`" row measured in #9794, confirming the official `idna_adapter 1.0.0` matches a vendored ASCII stub. ### Relationship to #9790 / #9794 Both #9790 (hand-rolled parser, −226 KB) and #9794 (`iri-string`, −210 KB) removed `url` entirely and explicitly flagged this as the safe, minimal alternative. This captures ~57% of the maximum saving with **no behaviour change and no URL-parsing code to own**, so it supersedes both — they're closed in favour of this.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Do we really need the
urlcrate?This is a draft to ask exactly that.
urlis one of our heavier dependencies relative to what it actually does for us.What it costs
urlis pulled in only by rolldown crates, and it drags the entire IDNA/Unicode stack along:That is ~226 KB of the shipped
librolldown_binding.node(__constUnicode tables +url/idnacode), all for internationalized-domain-name handling (münchen.de->xn--mnchen-3ya.de) that rolldown never needs.urlmandatesidnafor special-scheme hosts (incl.file:) and has no feature to turn it off.What we actually use
urlfor (3 sites)rolldown_plugin_vite_resolve/src/file_url.rsfile://URLs -> path (Vite'sfileURLToPath)rolldown_binding/.../normalize_binding_options.rssourcemapBaseUrl+ ensure trailing/rolldown/.../process_code_and_sourcemap.rssourcemapBaseUrlAll three only ever see ASCII (file paths / sourcemap URLs), never IDN hosts.
This PR
Replaces those three sites with targeted parsing and drops
url:file://splitter (host / path / query / fragment), keeping the exact posix/windowsfileURLToPathrules and the%2F/ hostname rejectionsnormalize_sourcemap_base_url(): validatescheme://…and append a trailing//, so the join becomes a plain append11 new unit tests, all passing; matches the existing
output/sourcemap-base-urlfixture's expected//# sourceMappingURL=…output.Impact (aarch64, LTO release, same worktree A/B)
url+idna+ ICU4X)urlremoved)url/idna/idna_adapter/icu_*are gone from the binary (nmconfirms 0 symbols; they remain inCargo.lockonly behind thejsonschematest dependency).Open questions (why this is a draft)
url's WHATWG-compliant implementation. Unit tests cover the common cases, buturl's edge cases (path./..normalization, exotic percent-encoding) are no longer guaranteed. Acceptable forsourcemapBaseUrl, a public option that mirrors Rollup?file://path.urland only swapidna's ICU backend for a tiny ASCII-onlyidna_adaptervia[patch.crates-io]— saves ~129 KB (57% of this) with zero change tourl's behavior. Easy to switch to if we'd rather not bring URL handling in-house.Opening as a draft to get a call on the size-vs-correctness trade before polishing.