perf(utils): avoid wasted allocation in legitimize_identifier_name #9926
Merged
f0cf573 to
ed4d4c1
Merging this PR will not alter performance
IWANABETHATGUY
approved these changes
Jun 22, 2026
Member
Merge activity
…9926)

`legitimize_identifier_name` allocated `String::with_capacity(name.len())` at the top of the function, but the common case (a name that is already a valid identifier) returns `Cow::Borrowed(name)` early and never touches that String. So every already-valid identifier allocated and immediately freed a String.

Move the allocation past the early return so it only happens when the name actually needs legitimizing.

Measured with a dhat heap profile of a three.js bundle:

- `legitimize_identifier_name`: **1,334 → 256 allocations (−81%)**; whole-bundle 84,652 → 83,566.

No behaviour change (all 57 `rolldown_utils` tests pass); identical output, the String is just allocated later.
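For readers who want to see the shape of the fix, here is a minimal sketch of the "allocate only after the early return" pattern. It is not the actual rolldown code: the function name `legitimize`, the helpers `is_valid_start` / `is_valid_part`, and the ASCII-only validity rule are all assumptions made for illustration.

```rust
use std::borrow::Cow;

// Hypothetical, simplified validity rules (ASCII only). The real function
// presumably follows the full identifier grammar; this is just a stand-in.
fn is_valid_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_' || c == '$'
}

fn is_valid_part(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_' || c == '$'
}

/// Illustrative stand-in for the pattern described above.
fn legitimize(name: &str) -> Cow<'_, str> {
    // Fast path: validate first, without allocating anything.
    let mut chars = name.chars();
    let already_valid = match chars.next() {
        Some(first) => is_valid_start(first) && chars.all(is_valid_part),
        None => false,
    };
    if already_valid {
        // Common case: hand back the input untouched.
        return Cow::Borrowed(name);
    }

    // Slow path: the String is created only here, past the early return.
    let mut out = String::with_capacity(name.len());
    for (i, c) in name.chars().enumerate() {
        let ok = if i == 0 { is_valid_start(c) } else { is_valid_part(c) };
        out.push(if ok { c } else { '_' });
    }
    if out.is_empty() {
        // An empty name is not an identifier; fall back to a placeholder.
        out.push('_');
    }
    Cow::Owned(out)
}
```

In this sketch a name such as `foo_bar` comes back as `Cow::Borrowed`, while `foo-bar` takes the slow path and comes back as an owned `foo_bar`. Had the `String::with_capacity` call sat above the validity check, the first case would have paid for an allocation it never used.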
ed4d4c1 to
e0beeb8
graphite-app Bot
pushed a commit
that referenced
this pull request
Jun 23, 2026
… names (#9928)

## Summary

`default_sanitize_file_name` (called once per output chunk/asset filename) unconditionally allocated `String::with_capacity(str.len())` and rebuilt the name char-by-char, even though the common case is a filename that contains **no** invalid characters (`index.js`, `react.production.min.js`, any path without shell/NTFS-unsafe chars) and needs no rewriting at all. This is the same wasted-allocation pattern recently fixed in `legitimize_identifier_name` (#9926).

The function now returns `Cow<str>`:

- **Clean path** (common): scan for the first invalid char; if none, return `Cow::Borrowed(str)`: zero allocation, zero copy.
- **Dirty path**: allocate only when a replacement is actually needed, and bulk-copy the valid prefix (drive letter included) with `push_str` instead of one `char` at a time.

The scan is done over **bytes, not chars**. Every invalid character is ASCII (≤ 0x7F), and UTF-8 guarantees that no byte of a multi-byte character is < 0x80, so a byte scan finds exactly the same positions as a char scan without per-char UTF-8 decoding, and every match lands on a char boundary. The dirty-path rewrite uses `u8::try_from(char)` rather than `char as u8` so a non-ASCII char (e.g. `😀`, whose low byte is `0x00`) is never truncated into a false match.

Both call sites in `rolldown_common` already do `.into()` into `ArcStr` (which implements `From<Cow<str>>`), so they are unchanged.

## Measured impact

Measured locally with a Criterion microbench (not committed), original `String` version vs this change:

| input | before | after | change |
|---|---|---|---|
| `clean_short` (`index.js`) | 21.4 ns | 6.7 ns | **−69%** |
| `clean_long` (78-char ASCII path) | 122 ns | 46 ns | **−63%** |
| `clean_unicode` (30-char Cyrillic path) | 85 ns | 32 ns | **−63%** |
| `dirty` (needs rewriting) | 60 ns | 49 ns | **−18%** |

All changes statistically significant (p < 0.05). The byte scan is what unlocks the gains on longer and non-ASCII paths (no per-char decoding).

## Correctness

Output is byte-for-byte identical to the previous implementation, including Windows drive-letter semantics (`C:/foo.js` preserved, later `:` still replaced: `C:/a:b.js` → `C:/a_b.js`). All `rolldown_utils` tests pass.

- `test_sanitize_file_name`: the borrowed/owned split, empty string, and the Windows-drive paths.
- `test_sanitize_unicode`: clean multi-byte names (2-, 3-, and 4-byte sequences: `café.js`, `日本語.js`, `компоненты/Кнопка.js`, `emoji_😀.js`) returned borrowed, and multi-byte chars surviving the rewrite path verbatim (`a?é` → `a_é`, `a?😀` → `a_😀`, `日本?語` → `日本_語`, `café:dir.js` → `café_dir.js`).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
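The clean-path/dirty-path split and the byte scan described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from #9928: the function name `sanitize`, the helpers `is_invalid_byte` and `drive_prefix_len`, the exact set of invalid bytes, and the drive-letter rule are all hypothetical.

```rust
use std::borrow::Cow;

// Hypothetical set of invalid bytes; the real list may differ.
// Every member is ASCII, which is what makes a byte scan safe.
fn is_invalid_byte(b: u8) -> bool {
    b < 0x20 || matches!(b, b'?' | b'*' | b':' | b'"' | b'<' | b'>' | b'|' | 0x7F)
}

// Assumed drive-letter rule: a leading `X:` is preserved as-is.
fn drive_prefix_len(name: &str) -> usize {
    let bytes = name.as_bytes();
    if bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
        2
    } else {
        0
    }
}

/// Illustrative stand-in for the sanitizer described above.
fn sanitize(name: &str) -> Cow<'_, str> {
    let start = drive_prefix_len(name);

    // Clean path: scan raw bytes. Bytes of multi-byte UTF-8 sequences are
    // all >= 0x80, so they can never be mistaken for an ASCII invalid byte.
    let first_bad = match name.as_bytes()[start..]
        .iter()
        .position(|&b| is_invalid_byte(b))
    {
        Some(offset) => start + offset,
        None => return Cow::Borrowed(name),
    };

    // Dirty path: allocate now, and bulk-copy the valid prefix.
    // `first_bad` points at an ASCII byte, so it is a char boundary.
    let mut out = String::with_capacity(name.len());
    out.push_str(&name[..first_bad]);
    for c in name[first_bad..].chars() {
        // `u8::try_from` fails for non-ASCII chars instead of truncating,
        // so a char like '😀' (low byte 0x00) is never treated as invalid.
        let invalid = matches!(u8::try_from(c), Ok(b) if is_invalid_byte(b));
        out.push(if invalid { '_' } else { c });
    }
    Cow::Owned(out)
}
```

With these assumptions, `sanitize("index.js")` and `sanitize("C:/foo.js")` return borrowed values, while `sanitize("C:/a:b.js")` yields an owned `C:/a_b.js`. Because this hypothetical set treats control bytes (including `0x00`) as invalid, a plain `c as u8` cast would wrongly replace `😀`; the `try_from` guard avoids that.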
Merged
shulaoda
added a commit
that referenced
this pull request
Jun 24, 2026
## [1.1.3] - 2026-06-24

### 🐛 Bug Fixes

- `defer_drop` crashes the browser main thread (#9942) by @shulaoda
- camel-case: correct camel case for nested values (#9933) by @kb019
- cli: display --help options in camelCase (#9941) by @IWANABETHATGUY
- preserve used re-exports under preserveModules (#9122) (#9934) by @IWANABETHATGUY
- watch: make close reentrant in event callbacks (#9904) by @hyf0
- git for windows treats symlink files as regular files (#9915) by @AliceLanniste
- dev: cancel pending full reload on build error (#9903) by @h-a-n-a
- chunking: pass plugin meta to codeSplitting groups name function (#9267) by @Kyujenius
- dev: serve assets emitted during HMR/lazy compile (vite#22596) (#9815) by @h-a-n-a
- release: dry-run step no longer publishes binding packages (#9866) by @Boshen

### 🚜 Refactor

- rolldown_common: model ModuleId as a classified Path/Virtual/Bare enum (#9927) by @Boshen
- remove unused LegacyModuleIdx (#9872) by @shulaoda
- remove unused StmtInfos::get_namespace_stmt_info (#9870) by @shulaoda
- remove unused Module::as_external_mut (#9871) by @shulaoda
- remove unused EcmaAst::is_body_empty (#9869) by @shulaoda
- drop dead is_css_module handling in resolve_dependencies (#9867) by @shulaoda
- drop redundant with_commonjs on cjs source type (#9868) by @shulaoda

### 📚 Documentation

- clarify on drafting PRs (#9952) by @h-a-n-a
- update contribution guidelines (#9944) by @fubhy
- note Rust crates don't follow semver in AGENTS.md (#9905) by @IWANABETHATGUY
- add feedback form (#9159) by @TheAlexLichter

### ⚡ Performance

- utils: avoid allocation in default_sanitize_file_name for clean names (#9928) by @Boshen
- binding: box once-per-build futures before spawn_future (#9864) by @Boshen
- utils: avoid wasted allocation in legitimize_identifier_name (#9926) by @Boshen
- rolldown: fuse the canonical-name dedup and insert in the renamer (#9900) by @Boshen
- rolldown: probe the name map once in ConflictResolver::resolve (#9899) by @Boshen
- cut two heap allocations from wrapped ESM init finalize (#9901) by @Boshen
- rolldown_plugin_vite_reporter: hoist invariant out_dir prefix out of reporter loop (#9873) by @shulaoda
- drop throwaway Vec in wrapped esm init stmt (#9878) by @shulaoda
- borrow owner_filename in build-import-analysis AddDeps (#9874) by @shulaoda

### 🧪 Testing

- cover preserveModules named export via namespace re-export (#6010) (#9937) by @IWANABETHATGUY

### ⚙️ Miscellaneous Tasks

- deps: update napi to v3.9.4 (#9954) by @shulaoda
- reduce noise from CODEOWNERS for trival changes (#9953) by @h-a-n-a
- deps: update mimalloc-safe to 0.1.64 (#9950) by @shulaoda
- deps: update rollup submodule for tests to v4.62.2 (#9931) by @rolldown-guard[bot]
- deps: test mimalloc-safe upstream-mimalloc switch in CI (#9930) by @shulaoda
- rolldown_plugin_vite_build_import_analysis: remove unused v2 code path (#9917) by @shulaoda
- rolldown_plugin_vite_manifest: remove unused is_enable_v2 code path (#9916) by @shulaoda
- rolldown_plugin_vite_asset_import_meta_url: remove unexposed native vite plugin (#9896) by @shulaoda
- rolldown_plugin_vite_asset: remove unexposed native vite plugin (#9895) by @shulaoda
- rolldown_plugin_vite_css_post: remove unexposed native vite plugin (#9894) by @shulaoda
- rolldown_plugin_vite_css: remove unexposed native vite plugin (#9893) by @shulaoda
- rolldown_plugin_vite_html_inline_proxy: remove unexposed native vite plugin (#9892) by @shulaoda
- rolldown_plugin_vite_html: remove unexposed native vite plugin (#9891) by @shulaoda
- deps: update github actions (#9909) by @renovate[bot]
- deps: update rust crate oxc_sourcemap to v8.0.2 (#9910) by @renovate[bot]
- deps: update npm packages (#9912) by @renovate[bot]
- deps: update github actions to v7 (#9913) by @renovate[bot]
- deps: update rolldown-plugin-dts to ^0.26.0 (#9897) by @renovate[bot]
- remove rolldown_filter_analyzer crate (#9865) by @Boshen

### ❤️ New Contributors

* @fubhy made their first contribution in [#9944](#9944)
Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>