fix: correct UTF-16 index handling in native MagicString #8693

Merged
graphite-app[bot] merged 1 commit into main from 03-14-fix_8685
Mar 15, 2026
@IWANABETHATGUY (Member) commented Mar 14, 2026

Summary

  • Fix CharToByteMapper to index by UTF-16 code unit position (matching JS string indices) instead of Rust char position, and accumulate UTF-8 byte offsets instead of UTF-16 lengths
  • When slice indices fall mid-surrogate-pair, emit lone surrogates via napi_create_string_utf16 to match original magic-string behavior exactly
  • Add comprehensive unicode test suite covering emoji, CJK, mixed scripts, negative indices, and surrogate pair boundary slicing
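
The first bullet can be illustrated with a minimal sketch of a lookup table from UTF-16 code unit index to UTF-8 byte offset. This is not the PR's actual `CharToByteMapper`: the type name `Utf16ToByteMapper`, its methods, and the choice to map the second half of a surrogate pair to the start byte of its character are all assumptions made for illustration.

```rust
/// Hypothetical mapper (not the PR's code): `offsets[i]` holds the UTF-8 byte
/// offset of the character that owns UTF-16 code unit `i`.
struct Utf16ToByteMapper {
    offsets: Vec<usize>,
}

impl Utf16ToByteMapper {
    fn new(source: &str) -> Self {
        let mut offsets = Vec::with_capacity(source.len() + 1);
        for (byte_offset, ch) in source.char_indices() {
            // One entry per UTF-16 code unit: 1 for BMP characters,
            // 2 for characters encoded as a surrogate pair.
            // Assumption: both halves of a pair map to the character's start byte.
            for _ in 0..ch.len_utf16() {
                offsets.push(byte_offset);
            }
        }
        // Sentinel entry so that an index equal to the UTF-16 length
        // maps to the end of the string.
        offsets.push(source.len());
        Self { offsets }
    }

    /// Returns `None` for indices past the UTF-16 length instead of panicking.
    fn byte_offset(&self, utf16_index: usize) -> Option<usize> {
        self.offsets.get(utf16_index).copied()
    }
}
```

For `"a\u{1F937}b"` the table is `[0, 1, 1, 5, 6]`: the emoji occupies two UTF-16 code units but four UTF-8 bytes, so the entry following it jumps from byte 1 to byte 5.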

Fixes #8685

Test plan

  • New magic-string-unicode.test.ts covers emoji slice/overwrite/remove, CJK, mixed scripts, negative indices, and lone surrogate emission
  • All 157 existing magic-string tests continue to pass
  • Exact repro from issue (slice on "some 🤷‍♂️ string") no longer panics
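
As a rough illustration of why the repro string is sensitive to the choice of index space, the sketch below counts it three ways. The helper name `index_spaces` is hypothetical, and the emoji is written with explicit escapes on the assumption that it is the sequence U+1F937 U+200D U+2642 U+FE0F.

```rust
/// Hypothetical helper: returns
/// (Rust `char` count, UTF-16 code unit count, UTF-8 byte length).
fn index_spaces(s: &str) -> (usize, usize, usize) {
    (s.chars().count(), s.encode_utf16().count(), s.len())
}
```

For `"some \u{1F937}\u{200D}\u{2642}\u{FE0F} string"` this yields `(16, 17, 25)`. JavaScript reports a `length` of 17 for the same string, so an index that is valid on the JS side can point somewhere else, or nowhere, if the native side interprets it as a `char` position or a byte offset.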

🤖 Generated with Claude Code

netlify bot commented Mar 14, 2026

Deploy Preview for rolldown-rs canceled.

Name Link
🔨 Latest commit f8be84a
🔍 Latest deploy log https://app.netlify.com/projects/rolldown-rs/deploys/69b6345033af070008d4f8f3

@IWANABETHATGUY changed the title from "fix: 8685" to "fix: correct UTF-16 index handling in native MagicString" Mar 14, 2026
@IWANABETHATGUY force-pushed the 03-14-fix_8685 branch 4 times, most recently from 6ed7f47 to e03a0d1 March 14, 2026 16:44
@IWANABETHATGUY marked this pull request as ready for review March 14, 2026 16:48
@IWANABETHATGUY (Member, Author) commented Mar 14, 2026

Merge activity

  • Mar 14, 4:49 PM UTC: The merge label 'graphite: merge-when-ready' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Mar 15, 4:23 AM UTC: IWANABETHATGUY added this pull request to the Graphite merge queue.
  • Mar 15, 4:28 AM UTC: Merged by the Graphite merge queue.

Copilot AI (Contributor) left a comment

Pull request overview

Fixes native MagicString index handling so JS-facing indices behave correctly with Unicode (UTF-16 code unit indexing), including exact surrogate-boundary slicing behavior to match magic-string.

Changes:

  • Reworked the native CharToByteMapper to map UTF-16 code unit indices (JS string indices) to UTF-8 byte offsets.
  • Updated slice to correctly handle indices that land inside surrogate pairs by emitting lone surrogates via UTF-16 N-API string creation.
  • Added a new unicode-focused test suite covering emoji, CJK, mixed scripts, negative indices, and surrogate-boundary slicing.
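
The surrogate-boundary case in the second bullet can be sketched as follows. This is an illustrative outline, not the binding's implementation: `Utf16Slice` and `slice_utf16` are hypothetical names, and the sketch re-encodes the whole source to UTF-16 for simplicity, which real code would likely avoid.

```rust
/// Outcome of a UTF-16-indexed slice in this sketch (hypothetical type).
#[derive(Debug, PartialEq)]
enum Utf16Slice {
    /// Both ends fall on character boundaries, so a normal UTF-8 string suffices.
    Valid(String),
    /// At least one end splits a surrogate pair. The raw code units would have
    /// to be handed to a UTF-16 string constructor such as
    /// `napi_create_string_utf16`, since Rust's `String` cannot hold lone surrogates.
    Raw(Vec<u16>),
}

/// Slices `source` by UTF-16 code unit indices, as JS string indices do.
/// Returns `None` when the range is out of bounds or reversed.
fn slice_utf16(source: &str, start: usize, end: usize) -> Option<Utf16Slice> {
    let units: Vec<u16> = source.encode_utf16().collect();
    let window = units.get(start..end)?;
    Some(match String::from_utf16(window) {
        Ok(s) => Utf16Slice::Valid(s),
        // `from_utf16` fails exactly when the window contains an unpaired surrogate.
        Err(_) => Utf16Slice::Raw(window.to_vec()),
    })
}
```

With `"a\u{1F937}b"`, `slice_utf16(s, 1, 3)` returns `Valid` containing the emoji, while `slice_utf16(s, 0, 2)` returns `Raw(vec![0x61, 0xD83E])`: the letter `a` followed by a lone high surrogate, which is what slicing the same range of a JS string produces.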

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/rolldown/tests/magic-string/magic-string-unicode.test.ts | Adds coverage for unicode + surrogate-pair boundary behavior across MagicString APIs. |
| packages/rolldown/src/binding.d.cts | Extends slice docs to document surrogate-boundary behavior and UTF-16 string creation details. |
| crates/rolldown_binding/src/types/binding_magic_string.rs | Implements UTF-16-index → UTF-8-byte mapping and surrogate-aware slice returning UTF-16 JS strings when needed. |

@github-actions (Contributor)

Benchmarks Rust

group                                                        pr                                     target
-----                                                        --                                     ------
bundle/bundle@multi-duplicated-top-level-symbol              1.05     67.5±1.69ms        ? ?/sec    1.00     64.4±1.75ms        ? ?/sec
bundle/bundle@multi-duplicated-top-level-symbol-sourcemap    1.04     75.8±1.96ms        ? ?/sec    1.00     72.9±2.39ms        ? ?/sec
bundle/bundle@rome_ts                                        1.05    146.0±5.93ms        ? ?/sec    1.00    138.5±3.30ms        ? ?/sec
bundle/bundle@rome_ts-sourcemap                              1.05    164.6±5.33ms        ? ?/sec    1.00    157.3±3.45ms        ? ?/sec
bundle/bundle@threejs                                        1.01     62.4±1.73ms        ? ?/sec    1.00     61.8±1.70ms        ? ?/sec
bundle/bundle@threejs-sourcemap                              1.02     72.5±2.49ms        ? ?/sec    1.00     71.1±2.00ms        ? ?/sec
bundle/bundle@threejs10x                                     1.00    707.4±6.53ms        ? ?/sec    1.00    704.7±6.67ms        ? ?/sec
bundle/bundle@threejs10x-sourcemap                           1.00    809.7±5.89ms        ? ?/sec    1.02    824.3±9.02ms        ? ?/sec

codspeed-hq bot commented Mar 14, 2026

Merging this PR will not alter performance

✅ 6 untouched benchmarks
⏩ 8 skipped benchmarks [1]

Comparing 03-14-fix_8685 (e03a0d1) with main (e088676) [2]

Footnotes

  1. 8 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on main (68f6274) during the generation of this report, so e088676 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Copilot AI review requested due to automatic review settings March 14, 2026 17:35
@IWANABETHATGUY IWANABETHATGUY requested review from Copilot and removed request for Copilot March 14, 2026 17:35
Copilot AI (Contributor) left a comment

Pull request overview

Fixes native MagicString index handling to align with JavaScript UTF-16 code unit indices (including surrogate-pair boundary behavior), preventing panics and matching magic-string semantics.

Changes:

  • Replace the char-based index→byte mapper with a UTF-16 code-unit index→UTF-8 byte offset mapper.
  • Update slice to correctly handle surrogate-pair boundary indices by emitting lone surrogates via UTF-16 N-API string creation.
  • Add a dedicated unicode regression test suite covering emoji/CJK/mixed scripts/negative indices/surrogate boundary slicing.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| packages/rolldown/tests/magic-string/magic-string-unicode.test.ts | Adds unicode-focused regression tests for UTF-16 indexing and surrogate boundary slicing. |
| packages/rolldown/src/binding.d.cts | Updates slice documentation to explicitly describe UTF-16 index semantics and lone-surrogate behavior. |
| crates/rolldown_binding/src/types/binding_magic_string.rs | Implements UTF-16 index→byte mapping and updates binding methods (notably slice) to avoid panics and match JS behavior. |

Copilot AI review requested due to automatic review settings March 15, 2026 04:23
@IWANABETHATGUY IWANABETHATGUY review requested due to automatic review settings March 15, 2026 04:23
@graphite-app graphite-app bot merged commit f8be84a into main Mar 15, 2026
32 checks passed
@github-actions github-actions bot mentioned this pull request Mar 18, 2026
shulaoda added a commit that referenced this pull request Mar 18, 2026
## [1.0.0-rc.10] - 2026-03-18

### 🚀 Features

- add indentExclusionRanges property to MagicString (#8746) by @IWANABETHATGUY
- expose `oxcRuntimePlugin` (#8654) by @sapphi-red
- rust: make bundler generic over FileSystem for in-memory benchmarks (#8652) by @Boshen

### 🐛 Bug Fixes

- rolldown_plugin_vite_dynamic_import_vars: align dynamic import fast check with Vite (#8760) by @shulaoda
- renamer: handle existing bindings in nested scopes when finding unique names (#8741) by @drewolson
- pass `yarn_pnp` option where needed (#8736) by @sapphi-red
- preserve optional chaining in namespace member expr rewrite (#8712) by @Copilot
- correct UTF-16 index handling in native MagicString (#8693) by @IWANABETHATGUY
- mark failing doctests as ignore (#8700) by @Boshen
- prevent may_partial_namespace from leaking through include_module (#8682) by @IWANABETHATGUY
- ci: bump native-build cache key to invalidate stale napi-rs artifacts (#8678) by @Boshen
- `comments.annotation: false` breaking tree-shaking (#8657) by @IWANABETHATGUY
- validate filenames for NUL bytes from chunkFileNames/entryFileNames (#8644) by @IWANABETHATGUY
- dce-only minify should not set NODE_ENV to production (#8651) by @IWANABETHATGUY

### 🚜 Refactor

- rust: remove dead `CrossModuleOptimizationConfig::side_effects_free_function_optimization` (#8673) by @Dunqing
- rust: simplify `cross_module_optimization` by removing redundant scope tracking (#8672) by @Dunqing
- simplify string repeat in guess_indentor (#8753) by @IWANABETHATGUY
- consolidate custom magic-string tests into one file (#8696) by @IWANABETHATGUY
- extract CJS bailout checks from include_symbol (#8683) by @IWANABETHATGUY
- rust: remove `BindingIdentifierExt` to use `BindingIdentifier::symbol_id()` instead (#8667) by @Dunqing
- bench: add bench_preset helper and inline presets (#8658) by @Boshen
- rust: filter external modules from entries instead of mapping bit positions (#8637) by @Dunqing

### 📚 Documentation

- clarify watch mode behavior and its limitations (#8751) by @sapphi-red
- add external link icon to GitHub button in Hero section (#8731) by @thisisnkc
- guide: clarify that `inject` option is only conceptually similar to esbuild's one (#8743) by @sapphi-red
- meta/design: add `devtools.md` (#8663) by @hyf0
- add viteplus alpha announcement banner (#8668) by @shulaoda

### ⚡ Performance

- rolldown: some minor perf optimization found by autoresearch (#8730) by @Brooooooklyn
- replace Vec allocation with lazy iterator in find_hash_placeholders (#8703) by @Boshen
- replace TypedDashMap with TypedMap in CustomField (#8708) by @Boshen
- bench: remove scan benchmark binary to halve LTO link time (#8694) by @Boshen

### 🧪 Testing

- watch: increase timeout for error output (#8766) by @sapphi-red
- vite-tests: remove JS plugin tests (#8767) by @sapphi-red
- watch: add CLI exit code test (#8752) by @sapphi-red
- normalize paths on Windows even if `resolve.symlinks` is false (#8483) by @sapphi-red

### ⚙️ Miscellaneous Tasks

- correct comment in bundle-analyzer-plugin.ts (#8770) by @origami-z
- upgrade oxc to 0.120.0 (#8764) by @Boshen
- enable all test for `reset` category in MagicString.test.ts (#8749) by @IWANABETHATGUY
- deps: update test262 submodule for tests (#8742) by @sapphi-red
- deps: update oxc apps (#8734) by @renovate[bot]
- deps: update softprops/action-gh-release action to v2.6.1 (#8724) by @renovate[bot]
- deps: update npm packages (major) (#8722) by @renovate[bot]
- deps: update github-actions (major) (#8721) by @renovate[bot]
- deps: update softprops/action-gh-release action to v2.6.0 (#8720) by @renovate[bot]
- deps: update npm packages (#8718) by @renovate[bot]
- deps: update rust crates (#8717) by @renovate[bot]
- deps: update github-actions (#8716) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.17.0 (#8713) by @renovate[bot]
- deps: bump cargo-shear to v1.11.2 (#8711) by @Boshen
- use org level `CODE_OF_CONDUCT.md` (#8706) by @sapphi-red
- fix cache key mismatch and remove redundant cache saves (#8695) by @Boshen
- deps: update oxc apps (#8692) by @renovate[bot]
- deps: update oxc apps (#8649) by @renovate[bot]
- should do matrix out side of reusable workflows 2 (#8691) by @hyf0
- should do matrix out side of reusable workflows (#8690) by @hyf0
- deps: update dependency rolldown-plugin-dts to v0.22.5 (#8689) by @renovate[bot]
- upgrade oxc to 0.119.0 and oxc_resolver to 11.19.1 (#8686) by @Boshen
- correct if condition of `type-check` job (#8677) by @hyf0
- Gate CI type-check job on node changes (#8669) by @Copilot
- benchmark: improve codspeed build (#8665) by @Boshen
- deps: update oxc to v0.118.0 (#8650) by @renovate[bot]
- deps: update crate-ci/typos action to v1.44.0 (#8647) by @renovate[bot]
- deps: update oxc resolver to v11.19.1 (#8646) by @renovate[bot]
- deps: update dependency rust to v1.94.0 (#8648) by @renovate[bot]
- deps: update dependency rolldown-plugin-dts to v0.22.4 (#8645) by @renovate[bot]

### ◀️ Revert

- Revert "ci: Gate CI type-check job on node changes" (#8674) by @hyf0
- "chore(deps): update dependency rust to v1.94.0 (#8648)" (#8660) by @shulaoda

### ❤️ New Contributors

* @origami-z made their first contribution in [#8770](#8770)
* @drewolson made their first contribution in [#8741](#8741)
* @thisisnkc made their first contribution in [#8731](#8731)

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

[Bug]: Native Magic String index issues with UTF-16 characters

3 participants