Skip to content

fix: replace source for multi byte chars#198

Merged
SyMind merged 6 commits intomainfrom
fix-multi-byte-chars
Nov 5, 2025
Merged

fix: replace source for multi byte chars#198
SyMind merged 6 commits intomainfrom
fix-multi-byte-chars

Conversation

@SyMind
Copy link
Copy Markdown
Collaborator

@SyMind SyMind commented Nov 4, 2025

This PR adds UTF-16 encoding support for source maps to correctly handle multi-byte Unicode characters. Source maps use UTF-16 code units for column positions, so this change ensures proper handling of characters like emojis and CJK characters that occupy multiple UTF-16 code units.

  • Added utf16_len() methods to Rope and SourceText trait for calculating UTF-16 length
  • Modified WithIndices to build index mappings based on UTF-16 code units instead of UTF-8 characters
  • Updated column offset calculations throughout to use UTF-16 lengths instead of byte lengths
  • Added comprehensive test coverage for UTF-16 handling in multiple modules

Copilot AI review requested due to automatic review settings November 4, 2025 02:45
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds UTF-16 encoding support for source maps to correctly handle multi-byte Unicode characters. Source maps use UTF-16 code units for column positions, so this change ensures proper handling of characters like emojis and CJK characters that occupy multiple UTF-16 code units.

  • Added utf16_len() methods to Rope and SourceText trait for calculating UTF-16 length
  • Modified WithIndices to build index mappings based on UTF-16 code units instead of UTF-8 characters
  • Updated column offset calculations throughout to use UTF-16 lengths instead of byte lengths
  • Added comprehensive test coverage for UTF-16 handling in multiple modules

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
src/with_indices.rs Modified character indexing logic to handle UTF-16 code units and updated test to use emoji characters
src/rope.rs Added utf16_len() method to calculate rope length in UTF-16 code units
src/replace_source.rs Updated column offset calculations to use UTF-16 lengths and added UTF-16 handling test
src/helpers.rs Added utf16_len() to SourceText trait, updated final column calculations, and added comprehensive UTF-16 tests

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Nov 4, 2025

CodSpeed Performance Report

Merging #198 will degrade performances by 11.21%

Comparing fix-multi-byte-chars (3939bc2) with main (4310b76)

Summary

❌ 1 regression
✅ 9 untouched
⏩ 6 skipped1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
complex_replace_source_map 43 ms 48.4 ms -11.21%

Footnotes

  1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@SyMind SyMind force-pushed the fix-multi-byte-chars branch 6 times, most recently from 9b65c00 to adace86 Compare November 4, 2025 13:23
@SyMind SyMind marked this pull request as draft November 4, 2025 13:25
@SyMind SyMind force-pushed the fix-multi-byte-chars branch 9 times, most recently from 2b20f05 to de1a5d0 Compare November 5, 2025 03:52
@SyMind SyMind force-pushed the fix-multi-byte-chars branch from de1a5d0 to 3939bc2 Compare November 5, 2025 03:58
@SyMind SyMind marked this pull request as ready for review November 5, 2025 04:02
@SyMind SyMind merged commit b1eb535 into main Nov 5, 2025
9 of 10 checks passed
@SyMind SyMind deleted the fix-multi-byte-chars branch November 5, 2025 04:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants