editor: Fix multi-line cursor expansion when multi-byte characters are involved #51780

Merged
SomeoneToIgnore merged 6 commits into zed-industries:main from feitreim:bugfix-multibyte-multicursor
Mar 19, 2026
@feitreim
Contributor

Closes #51740

The multi-line cursor expansion operates on byte offsets instead of character offsets, so multi-byte characters like the umlaut cause the added cursors to land in the wrong column. To fix this, the expansion logic now relies on UTF-16 code units instead of bytes.
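To illustrate the mismatch described above, here is a minimal standalone sketch using only the Rust standard library. It is not the code from this PR: the helper names `byte_col_to_utf16_col` and `utf16_col_to_byte_col` are hypothetical, and Zed's real implementation works on its own buffer types rather than on `&str`.

```rust
/// Convert a byte column within `line` to a column counted in UTF-16 code units.
/// Returns `None` if `byte_col` is out of range or falls inside a multi-byte character.
/// (Hypothetical helper for illustration only.)
fn byte_col_to_utf16_col(line: &str, byte_col: usize) -> Option<usize> {
    let prefix = line.get(..byte_col)?;
    Some(prefix.encode_utf16().count())
}

/// Convert a UTF-16 column back to a byte column in `line`,
/// clamping to the end of the line when the line is too short.
/// (Hypothetical helper for illustration only.)
fn utf16_col_to_byte_col(line: &str, utf16_col: usize) -> usize {
    let mut units = 0;
    for (byte_idx, ch) in line.char_indices() {
        if units >= utf16_col {
            return byte_idx;
        }
        units += ch.len_utf16();
    }
    line.len()
}

fn main() {
    let first = "färben x"; // 'ä' takes 2 bytes in UTF-8 but 1 UTF-16 code unit
    let second = "farben x";

    // The cursor sits just before 'x' on the first line.
    let byte_col = first.find('x').unwrap();

    // Reusing the raw byte column on the next line overshoots past 'x'.
    assert_eq!(second.get(byte_col..), Some(""));

    // Going through a UTF-16 column keeps the cursor visually aligned.
    let utf16_col = byte_col_to_utf16_col(first, byte_col).unwrap();
    let aligned = utf16_col_to_byte_col(second, utf16_col);
    assert_eq!(&second[aligned..], "x");
}
```

The point of the sketch is only that a column has to be carried between lines in a unit that does not depend on how many bytes each character occupies; the byte offset is recomputed per line.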

before behavior:

broken.mov

after behavior:

fixed.mov
  • Added a test to verify functionality.

Before you mark this PR as ready for review, make sure that you have:

  • Added a solid test coverage and/or screenshots from doing manual testing
  • Done a self-review taking into account security and performance aspects
  • Aligned any UI changes with the UI checklist

Release Notes:

  • editor: fixed multi-line cursor expansion dealing with multi-byte characters.

@cla-bot cla-bot bot added the cla-signed The user has signed the Contributor License Agreement label Mar 17, 2026
@zed-community-bot zed-community-bot bot added the guild Pull requests by someone in Zed Guild. NOTE: the label application is automated via github actions label Mar 17, 2026
@maxdeviant maxdeviant changed the title editor: Fix multi-line cursor expansion when mult-byte characters are involved. editor: Fix multi-line cursor expansion when multi-byte characters are involved Mar 17, 2026
@feitreim
Contributor Author

Brought in the commits from my other PR. TL;DR: Helix's multi-line selection system faced the same problem as the main editor version, so I fixed the Helix version and refactored it a bit to make things nicer.

Behavior Before:

broken_helix.mov

Behavior After:

fixed_helix.mov
  • Added tests.

Contributor

@SomeoneToIgnore SomeoneToIgnore left a comment

Thank you!

@SomeoneToIgnore SomeoneToIgnore self-assigned this Mar 19, 2026
@SomeoneToIgnore SomeoneToIgnore enabled auto-merge (squash) March 19, 2026 15:39
@SomeoneToIgnore
Contributor

Oh, seems like we need a formatter pass to merge it, would you be able to do so?

@feitreim
Contributor Author

yes will do rn.

@github-actions github-actions bot added size/M and removed size/M labels Mar 19, 2026
@nathansobo
Contributor

Thanks for this fix! The multi-byte alignment issue was a real bug and the UTF-16 approach is the right call.

I pushed one small refinement on top: a MultiBufferSnapshot::line_len_utf16 method that computes the UTF-16 line length in a single tree traversal rather than the two-step line_len → point_to_point_utf16 pattern, and updated the call sites to use it.
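As a rough illustration of what a UTF-16 line length is, here is a toy version over a plain `&str`. This is a sketch under stated assumptions, not the `MultiBufferSnapshot::line_len_utf16` method mentioned above: the function name `toy_line_len_utf16`, its signature, and the flat-string representation are all hypothetical, and a flat string has no tree to traverse.

```rust
/// Length of line `row` (0-based) in UTF-16 code units, excluding the newline.
/// Returns `None` if the text has no such row.
/// (Toy stand-in over a flat string; hypothetical, not Zed's API.)
fn toy_line_len_utf16(text: &str, row: usize) -> Option<u32> {
    text.split('\n')
        .nth(row)
        .map(|line| line.chars().map(|c| c.len_utf16() as u32).sum())
}

fn main() {
    let text = "über\n😀!\n";

    // "über" is 5 bytes in UTF-8 but 4 UTF-16 code units.
    assert_eq!("über".len(), 5);
    assert_eq!(toy_line_len_utf16(text, 0), Some(4));

    // '😀' is outside the BMP, so it counts as 2 UTF-16 code units.
    assert_eq!(toy_line_len_utf16(text, 1), Some(3));

    // The trailing newline leaves an empty final row; rows past it do not exist.
    assert_eq!(toy_line_len_utf16(text, 2), Some(0));
    assert_eq!(toy_line_len_utf16(text, 3), None);
}
```

The toy computes the answer directly in the target unit instead of first taking a byte length and then converting it, which mirrors the shape of the refinement described in the comment; the actual performance benefit there comes from avoiding a second tree traversal.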

@SomeoneToIgnore SomeoneToIgnore merged commit d94aa26 into zed-industries:main Mar 19, 2026
30 checks passed
@feitreim feitreim deleted the bugfix-multibyte-multicursor branch March 19, 2026 16:21
AmaanBilwar pushed a commit to AmaanBilwar/zed that referenced this pull request Mar 20, 2026
…e involved (zed-industries#51780)

Co-authored-by: Kirill Bulatov <mail4score@gmail.com>
toshmukhamedov pushed a commit to toshmukhamedov/zed that referenced this pull request Mar 20, 2026
…e involved (zed-industries#51780)

AmaanBilwar pushed a commit to AmaanBilwar/zed that referenced this pull request Mar 23, 2026
…e involved (zed-industries#51780)

Development

Successfully merging this pull request may close these issues.

Multi-line cursor expansion does not like umlauts

4 participants