vim/helix: Use grapheme count on replace#51776

Merged
dinocosta merged 4 commits into zed-industries:main from feitreim:bugfix-multibyte-replace-chars
Mar 25, 2026

Contributor

@feitreim feitreim commented Mar 17, 2026

Closes #51772

Multibyte characters are not handled properly by vim/helix replace: the replacement count is based on bytes instead of characters, so you end up with too many new characters.
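
To illustrate the mismatch, here is a minimal sketch in plain Rust. It is not Zed's actual code: `repeat_replacement`, `byte_based_replace`, and `char_based_replace` are hypothetical names used only for this example. It assumes the replace action builds its output by repeating the replacement character once per unit counted in the selection.

```rust
/// Builds the replacement text: `replacement` repeated `count` times.
/// (Hypothetical helper for illustration, not part of Zed.)
fn repeat_replacement(replacement: char, count: usize) -> String {
    std::iter::repeat(replacement).take(count).collect()
}

/// Counts UTF-8 bytes, which over-counts any non-ASCII character.
fn byte_based_replace(selected: &str, replacement: char) -> String {
    // `str::len` returns the length in bytes, not in characters.
    repeat_replacement(replacement, selected.len())
}

/// Counts Unicode scalar values, so each precomposed character counts once.
fn char_based_replace(selected: &str, replacement: char) -> String {
    repeat_replacement(replacement, selected.chars().count())
}
```

With the three precomposed characters `"\u{e4}\u{f6}\u{fc}"` (ä, ö, ü, two bytes each in UTF-8) selected, the byte-based version produces six replacement characters, while the character-based version produces three.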

Vim Behavior Before:

Before_replace.mov

Vim Behavior After:

after.replace.mov

Helix Before:

helix.before.mov

Helix After:

helix.after.mov

I also added tests for the functionality.

Before you mark this PR as ready for review, make sure that you have:

  • Added solid test coverage and/or screenshots from doing manual testing
  • Done a self-review taking into account security and performance aspects
  • Aligned any UI changes with the UI checklist

Release Notes:

  • Fixed vim/helix's replace action to take into consideration grapheme count

@cla-bot cla-bot bot added the cla-signed The user has signed the Contributor License Agreement label Mar 17, 2026
@zed-community-bot zed-community-bot bot added the guild Pull requests by someone in Zed Guild. NOTE: the label application is automated via github actions label Mar 17, 2026
@maxdeviant maxdeviant changed the title vim/helix: fix replacing multi-byte characters vim/helix: Fix replacing multi-byte characters Mar 17, 2026
Contributor

zed-industries-bot commented Mar 17, 2026

Warnings
⚠️
vim/helix: Use grapheme count on replace
^

Write PR titles using sentence case.

Generated by 🚫 dangerJS against 8ce5c9d

Update the current replace implementation for both vim and helix to take
into account the grapheme count instead of the Unicode scalar count.

Using the Unicode scalar count was already an improvement over what we
had before, but it doesn't cover all scenarios. Some emoji, for example,
are unlikely in source code but still possible, and Neovim already
handles them correctly.

This commit introduces a new
`multi_buffer::MultiBufferSnapshot::grapheme_count_for_range` method,
which returns the number of graphemes for a given range in the
multibuffer. That count lets us correctly determine how many times the
character being used for replace should repeat.

Existing tests have also been updated with test cases that would fail
without these new changes, to catch any potential regressions.
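
As a rough sketch of why scalar counts and grapheme counts differ, the following plain-Rust example approximates grapheme counting using only the standard library. It is an assumption-laden illustration, not the `grapheme_count_for_range` implementation: `is_extender`, `approx_grapheme_count`, and `replace_by_graphemes` are hypothetical names, and the clustering rule is a deliberately simplified stand-in for full Unicode segmentation (UAX #29). It only handles combining diacritics, variation selectors, and zero-width-joiner sequences.

```rust
/// Returns true for code points that, in this simplified model, attach to
/// the preceding character instead of starting a new grapheme.
/// NOTE: a tiny subset of the real UAX #29 rules, chosen for illustration.
fn is_extender(c: char) -> bool {
    matches!(
        c,
        '\u{0300}'..='\u{036F}'     // combining diacritical marks
            | '\u{FE00}'..='\u{FE0F}' // variation selectors
            | '\u{200D}'              // zero width joiner (ZWJ)
    )
}

/// Approximate grapheme count: a scalar starts a new grapheme unless it is
/// an extender or it directly follows a ZWJ.
fn approx_grapheme_count(text: &str) -> usize {
    let mut count = 0;
    let mut prev_was_zwj = false;
    for c in text.chars() {
        if !(is_extender(c) || prev_was_zwj) {
            count += 1;
        }
        prev_was_zwj = c == '\u{200D}';
    }
    count
}

/// Repeats `replacement` once per (approximate) grapheme in `selected`.
fn replace_by_graphemes(selected: &str, replacement: char) -> String {
    std::iter::repeat(replacement)
        .take(approx_grapheme_count(selected))
        .collect()
}
```

For the decomposed sequence `"e\u{301}"` (the letter e followed by a combining acute accent), the byte length is 3 and the scalar count is 2, but the sketch counts one grapheme, so a replace would insert a single character. Production code would normally rely on a complete segmenter, such as the `graphemes(true)` iterator from the `unicode-segmentation` crate, rather than a hand-rolled rule like this one.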
Member

@dinocosta dinocosta left a comment

Thanks @feitreim ! 🙂

Pushed a smaller commit that updates this to count graphemes instead, which covers even more cases, such as certain emoji and decomposed Unicode characters.

@dinocosta dinocosta changed the title vim/helix: Fix replacing multi-byte characters vim/helix: Use grapheme count on replace Mar 25, 2026
@dinocosta dinocosta enabled auto-merge (squash) March 25, 2026 23:10
@dinocosta dinocosta merged commit 3684b5a into zed-industries:main Mar 25, 2026
31 checks passed

Development

Successfully merging this pull request may close these issues.

Multi-byte characters end up as multiple characters when replaced with vim/helix

5 participants