
markdown: Fix escaping non-ASCII chars #55782

Merged
agu-z merged 2 commits into zed-industries:main from alkinun:fix/markdown-escaper-non-ascii
May 19, 2026

Conversation

@alkinun

@alkinun alkinun commented May 5, 2026

Contributor

Fixes #55704

The escape function in crates/markdown/src/markdown.rs was calling c as u8 on each char before passing it to MarkdownEscaper::next(). That cast truncates a non-ASCII Unicode code point to its low 8 bits, which can land in the ASCII punctuation range, so an extra backslash was added in front of those non-ASCII chars.

Release Notes:

  • Fixed a bug where non-ASCII chars in diagnostic messages were incorrectly rendered with spurious \ characters
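
The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the code from crates/markdown: the helper names truncated_byte and ascii_byte are hypothetical and exist only for illustration. It shows how c as u8 maps some non-ASCII chars onto ASCII punctuation bytes.

```rust
/// Hypothetical helper: reproduces the lossy cast by keeping only the
/// low 8 bits of the code point.
fn truncated_byte(c: char) -> u8 {
    c as u8
}

/// Hypothetical helper: a lossless alternative that yields a byte only
/// for chars that really are ASCII.
fn ascii_byte(c: char) -> Option<u8> {
    if c.is_ascii() {
        Some(c as u8)
    } else {
        None
    }
}

fn main() {
    // '中' is U+4E2D (low byte 0x2D, '-'), 'Ī' is U+012A (low byte 0x2A, '*'),
    // 'é' is U+00E9 (low byte 0xE9, not ASCII), '*' is plain ASCII.
    for c in ['中', 'Ī', 'é', '*'] {
        let low = truncated_byte(c);
        println!(
            "{c:?} U+{:04X} -> low byte 0x{low:02X}, looks like punctuation: {}, lossless: {:?}",
            c as u32,
            low.is_ascii_punctuation(),
            ascii_byte(c)
        );
    }
}
```

Any char whose low byte happens to fall on an ASCII punctuation value would be misclassified this way, which matches the spurious backslashes reported in the issue.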

@cla-bot cla-bot Bot added the cla-signed The user has signed the Contributor License Agreement label May 5, 2026
@zed-community-bot zed-community-bot Bot added the first contribution the author's first pull request to Zed. NOTE: the label application is automated via github actions label May 5, 2026
@SomeoneToIgnore SomeoneToIgnore added the area:preview/markdown Feedback for Zed's Markdown preview label May 5, 2026
@agu-z agu-z self-assigned this May 15, 2026

@agu-z agu-z left a comment

Contributor


Instead of dealing with it at the call site, why don't we just make MarkdownEscaper operate on char?

fn next(&mut self, c: char) -> EscapeAction

That makes the whole class of "low byte of a non-ASCII char looks like punctuation" bugs unrepresentable, instead of relying on callers to pre-validate.

If we do that, we should remove this comment:

// Valid to operate on raw bytes since multi-byte UTF-8
// sequences never contain ASCII-range bytes.
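
To illustrate the suggestion, here is a simplified sketch of an escaper whose next takes a char. Only the signature comes from the review comment; the EscapeAction variants, the stateless struct, the "escape every ASCII punctuation char" rule, and the escape wrapper are assumptions made for this example and do not reflect the actual implementation in Zed.

```rust
/// Assumed variants; the real EscapeAction may differ.
#[derive(Debug, PartialEq)]
enum EscapeAction {
    PassThrough,
    Escape,
}

/// Simplified stand-in: the real escaper tracks state between calls,
/// this sketch does not.
struct MarkdownEscaper;

impl MarkdownEscaper {
    /// Takes a full char, so no lossy cast can happen at the call site.
    fn next(&mut self, c: char) -> EscapeAction {
        // char::is_ascii_punctuation is false for every non-ASCII char.
        if c.is_ascii_punctuation() {
            EscapeAction::Escape
        } else {
            EscapeAction::PassThrough
        }
    }
}

/// Hypothetical caller: prefixes a backslash where the escaper asks for one.
fn escape(text: &str) -> String {
    let mut escaper = MarkdownEscaper;
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        if escaper.next(c) == EscapeAction::Escape {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

fn main() {
    // '中' and 'Ī' pass through untouched; only the real '*' is escaped.
    println!("{}", escape("中Ī*"));
}
```

Because the classification happens on the char itself, callers have nothing to pre-validate, which is the point of the proposed signature.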

@alkinun

alkinun commented May 18, 2026

Contributor Author

Sure, that seems like a better approach. I just committed the changes.

@agu-z agu-z left a comment

Contributor


Nice! Thank you for your contribution

@agu-z agu-z enabled auto-merge May 19, 2026 17:11
@agu-z agu-z disabled auto-merge May 19, 2026 17:11
@agu-z agu-z enabled auto-merge May 19, 2026 17:13
@agu-z agu-z changed the title from "Fix MarkdownEscaper escaping non ASCII chars due to lossy u8 cast" to "markdown: Fix escaping non-ASCII chars" May 19, 2026
@agu-z agu-z disabled auto-merge May 19, 2026 17:13
@agu-z agu-z enabled auto-merge May 19, 2026 17:13
@agu-z agu-z added this pull request to the merge queue May 19, 2026
Merged via the queue into zed-industries:main with commit c0596fa May 19, 2026
36 checks passed
TomPlanche pushed a commit to TomPlanche/zed that referenced this pull request May 20, 2026
TomPlanche pushed a commit to TomPlanche/zed that referenced this pull request Jun 2, 2026

Development

Successfully merging this pull request may close these issues.

markdown: MarkdownEscaper escapes non-ASCII characters due to lossy as u8 cast

4 participants