markdown: MarkdownEscaper escapes non-ASCII characters due to lossy as u8 cast #55704
Closed
Labels
area:internationalization, area:languages/markdown, frequency:uncommon, meta:awesome, priority:P2, state:reproducible
Reproduction steps
Diagnostic.message. U+043A–U+0440 (к л м н о п р) or U+0421–U+042F (С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я), and ', ", (, ), #, /, …).
Current behavior
Spurious \ characters appear before specific Cyrillic letters:
Copying the message yields the clean text; the backslashes are inserted by the renderer.
Expected behavior
Display the message exactly as the LSP server sent it, e.g.:
Root cause
crates/markdown/src/markdown.rs, pub fn escape (~line 457): MarkdownEscaper::next decides to prefix a backslash via byte.is_ascii_punctuation(). For non-ASCII codepoints, c as u8 truncates to the low 8 bits, which can land inside ASCII-punctuation ranges, e.g. р (U+0440) → 0x40 → @, о (U+043E) → 0x3E → >, Э (U+042D) → 0x2D → -. Per CommonMark §6.1, \X is consumed only when X is ASCII punctuation, so for non-ASCII characters the backslash is rendered literally.
The earlier fix in #51766 addressed visible escapes inside indented code blocks but did not touch this lossy-cast path. The bug affects any script outside ASCII whose codepoints land in those low-bit ranges: Cyrillic, Greek, Hebrew, Arabic, CJK, etc.
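The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the code in crates/markdown: needs_escape_lossy, needs_escape, and escape_with are hypothetical names introduced here for illustration, and the char-level check is only one possible fix, assuming the escaper iterates over chars.

```rust
/// Mimics the lossy check: the char is cast to u8 first, so only its low
/// 8 bits are inspected and non-ASCII chars can alias ASCII punctuation.
/// (Hypothetical helper, not the actual Zed function.)
fn needs_escape_lossy(c: char) -> bool {
    (c as u8).is_ascii_punctuation()
}

/// Possible fix (an assumption): test the char itself, so only genuine
/// ASCII punctuation matches.
fn needs_escape(c: char) -> bool {
    c.is_ascii_punctuation()
}

/// Hypothetical stand-in for the escaper: prefixes a backslash to every
/// char for which `needs` returns true.
fn escape_with(s: &str, needs: fn(char) -> bool) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if needs(c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}
```

With the lossy check, escape_with("про #1", needs_escape_lossy) escapes п, р, and о (low bytes 0x3F, 0x40, 0x3E) as well as #, while the char-level check escapes only #.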
Zed version and system specs
Zed 1.0.1, macOS 15.7.4
Attach Zed log file
n/a
Relevant Zed settings / Keymap
n/a
(for AI issues) Model provider details
n/a
WSL
No