fix: correct wide character (CJK) rendering corruption#103
Conversation
Wide characters (CJK, emoji, etc.) occupy two terminal columns but are stored as a primary cell (Width=2) followed by a zero-width placeholder (Width=0). Two independent code paths failed to account for this: 1. transformLine optimization paths (ICH, DCH, ECH, REP, EL1, putRange same-cell skip) assume each cell is one column and corrupt wide characters by shifting, erasing, or skipping individual columns. Fix: add transformLineWide that bypasses all optimizations for lines containing wide characters. 2. relativeCursorMove overwrite optimization writes cell content to move the cursor rightward. When traversing a zero-width placeholder, it emits a space that destroys the wide character on the terminal. Since curbuf is not updated by cursor movement, the damage is never repaired. Fix: disable overwrite for lines containing wide characters.
5b019e7 to
64062a4
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #103 +/- ##
==========================================
+ Coverage 58.41% 58.55% +0.13%
==========================================
Files 52 52
Lines 6695 6746 +51
==========================================
+ Hits 3911 3950 +39
- Misses 2539 2545 +6
- Partials 245 251 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
In TerminalRenderer.transformLine, the for loop that skips zero-width trailing cells of wide characters reads `next` once before the loop but never re-reads it inside the loop body. When next.IsZero() returns true, n increments but next stays the same pointer, causing an infinite loop that pegs CPU at 100%. This was observed in production with a Bubble Tea TUI rendering CJK content, where the goroutine stuck in transformLine consumed a full CPU core indefinitely. Fix: add `next = newLine.At(n + 1)` inside the loop to advance the pointer, matching the pattern used in the symmetric backward loop (lines 927-934). Note: PR charmbracelet#103 works around this by routing wide-char lines to a new transformLineWide function, but the underlying bug in this loop remains and should still be fixed. Add test that reproduces the infinite loop (hangs without fix, passes with fix).
In TerminalRenderer.transformLine, the for loop that skips zero-width trailing cells of wide characters reads `next` once before the loop but never re-reads it inside the loop body. When next.IsZero() returns true, n increments but next stays the same pointer, causing an infinite loop that pegs CPU at 100%. This was observed in production with a Bubble Tea TUI rendering CJK content, where the goroutine stuck in transformLine consumed a full CPU core indefinitely. Fix: add `next = newLine.At(n + 1)` inside the loop to advance the pointer, matching the pattern used in the symmetric backward loop (lines 927-934). Note: PR charmbracelet#103 works around this by routing wide-char lines to a new transformLineWide function, but the underlying bug in this loop remains and should still be fixed. Add test that reproduces the infinite loop (hangs without fix, passes with fix).
…ero (#109) In TerminalRenderer.transformLine, the for loop that skips zero-width trailing cells of wide characters reads `next` once before the loop but never re-reads it inside the loop body. When next.IsZero() returns true, n increments but next stays the same pointer, causing an infinite loop that pegs CPU at 100%. This was observed in production with a Bubble Tea TUI rendering CJK content, where the goroutine stuck in transformLine consumed a full CPU core indefinitely. Fix: add `next = newLine.At(n + 1)` inside the loop to advance the pointer, matching the pattern used in the symmetric backward loop (lines 927-934). Note: PR #103 works around this by routing wide-char lines to a new transformLineWide function, but the underlying bug in this loop remains and should still be fixed. Add test that reproduces the infinite loop (hangs without fix, passes with fix). Co-authored-by: Chronostasys <Chronostasys@users.noreply.github.com>
|
Thanks @aymanbagabas! I took a look at #109 — it fixes the infinite loop caused by
So #109 is a complementary fix for the underlying loop bug, but it doesn't cover the full scope of rendering corruption that this PR addresses. Would you be open to reconsidering? |
These optimizations don't fire on wide cells, they're only executed on single cells.
This was fixed in 9b00276. Now |
Summary
Wide characters (CJK, emoji, etc.) intermittently disappear or show only background color during streaming terminal updates (e.g. when an LLM streams CJK content character-by-character in a Bubble Tea TUI). This is caused by two independent bugs in the rendering pipeline.
Root Causes
1.
transformLineoptimization paths corrupt wide characterstransformLine()has multiple column-based optimizations (ICH, DCH, ECH, REP, EL1,putRangesame-cell skip) that assume each cell occupies exactly 1 terminal column. Wide characters occupy 2 columns: column N holds the real cell (Width=2), column N+1 holds a zero-width placeholder (Width=0, IsZero=true). These optimizations shift, erase, or skip individual columns, splitting wide characters and producing corrupted output.Fix: When either old or new line contains wide chars,
transformLinedelegates to a newtransformLineWidefunction that bypasses all optimizations and uses the simplest possible approach — find the diff range, emit every cell viaputCell, optionally clear to end of line, and copy newbuf to curbuf.2.
relativeCursorMoveoverwrite optimization corrupts wide charactersWhen moving the cursor rightward,
relativeCursorMovecan "overwrite" cells fromnewbufinstead of emitting escape sequences (e.g., writinghellois shorter than\x1b[5C). The overwrite loop handlesWidth == 0cells (zero-width placeholders) by writing a space, which destroys the wide character on the terminal. Sincecurbufis not updated during cursor movement, the damage is never repaired on subsequent renders.Fix: Before the overwrite loops, check if the target line has any wide characters. If so, disable overwrite and fall back to escape sequences (
CursorForward).Approach
This fix is deliberately conservative — it disables optimizations for lines containing wide characters rather than trying to make each optimization wide-char-aware. This is simpler, easier to verify, and avoids introducing subtle edge cases into the optimization paths.
Testing
Added 6 test functions covering:
All existing tests continue to pass.