Sidebar Terminal: Korean (CJK) IME input not delivered to PTY #1272

@ldybob

Summary

Sidebar Terminal (xterm.js v5.3.0) has two separate Korean / CJK bugs:

  1. Input: Korean (CJK) text typed via IME composition does not reach the PTY correctly.
  2. Output: Korean text rendered by xterm.js is visually broken — overlapping glyphs, garbled fragments, cursor misalignment.

Other input/output (English, ASCII, control sequences) works perfectly.

Environment

  • gstack version: 1.20.0.0
  • OS: WSL2 (Ubuntu 24.04.4 LTS) under Windows 11, WSLg active (DISPLAY=:0, WAYLAND_DISPLAY=wayland-0)
  • Browser: Playwright Chromium 145.0.7632.6 (auto-launched by $B connect)
  • IME: Microsoft 한글 IME (Windows 11 default Korean IME, passed through WSLg)
  • xterm: bundled at extension/lib/xterm.js (node_modules/xterm package version 5.3.0)

Bug 1 — Input: IME composition not delivered to PTY

Reproduction

  1. /connect-chrome to launch GStack Browser
  2. Open Side Panel → Terminal pane loads with live Claude PTY
  3. Switch IME to Korean (e.g., Right-Alt or Win+Space)
  4. Type 안녕하세요

Expected: Composed Korean characters arrive at the PTY and Claude sees them.

Actual: Either nothing reaches the PTY, or partial composition fragments leak through, breaking the input. English input via the same terminal works perfectly. Same input via clipboard paste (Ctrl+Shift+V) works perfectly.

Root cause analysis

extension/sidepanel-terminal.js:199 wires PTY input via:

term.onData((data) => {
  if (ws && ws.readyState === WebSocket.OPEN) {
    ws.send(new TextEncoder().encode(data));
  }
});

xterm.js v5.3.0 routes IME composition through its internal CompositionHelper, which is supposed to fire onData only on compositionend with the final composed string. In practice, the helper has known interaction bugs in some Chromium contexts (chrome-extension iframes, Korean Microsoft IME on Windows/WSLg) where:

  1. input events fire before compositionend, sending individual hangul jamo one at a time as the user types,
  2. compositionend then fires with the final composed character, doubling the input,
  3. or the textarea loses focus mid-composition because of competing focus handlers, dropping the entire composition.

Suggested fix

Wrap term.onData with explicit composition handling on the underlying textarea (xterm exposes it via term.textarea in v5.x):

let composing = false;
const ta = term.textarea;
if (ta) {
  ta.addEventListener('compositionstart', () => { composing = true; });
  ta.addEventListener('compositionend', (e) => {
    composing = false;
    if (e.data && ws && ws.readyState === WebSocket.OPEN) {
      ws.send(new TextEncoder().encode(e.data));
    }
  });
}
term.onData((data) => {
  if (composing) return;  // suppress partial input events
  if (ws && ws.readyState === WebSocket.OPEN) {
    ws.send(new TextEncoder().encode(data));
  }
});

This pattern is the standard workaround documented in xterm.js issues like xtermjs/xterm.js#3545 and similar.
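
One caveat worth testing with this kind of gate: if xterm's own composition handling also emits the committed string through onData right after compositionend, sending e.data from the listener would deliver the text twice. Below is a minimal sketch of a gate that drops that single echo. CompositionGate and its method names are hypothetical (not part of xterm.js or gstack), and whether the echo actually occurs in this environment is an assumption to verify.

```typescript
// Hypothetical helper: illustrative only, not an xterm.js or gstack API.
type Send = (text: string) => void;

class CompositionGate {
  private composing = false;
  private pendingEcho: string | null = null;
  private readonly send: Send;

  constructor(send: Send) {
    this.send = send;
  }

  // Call from the textarea's compositionstart listener.
  compositionStart(): void {
    this.composing = true;
  }

  // Call from the textarea's compositionend listener with e.data.
  compositionEnd(finalText: string): void {
    this.composing = false;
    if (finalText) {
      this.send(finalText);
      // Assumption: the terminal may re-emit this exact string via onData.
      this.pendingEcho = finalText;
    }
  }

  // Call from term.onData.
  data(text: string): void {
    if (this.composing) return; // suppress partial jamo during composition
    if (this.pendingEcho !== null) {
      const echo = this.pendingEcho;
      this.pendingEcho = null;
      if (text === echo) return; // drop the duplicated commit once
    }
    this.send(text);
  }
}
```

Wiring would follow the snippet above: the two composition listeners call compositionStart() and compositionEnd(e.data), term.onData calls data(), and the send callback does the ws.send. A real implementation would probably also expire pendingEcho after a short timeout so that a later, legitimately identical input is not dropped.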


Bug 2 — Output: Korean text from PTY renders broken

Reproduction

  1. Same setup as Bug 1.
  2. Trigger any output containing Korean text. Easiest: paste 한국어 출력 테스트 into the terminal with Ctrl+Shift+V (paste path bypasses IME). The shell echoes back via PTY.
  3. Or have Claude print Korean: echo '안녕하세요 반갑습니다'.

Expected: Hangul characters render cleanly, double-width cells, cursor advances correctly after each char.

Actual: Characters appear fragmented or overlapping, cursor position drifts, occasional replacement glyphs (▯, ?). The same byte sequence rendered in a real terminal (Windows Terminal, GNOME Terminal under WSLg) is fine.
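
For reference, the expected cursor advance can be estimated with a rough width check. This is a minimal sketch that covers only a few Hangul and CJK ranges, not the full Unicode East Asian Width table and not xterm.js's own width logic. isWideCodePoint and expectedColumns are hypothetical names.

```typescript
// Rough sketch: only the ranges relevant to this report are listed.
function isWideCodePoint(cp: number): boolean {
  return (
    (cp >= 0x1100 && cp <= 0x115f) || // Hangul Jamo (leading consonants)
    (cp >= 0x3130 && cp <= 0x318f) || // Hangul Compatibility Jamo block
    (cp >= 0xac00 && cp <= 0xd7a3) || // Hangul Syllables
    (cp >= 0x4e00 && cp <= 0x9fff)    // CJK Unified Ideographs
  );
}

// Number of terminal columns the cursor should advance for `text`.
function expectedColumns(text: string): number {
  let cols = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    cols += isWideCodePoint(cp) ? 2 : 1;
  }
  return cols;
}
```

Under these assumptions, expectedColumns("안녕하세요") is 10, so the cursor should sit 10 columns to the right after the five syllables are echoed.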

Root cause analysis

There are two compounding issues on the output path:

2a. Font fallback breaks cell-width math

extension/sidepanel-terminal.js:156-164 declares the xterm font as:

fontFamily: '"JetBrains Mono", "SF Mono", Menlo, monospace',

None of these fonts ship Hangul glyphs. Chromium falls back to a system Korean font (Malgun Gothic on Windows, Noto Sans CJK on Linux), but xterm.js v5.3.0 measures the cell width once at startup using the primary font. When the actual rendered glyph from the fallback font has a different advance width, characters either overflow their cell (overlapping) or under-fill it (cursor drifts left of the next char).

Mirror config in sidepanel.css:41:

--font-mono: 'JetBrains Mono', 'SF Mono', 'Fira Code', 'Cascadia Code', monospace;

Same issue — no CJK fallback declared.

Fix: append explicit CJK fallbacks (a monospace CJK font first) ahead of the generic monospace keyword:

fontFamily: '"JetBrains Mono", "SF Mono", Menlo, "Noto Sans Mono CJK KR", "Malgun Gothic", monospace',

This forces a font Chromium can measure against during the initial cell-width calc, giving xterm consistent advance widths.

2b. PTY → WebSocket binary chunks split mid-codepoint

browse/src/terminal-agent.ts:364-366:

const proc = spawnClaude(session.cols, session.rows, (chunk) => {
  try { ws.sendBinary(chunk); } catch {}
});

The PTY emits raw byte chunks with no regard for character boundaries. A 3-byte hangul codepoint can be split across two adjacent chunks (e.g., its first 2 bytes end one chunk and its last byte starts the next). xterm receives the partial sequences as binary, decodes them via its UTF-8 parser, and emits replacement characters or fragments for the orphaned bytes.
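
The failure mode can be reproduced outside the extension with the standard TextDecoder API. This is only an illustration of what happens when each chunk is decoded on its own. Whether the sidebar's receive path decodes per message like this is an assumption, and the function names are hypothetical.

```typescript
// "한" is U+D55C, encoded in UTF-8 as the three bytes ED 95 9C.
const splitChunks: Uint8Array[] = [
  new Uint8Array([0xed, 0x95]), // end of one PTY chunk
  new Uint8Array([0x9c]),       // start of the next chunk
];

// Decoding each chunk independently loses the character.
function decodeIndependently(chunks: Uint8Array[]): string {
  return chunks.map((c) => new TextDecoder("utf-8").decode(c)).join("");
}

// A single streaming decoder carries the partial sequence across chunks.
function decodeStreaming(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder("utf-8");
  let out = "";
  for (const c of chunks) {
    out += decoder.decode(c, { stream: true });
  }
  return out + decoder.decode(); // flush anything still buffered
}
```

With splitChunks as input, decodeIndependently returns a string containing U+FFFD replacement characters, while decodeStreaming returns "한".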

This is the same class of bug PR #1007 fixed for the sidebar-agent stdout path. The fix did not cover the terminal-agent → WebSocket path, which is the surface used by the Sidebar Terminal in v1.14.0.0+.

Fix: in the spawn callback, hold back any trailing incomplete UTF-8 sequence and forward only the bytes up to the last complete codepoint boundary:

let leftover: Buffer = Buffer.alloc(0);
const proc = spawnClaude(session.cols, session.rows, (chunk) => {
  const combined = Buffer.concat([leftover, Buffer.from(chunk)]);
  // Find the last index where a UTF-8 codepoint ends. Look back at most 3 bytes.
  let safeEnd = combined.length;
  for (let i = combined.length - 1; i >= Math.max(0, combined.length - 3); i--) {
    const b = combined[i];
    if ((b & 0x80) === 0) { safeEnd = i + 1; break; }              // ASCII
    if ((b & 0xC0) === 0x80) continue;                             // continuation
    const expected = (b & 0xE0) === 0xC0 ? 2 : (b & 0xF0) === 0xE0 ? 3 : 4;
    safeEnd = (combined.length - i >= expected) ? combined.length : i;
    break;
  }
  const flush = combined.slice(0, safeEnd);
  leftover = combined.slice(safeEnd);
  if (flush.length) {
    try { ws.sendBinary(flush); } catch {}
  }
});

This is the same buffering pattern that sidebar-agent already uses after #1007.
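
To make the boundary logic testable without a PTY, it could be pulled into a small helper. The sketch below uses the same look-back rule on plain Uint8Array values and adds a flush() for any bytes still held when the process exits. Utf8ChunkAligner and completePrefixLength are hypothetical names, not existing gstack code.

```typescript
// Hypothetical helper: illustrative names, not from the gstack codebase.
// Returns how many leading bytes can be forwarded without cutting a codepoint.
function completePrefixLength(bytes: Uint8Array): number {
  // Scan back at most 3 bytes for the lead byte of an unfinished sequence.
  for (let i = bytes.length - 1; i >= Math.max(0, bytes.length - 3); i--) {
    const b = bytes[i];
    if ((b & 0x80) === 0) return i + 1;   // ASCII byte ends a codepoint
    if ((b & 0xc0) === 0x80) continue;    // continuation byte, keep scanning
    const need = (b & 0xe0) === 0xc0 ? 2 : (b & 0xf0) === 0xe0 ? 3 : 4;
    return bytes.length - i >= need ? bytes.length : i;
  }
  return bytes.length; // no lead byte in the last 3 bytes: nothing to hold back
}

class Utf8ChunkAligner {
  private leftover = new Uint8Array(0);

  // Returns the bytes that are safe to forward now.
  push(chunk: Uint8Array): Uint8Array {
    const combined = new Uint8Array(this.leftover.length + chunk.length);
    combined.set(this.leftover, 0);
    combined.set(chunk, this.leftover.length);
    const end = completePrefixLength(combined);
    this.leftover = combined.slice(end);
    return combined.slice(0, end);
  }

  // Returns whatever is still held back, e.g. when the PTY closes.
  flush(): Uint8Array {
    const rest = this.leftover;
    this.leftover = new Uint8Array(0);
    return rest;
  }
}
```

In the spawn callback this would reduce to forwarding aligner.push(chunk) when it is non-empty, with one aligner per session so leftover bytes from different terminals never mix.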


Workarounds users can apply today

For input:

  • Type Korean elsewhere and paste with Ctrl+Shift+V (paste handler bypasses IME).
  • Use the toolbar Cleanup button or Inspector "Send to Code" to inject pre-written text.

For output:

  • No clean workaround inside the sidebar terminal. Open a real terminal (Windows Terminal under WSLg, or use Claude Code in a regular terminal session) for any work involving Korean output.

Why this matters

Korean and other CJK developers using gstack on WSL2 / Windows can't drive or read the Sidebar Terminal in their native language. Both input and output are broken, so the v1.14.0.0 win ("interactive REPL right in the browser") is effectively unavailable for native Korean workflows.

Bug 2b in particular falls into the class of bug that PR #1007 was meant to fix across the board. That fix missed the terminal-agent path, likely because terminal-agent shipped in v1.14.0.0 after #1007 was authored.

Happy to test a fix branch on WSL2 + Korean Microsoft IME if you want a confirmed reproduction baseline.
