feat(desktop): per-message automatic RTL/bidi text direction#44065

Closed
Adolanium wants to merge 1 commit into NousResearch:main from Adolanium:feat/desktop-rtl-bidi

Conversation

Adolanium (Contributor) commented Jun 11, 2026

What does this PR do?

Adds automatic bidirectional text support to the desktop app. Hebrew, Arabic and other RTL-script messages currently render left-aligned with mangled punctuation order, because nothing in the renderer sets a text direction: the user bubble hardcodes text-left (thread.tsx, USER_BUBBLE_BASE_CLASS), the markdown surface inherits LTR, and the composer is a plain LTR contenteditable.

The fix resolves direction per block with the first-strong-character heuristic, applied to the block's prose:

  • every prose block in assistant markdown (p, h1-h4, li, ul/ol, blockquote) and every user message text segment gets a dir attribute resolved from its own text, so mixed Hebrew/English answers render each paragraph correctly
  • code spans and math don't get a vote: a technical RTL message often starts with a command (./run.sh ... followed by an Arabic/Hebrew explanation), and plain first-strong would flip that whole block to LTR; blocks with no prose at all fall back to dir="auto"
  • inline code and KaTeX output inside resolved blocks are pinned to an isolated LTR run, so their neutrals (dots, slashes, dashes) keep their order inside RTL sentences; fenced code blocks keep LTR entirely
  • both composers (main and inline edit) carry dir="auto" and re-resolve on every input, so the box flips as you type
  • text-start accompanies the dir attribute where an ancestor pins text-align (the user bubble's text-left), so RTL blocks actually right-align
  • the blockquote border moves from border-l-2/pl-3 to the logical border-s-2/ps-3, and list markers already follow direction because Tailwind Typography uses padding-inline-start

LTR content resolves to LTR, so English-only users see no change. No new dependencies, no settings, no locale work. App chrome stays LTR.
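
As a rough illustration of the heuristic described above, the sketch below resolves a block's direction from its prose only. It is not the PR's text-direction.ts: the names `stripNonProse`, `firstStrong` and `resolveBlockDirection` are hypothetical, the regex ranges are a coarse stand-in for the Unicode strong bidi classes, and stripping backtick spans and `$...$` math from a string is a simplification of whatever the markdown renderer actually exposes.

```typescript
// Hypothetical sketch; names and ranges are illustrative assumptions.
type Direction = "ltr" | "rtl" | "auto";

// Coarse approximations of strong RTL / strong LTR characters.
// This is NOT the full Unicode Bidi_Class table.
const STRONG_RTL = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFC]/;
const STRONG_LTR =
  /[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03FF\u0400-\u04FF]/;

// Replace inline code spans and $...$ math with a space so they cannot vote.
function stripNonProse(markdown: string): string {
  return markdown.replace(/`[^`]*`/g, " ").replace(/\$[^$]*\$/g, " ");
}

// Plain first-strong: the first strongly directional character decides.
function firstStrong(text: string): Direction {
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (STRONG_RTL.test(ch)) return "rtl";
    if (STRONG_LTR.test(ch)) return "ltr";
  }
  return "auto"; // no strong character: let the browser decide via dir="auto"
}

// Direction of a block, resolved from its prose only.
function resolveBlockDirection(markdown: string): Direction {
  return firstStrong(stripNonProse(markdown));
}
```

Under these assumptions, a block that starts with an inline command and continues in Hebrew resolves to rtl through `resolveBlockDirection`, while `firstStrong` on the raw text would pick up the command's first Latin letter and return ltr. A block containing only code resolves to auto.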

Related Issue

Fixes #44150 (filed shortly after this PR was opened; requests exactly this behavior).

Related work on other surfaces, no overlap with this PR:

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • apps/desktop/src/lib/text-direction.ts: first-strong direction detection over a caller-chosen slice of text
  • apps/desktop/src/components/assistant-ui/markdown-text.tsx: prose block component overrides resolve dir from their prose (code spans and math excluded) with text-start; logical border/padding on blockquote
  • apps/desktop/src/components/assistant-ui/user-message-text.tsx: same per text segment, from the segment's non-code text; fences untouched so code stays LTR
  • apps/desktop/src/components/assistant-ui/thread.tsx: dir="auto" + text-start on the inline edit composer's contenteditable
  • apps/desktop/src/app/chat/composer/index.tsx: dir="auto" on the main composer contenteditable
  • apps/desktop/src/components/assistant-ui/message-direction.test.tsx: pins the contract for user and assistant messages: prose blocks carry the resolved direction, code-first blocks follow their prose, code blocks never carry one
  • apps/desktop/src/styles.css: inline code and KaTeX output inside direction-resolved blocks are pinned to an isolated LTR run, so their neutrals (dots, slashes, dashes) keep their order inside RTL sentences; fenced code cards are pinned LTR so their chrome and code lines don't mirror when they sit inside an RTL list item
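
The per-segment handling of user messages listed above can be pictured as follows. This is a hedged sketch, not the code in user-message-text.tsx: `annotateSegments`, the `Segment` shape and the injected `resolve` callback are hypothetical, and fence detection is reduced to lines that start with three backticks.

```typescript
// Hypothetical sketch; the real component may segment messages differently.
type Direction = "ltr" | "rtl" | "auto";

interface Segment {
  kind: "text" | "fence";
  content: string;
  dir?: Direction; // only ever set on text segments
}

// A line that opens or closes a fence: optional indent, then three backticks.
const FENCE_LINE = /^\s*\u0060{3}/;

function annotateSegments(
  message: string,
  resolve: (text: string) => Direction
): Segment[] {
  const segments: Segment[] = [];
  let buffer: string[] = [];
  let inFence = false;

  // Emit the buffered lines as one segment; fences never receive a dir.
  const flush = (kind: Segment["kind"]): void => {
    if (buffer.length === 0) return;
    const content = buffer.join("\n");
    segments.push(
      kind === "text" ? { kind, content, dir: resolve(content) } : { kind, content }
    );
    buffer = [];
  };

  for (const line of message.split("\n")) {
    if (FENCE_LINE.test(line)) {
      if (inFence) {
        buffer.push(line); // closing marker belongs to the fence
        flush("fence");
      } else {
        flush("text"); // text before the fence becomes its own segment
        buffer.push(line);
      }
      inFence = !inFence;
    } else {
      buffer.push(line);
    }
  }
  flush(inFence ? "fence" : "text"); // an unclosed fence stays a fence
  return segments;
}
```

Passing the direction resolver in as a callback keeps the segmentation independent of how direction is detected, so each text segment is resolved from its own content while fenced code is left without a dir attribute.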

How to Test

  1. cd apps/desktop && npx vitest run --environment jsdom src/components/assistant-ui/message-direction.test.tsx - 4 passed
  2. npm run typecheck - clean
  3. npx eslint on the touched files - clean (repo-wide npm run lint has pre-existing findings on main, none in these files)
  4. Manual: run the desktop app, send a Hebrew message such as מה שלומך? - the bubble right-aligns; ask for a Hebrew answer with a list and a code block - paragraphs, headings and bullets right-align with markers on the right, the code block stays LTR
  5. Send a message that starts with code, e.g. `./scripts/run.sh -v` מה הפקודה הזאת עושה? - the block still right-aligns (code doesn't vote on direction) and the chip keeps its internal order
  6. Type Hebrew in the composer - it right-aligns as you type; delete it and type English - it flips back
  7. Click an existing Hebrew user bubble to edit it - the edit box opens right-aligned
  8. Send English messages - rendering is visually identical to main (LTR prose resolves to LTR)

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass (renderer-only change; desktop vitest suite run instead, failures on main are unchanged)
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: Windows 11, dev build (npm run dev)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A (rendering only, no platform code)
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

Before / after on the same conversation (Hebrew answer with heading, list and code block; English follow-up unchanged):

Before: (screenshot: before-final2-pr)

After: (screenshot: after-final2-pr)

Composer auto-directing while typing Hebrew: (screenshot: composer-rtl-pr)

A Hebrew paragraph that starts with code: plain first-strong would resolve it LTR (before); resolved from prose it follows the sentence, and the chips keep their internal order (after):

Before: (screenshot: rtl-codefirst-before)

After: (screenshot: rtl-codefirst-after)

alt-glitch added the type/feature (New feature or request) and P3 (Low: cosmetic, nice to have) labels Jun 11, 2026

Adolanium (Contributor, Author) commented Jun 11, 2026

Linking #44150 here too, since it was filed shortly after this PR was opened and requests exactly this behavior.

Since alt-glitch noted on #44169 that a maintainer should pick one of the two desktop RTL approaches, here is a factual comparison to make that easier. Both PRs rely on first-strong-character direction detection; the difference is the mechanism. This PR resolves a dir attribute per message block from the block's prose, whereas #44169 applies CSS unicode-bidi: plaintext.

  • Code-first messages: a technical RTL message often starts with a command (./run.sh ... followed by a Hebrew/Arabic explanation). Plain first-strong sees the code's first Latin letter and resolves the whole block LTR, misaligning the entire sentence. unicode-bidi: plaintext is hard-wired to first-strong (UAX#9 P2), so this cannot be fixed in pure CSS. This PR resolves direction from the block's prose: code spans and math don't get a vote, and blocks with no prose fall back to dir="auto".
  • Direction in the DOM: dir exposes the resolved direction, so :dir() selectors, assistive tech and any future RTL styling can key off it. unicode-bidi is invisible to all of those.
  • Lists: with the dir attribute on ul/ol, RTL lists render with markers on the right (Tailwind Typography already uses padding-inline-start, so it flips for free). #44169 intentionally keeps list layout LTR, so fully Arabic or Hebrew lists keep left-side bullets. The same applies to the blockquote border, which flips to the logical side here.
  • Inline code inside RTL sentences: both PRs pin inline code and KaTeX output to an isolated LTR run, so paths, flags and dotted names keep their internal order.
  • Per-line granularity: plaintext resolves direction per line within a block, whereas this PR resolves per block/segment. This matters only for multi-line user messages that alternate direction line by line.
  • Tests: this PR adds a DOM contract test suite (4 tests) covering both message paths, including code-first blocks. #44169 ships without tests.
  • Validation: the screenshots above are from a live Windows build, including the composer flipping as you type and a before/after of the code-first case.

Whichever approach gets picked, glad to converge the remaining edge cases from the other one.

Adolanium force-pushed the feat/desktop-rtl-bidi branch from 8037b08 to 12a7271 on June 11, 2026, 15:07
Resolve each message block's direction from its prose (code spans and
math do not vote, blocks with no prose fall back to dir="auto") and
right-align RTL blocks via text-align:start. Inline code and KaTeX
output inside resolved blocks render as isolated LTR runs so their
neutrals keep their order in RTL sentences. Both composers carry
dir="auto" and flip as you type. Code blocks stay LTR and LTR content
renders unchanged.

OutThisLife (Collaborator) commented:

Superseded by #44596, which unifies this with #44169 into a CSS-only change. I verified in Chromium that unicode-bidi: isolate on inline code already removes it from the paragraph's first-strong direction vote — so the code-first case this PR's text-direction.ts + dir plumbing solves is handled in pure CSS, no JS needed. Credited you via Co-authored-by. Thanks for the thorough groundwork on the code-first edge case — that's what pointed at the right fix.

OutThisLife (Collaborator) commented:

Superseded by #44596 (unified CSS-only RTL/bidi approach). Closing per the superseding PR.

ilyanaxo pushed a commit to ilyanaxo/hermes-agent that referenced this pull request Jun 12, 2026
Arabic/Hebrew/Persian/Urdu chat text rendered left-to-right and
left-aligned, and mixed RTL/English technical messages (the common case)
read backwards. Resolve each chat block's base direction from its own
first strong character (UAX#9) with pure CSS, scoped to the chat
surfaces only:

- `unicode-bidi: plaintext` + `text-align: start` on assistant prose
  blocks (p, h1-h6, li, blockquote), the user bubble's text lines, and
  both composers (main + edit share the composer-rich-input slot). RTL
  blocks read and right-align RTL; English stays LTR; mixed
  conversations resolve per block. `text-align: start` is required
  because the user bubble hardcodes `text-left`.
- Inline `code` and KaTeX are pinned `direction: ltr; unicode-bidi:
  isolate`, so the bidi first-strong heuristic skips them: a sentence
  that *starts* with a command (`./run.sh ...`) followed by Arabic
  still resolves RTL, and the command's own neutrals keep their order.
- Fenced code surfaces (code-card, user fences) are pinned LTR so they
  never mirror or right-align inside an RTL list item or blockquote.

`direction` is never forced, so app chrome, layout, and list indent
stay LTR per the issue's request not to flip the whole UI. English-only
content is byte-for-byte unchanged.

Salvaged and unified from NousResearch#44065 and NousResearch#44169; verified in Chromium that
isolate removes inline code from the paragraph direction vote (the
code-first case), making the JS dir-resolution in NousResearch#44065 unnecessary.

Fixes NousResearch#44150

Co-authored-by: Adolanium <Adolanium@users.noreply.github.com>
Co-authored-by: Adalsteinn Helgason <AIalliAI@users.noreply.github.com>
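
Why isolation is enough follows from UAX#9 rules P2 and P3, which skip everything between an isolate initiator and its matching PDI when searching for the first strong character. The sketch below models that search on a plain string, on the assumption that `direction: ltr; unicode-bidi: isolate` behaves like wrapping the inline code in LRI ... PDI. The function name and the coarse letter ranges are illustrative and are not taken from either PR or from a browser engine.

```typescript
// Illustrative model of UAX#9 P2/P3; not browser or PR code.
const LRI = "\u2066"; // left-to-right isolate
const RLI = "\u2067"; // right-to-left isolate
const FSI = "\u2068"; // first-strong isolate
const PDI = "\u2069"; // pop directional isolate

// Coarse approximations of strong RTL / strong LTR letters (assumption).
const RTL_LETTER = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFC]/;
const LTR_LETTER =
  /[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03FF\u0400-\u04FF]/;

function paragraphDirection(text: string): "ltr" | "rtl" {
  let isolateDepth = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (ch === LRI || ch === RLI || ch === FSI) {
      isolateDepth++; // P2: start skipping until the matching PDI
      continue;
    }
    if (ch === PDI) {
      if (isolateDepth > 0) isolateDepth--;
      continue;
    }
    if (isolateDepth > 0) continue; // isolated content does not vote
    if (RTL_LETTER.test(ch)) return "rtl";
    if (LTR_LETTER.test(ch)) return "ltr";
  }
  return "ltr"; // P3: no strong character found, default to LTR
}

// Code-first sentence, with and without the command isolated.
const command = "./run.sh -v";
const arabic = " \u0645\u0627 \u0647\u0630\u0627\u061F";
const isolated = paragraphDirection(LRI + command + PDI + arabic);
const bare = paragraphDirection(command + arabic);
```

In this model `isolated` comes out as rtl and `bare` as ltr, which matches the behavior the commit message reports for Chromium: once the command is isolated, the Arabic prose supplies the first strong character.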
Development

Successfully merging this pull request may close these issues.

[Feature]: Native RTL support for Arabic and mixed RTL/LTR text in Hermes Chat
