fix(desktop): render LLM math in chat messages by SivanCola · Pull Request #3251 · esengine/DeepSeek-Reasonix

SivanCola · 2026-06-05T11:23:00Z

Summary

Replace raw-string math normalization with an LLM-aware Markdown rendering pipeline for chat messages.
Render math-like inline/fenced code as KaTeX while preserving real code in CodeViewer/code spans.
Normalize confirmed LaTeX math for KaTeX compatibility and keep source/copy affordances.
Add golden-case tests and dedicated test typecheck config for math classification and normalization.

Testing

pnpm run build
pnpm run test:all
pnpm run test:todo-visibility

Notes

Removed unused remark-math / rehype-katex dependencies from the desktop frontend.
Kept the latest main-v2 deferred Markdown rendering behavior via useDeferredValue.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4f10e815e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

lightfront · 2026-06-05T18:13:05Z

Review of PR #3251

Both reported bugs are fixed — I verified end-to-end with the chiral-decomposition example and the \|x\| case. 17KB of clean KaTeX HTML, all four \underbrace braces present, zero katex-error spans. The custom remark-plugin + components architecture is a reasonable refactor.

I have three substantive concerns, one of them significant. Then a few smaller items.

1. The greedy walker breaks for any `$X…$Y` pattern (significant)

The walker in Markdown.tsx finds the first inner $ as the closer. For any string that has a $X followed later by a $Y (with the Y not closing a complete pair), the walker takes X… (everything up to that first inner $) as the math span and leaves $Y orphaned as plain text.

Real inputs that hit this:

Input	Walker result	Visual
`$X$Y`	math `X` + plain `Y`	`XY` ✗
`$5$6`	text `5` (number reject) + plain `6`	`56` (works by accident)
`$5 and $6`	math `5 and` + plain `6`	`5 and6` ✗
`$5 or $6`	math `5 or` + plain `6`	`5 or6` ✗
`$PATH=$PATH`	math `PATH=` + plain `PATH`	`PATH=PATH` ✗
`$PATH and $HOME`	math `PATH and` + plain `HOME`	`PATH andHOME` ✗
`key=$a : $b$`	math `a :` + plain `b`	`key=a :b` ✗
`$5 = $6`	math `5 =` + plain `6`	`5 =6` ✗
`cost $5 and a$6`	text `5 and a` (rejected)	plain ✓ (works by accident)

The classifier's letter-space-letter heuristic ([A-Za-z]\s+[A-Za-z]) catches some cases accidentally — $5 and a$6 works because d space a matches. But most cases in the table are broken because the heuristic only fires when the prose between the tokens has a letter-space-letter sequence, which isn't true for 5 or, 5 and, 5 =, etc.

Suggested fix (in findDollarClose): only form a math pair if the content between the $s has a strong math signal (any of: \command, ^, _, |, {, }, named math function). If the content is ambiguous, look for a later $ on the same line. If none works, bail out (return -1) and let the whole span be plain text.

A sketch:

function findDollarClose(s: string, start: number): number {
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\n") return -1;
    if (s[i] === "\\") { i += 1; continue; }
    if (s[i] === "$" && !isMathEscaped(s, i)) {
      const content = s.slice(start, i);
      // If the content has no strong math signal, this $ might not be
      // a math closer. Look for a later $ on the same line.
      if (!hasStrongMathSignal(content) && s.indexOf("$", i + 1) !== -1) {
        continue;
      }
      return i;
    }
  }
  return -1;
}

function hasStrongMathSignal(content: string): boolean {
  return /\\[A-Za-z]+\b/.test(content)      // \alpha, \frac, ...
      || /[\^_{}|]/.test(content)           // x^2, x_i, |x|, {a,b}
      || /\b(?:alpha|beta|gamma|sum|int|prod|lim|infty|sqrt|frac|sin|cos|tan|log|ln|max|min|partial|nabla|left|right)\b/.test(content);
}

For inputs without strong math signals, the walker looks for a later $ on the same line. If it finds one, it tries the longer span. If not, it bails. The net effect: $X…$Y patterns where the inner content doesn't look like math are left as plain text, and only clean math pairs ( $E=mc^2$ , $\alpha$ , $x^2$ ) get the math treatment.

The alternative is to delegate to remark-math's built-in $...$ parser, which already handles these cases correctly with a battle-tested currency guard and span rules. That's a bigger change but eliminates the custom-walker class of bugs entirely.

2. The classifier is too eager (smaller, but real)

In isLikelyInlineMath (Markdown.tsx), the final return true means any $...$ span that doesn't match an explicit reject pattern is treated as math. This causes regressions for paired-dollar non-math content:

$PATH$ → math italic (env var)
$USER$ , $HOME$ → math italic
$v1$ , $v2$ → math italic (version)
$PATH=$PATH$ → first half math, second half plain (the walker bug above compounds this)

In the old code these were left as plain text.

A simple "flip the default to false" is not the right fix — that would break the legitimate $x$ → math case. The classifier needs an explicit short-math-identifier allow-list to catch single-letter variables like x, y, n, Re, det, etc.

Suggested fix: make isLikelyInlineMath follow the same structure as isInlineMathLike in CodeOrMath.tsx (which the same PR gets right — explicit math signals, explicit code denylist, conservative default):

const SHORT_MATH_IDS = new Set([
  "x","y","z","n","m","k","p","q","r","s","t","u","v","w",
  "a","b","c","d","e","f","g","h","i","j","l","o",
  "Re","Im","det","tr","arg","dim","ker","vec","mat",
]);

const NON_MATH_PATTERNS = [
  /^[A-Z]{2,}$/,              // ALL_CAPS env vars
  /^v?\d+(\.\d+)*[a-z]?$/i,  // version strings
  /^[A-Z]:[/\\]/,              // Windows paths
  /^(?:npm|git|docker|ssh|curl|make|cmake)\s/i, // shell commands
];

function isLikelyInlineMath(math: string): boolean {
  if (!math || math !== math.trim() || math.includes("\n")) return false;
  if (math.includes("://") || math.includes("](")) return false;
  if (/^\$?\d+(?:\.\d+)?%?$/.test(math)) return false;
  for (const re of NON_MATH_PATTERNS) if (re.test(math)) return false;
  if (/\\[A-Za-z]+\b/.test(math)) return true;        // \alpha, \frac, ...
  if (/[\^_{}|]/.test(math)) return true;            // x^2, x_i, |x|, {a,b}
  if (/\b(?:alpha|beta|gamma|sum|int|prod|lim|infty|sqrt|frac|sin|cos|tan|log|ln|max|min|partial|nabla|left|right)\b/.test(math)) return true;
  if (SHORT_MATH_IDS.has(math)) return true;          // x, y, n, Re, det, ...
  if (/[A-Za-z]\s+[A-Za-z]/.test(math)) return false; // prose with spaces
  return false;                                        // default: not math
}

This is the same fix structure as isInlineMathLike (which the PR already gets right). The two classifiers should arguably share a single classifyMathContent function — they answer the same question with different rules, and that's a maintenance liability.

Input	Current	After fix
$x$	math ✓	math ✓
$E=mc^2$	math ✓	math ✓
$\alpha$	math ✓	math ✓
$PATH$	math ✗	text ✓
$USER$ , $HOME$	math ✗	text ✓
$v1$ , $v2$	math ✗	text ✓
$5$	text ✓	text ✓

3. Add an end-to-end visual regression test

The current test file (src/__tests__/math-golden.test.ts) only checks the strings produced by latexNormalizeForKatex and the boolean classifiers. It does not run those strings through katex.renderToString to verify they actually produce valid output. This means:

A future change to latexNormalizeForKatex could silently break the KaTeX render and every test would still pass.
The original \| → "italic vert" bug would not be caught by the new test suite.

Please add at least one test that runs the canonical chiral decomposition through the full pipeline:

test("chiral decomposition renders end-to-end", () => {
  const src = String.raw`U(3)_L \times U(3)_R \;=\; \underbrace{U(1)_V}_{\text{baryon #}}\times \underbrace{SU(3)_V}_{\text{isospin, SU(3) flavor}} \;\times\; \underbrace{U(1)_A}_{\text{axial}}\times \underbrace{SU(3)_A}_{\text{chiral }SU(3)}`;
  const html = katex.renderToString(latexNormalizeForKatex(src), {
    throwOnError: true,
    displayMode: true,
  });
  assert(!html.includes("katex-error"));
  for (const label of ["baryon", "isospin", "axial", "chiral"]) {
    assert(html.includes(label), `expected label "${label}" in rendered output`);
  }
});

test("\\| is preserved as double-bar norm", () => {
  const html = katex.renderToString(latexNormalizeForKatex(String.raw`\|x\|`), {
    throwOnError: true,
  });
  const text = html.replace(/<[^>]+>/g, "").trim();
  assert(text.includes("∥"), "expected double-bar norm glyph in output");
  assert(!text.toLowerCase().includes("vert"), "rendered output should not contain literal 'vert'");
});

Both bugs you fixed would have been caught by tests like these.

Smaller items (not blockers)

4. latexNormalizeForKatex always emits a trailing space after \vert

if (source[i] === "|") { out += "\\vert "; i += 1; continue; }

For A | B this becomes \vert B (double space). Visually identical, but ugly in the source. One-line fix: make the space conditional on the next character not being whitespace, or drop it entirely.

5. CSS adds 66 lines of .inline-math__source / .block-math styling

The hover-source popup and copy button are nice UX features, but they're a feature add bundled with a bug fix. Consider splitting into a separate PR so the bug fix is easier to backport and review.

6. The two commits could be squashed

The first commit (f4f10e8) introduces a small bug — inlineMathNode and blockMathNode create hProperties: {} but the components read value from the props, so the first commit's components receive undefined as source. The second commit (f9f4eb6) fixes it by passing { value } in hProperties. Consider squashing so git log doesn't carry a known-broken intermediate state.

Confirmed working (no action needed)

Both reported bugs are fixed end-to-end.
stripMathDelimiters handles all four delimiter pairs.
latexNormalizeForKatex correctly preserves \text{already\_escaped} and \tfrac{a}{b} (nested braces) without double-escaping.
isInlineMathLike in CodeOrMath.tsx is well-designed; false-positive risk for backtick-wrapped content is low.
The bundle gets smaller (drops rehype-katex and remark-math from prod deps).
looksLikeFormula is conservative for multi-line content and correctly requires an explicit LaTeX command for the single-line case.
The custom components (InlineMath, BlockMath) enable nice UX features that the old rehype-katex pipeline couldn't (hover-to-show-source, copy buttons, custom error fallback).

Summary

The two must-fixes are genuinely important — #1 (the walker bug) is a visible regression on a whole class of inputs that previously worked, and #2 (the classifier default) is the same class of bug, smaller in scope. Item #3 (the missing test) means a future refactor could silently re-introduce the original bugs.

If you push back on any of these I'd be curious to hear the reasoning. In particular, if you think the custom walker is the right design and you want to keep it, the fix in #1 is the minimum needed. If you'd rather not maintain a custom walker at all, the alternative is to keep remark-math + rehype-katex and just fix the two bugs with a small latexNormalize.ts helper (which is roughly what my fix did in ~80 lines).

That's the focused review. Three things to address before merge, in priority order: the walker bug (item 1, biggest impact), the classifier default (item 2, narrower), the missing end-to-end test (item 3, prevents future regressions). The rest is polish.

If you want, I can also write up a minimal counter-proposal — a latexNormalize.ts + tests only, ~80 lines, that fixes both reported bugs without rewiring the walker or replacing remark-math. The developer can pick between that and the PR based on their priorities.

esengine · 2026-06-06T11:41:44Z

Thanks for this, and for the careful math-rendering work (the golden tests are nice). Since this PR was opened, main-v2 has landed math rendering directly via the remark-math + rehype-katex plugins in Markdown.tsx, so the core feature is already in place. The custom BlockMath/InlineMath/latexNormalize approach in this PR now overlaps with and competes against that. Closing as superseded. If there is a concrete gap, for example support for non-standard (...) / [...] delimiters that remark-math does not handle by default, please open a small focused PR that augments the existing plugin pipeline. Thank you!

lightfront · 2026-06-06T15:18:25Z

I believe this is a misunderstanding. This PR IS for main-v2. The current rendering through the remark-math + rehype-katex plugins in Markdown.tsx in main-v2 has too many edge cases inappropriately treated or failed to be rendered at all. This PR vastly improved the treatment of these edge cases and thus is absolutely necessary.

fix(desktop): render LLM math in chat messages

f4f10e8

SivanCola requested a review from esengine as a code owner June 5, 2026 11:23

github-actions Bot added the v2 Go rewrite (1.x) — main-v2 branch, active development label Jun 5, 2026

chatgpt-codex-connector Bot reviewed Jun 5, 2026

View reviewed changes

Comment thread desktop/frontend/src/components/Markdown.tsx Outdated

fix: pass delimiter math source to markdown components

f9f4eb6

esengine closed this Jun 6, 2026

esengine mentioned this pull request Jun 6, 2026

Codex/pr 3251 review fix #3287

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(desktop): render LLM math in chat messages#3251

fix(desktop): render LLM math in chat messages#3251
SivanCola wants to merge 2 commits into
esengine:main-v2from
SivanCola:codex/chat-markdown-rendering

SivanCola commented Jun 5, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

lightfront commented Jun 5, 2026

Uh oh!

esengine commented Jun 6, 2026

Uh oh!

lightfront commented Jun 6, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

SivanCola commented Jun 5, 2026

Summary

Testing

Notes

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

lightfront commented Jun 5, 2026

Review of PR #3251

1. The greedy walker breaks for any $X…$Y pattern (significant)

2. The classifier is too eager (smaller, but real)

3. Add an end-to-end visual regression test

Smaller items (not blockers)

Confirmed working (no action needed)

Summary

Uh oh!

esengine commented Jun 6, 2026

Uh oh!

lightfront commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

1. The greedy walker breaks for any `$X…$Y` pattern (significant)

lightfront commented Jun 6, 2026 •

edited

Loading