Skip to content

fix(desktop): render LLM math in chat messages#3251

Closed
SivanCola wants to merge 2 commits into
esengine:main-v2from
SivanCola:codex/chat-markdown-rendering
Closed

fix(desktop): render LLM math in chat messages#3251
SivanCola wants to merge 2 commits into
esengine:main-v2from
SivanCola:codex/chat-markdown-rendering

Conversation

@SivanCola

Copy link
Copy Markdown
Collaborator

Summary

  • Replace raw-string math normalization with an LLM-aware Markdown rendering pipeline for chat messages.
  • Render math-like inline/fenced code as KaTeX while preserving real code in CodeViewer/code spans.
  • Normalize confirmed LaTeX math for KaTeX compatibility and keep source/copy affordances.
  • Add golden-case tests and dedicated test typecheck config for math classification and normalization.

Testing

  • pnpm run build
  • pnpm run test:all
  • pnpm run test:todo-visibility

Notes

  • Removed unused remark-math / rehype-katex dependencies from the desktop frontend.
  • Kept the latest main-v2 deferred Markdown rendering behavior via useDeferredValue.

@SivanCola SivanCola requested a review from esengine as a code owner June 5, 2026 11:23
@github-actions github-actions Bot added the v2 Go rewrite (1.x) — main-v2 branch, active development label Jun 5, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4f10e815e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread desktop/frontend/src/components/Markdown.tsx Outdated
@lightfront

Copy link
Copy Markdown
Contributor

Review of PR #3251

Both reported bugs are fixed — I verified end-to-end with the chiral-decomposition example and the \|x\| case. 17KB of clean KaTeX HTML, all four \underbrace braces present, zero katex-error spans. The custom remark-plugin + components architecture is a reasonable refactor.

I have three substantive concerns, one of them significant. Then a few smaller items.

1. The greedy walker breaks for any $X…$Y pattern (significant)

The walker in Markdown.tsx finds the first inner $ as the closer. For any string that has a $X followed later by a $Y (with the Y not closing a complete pair), the walker takes X… (everything up to that first inner $) as the math span and leaves $Y orphaned as plain text.

Real inputs that hit this:

Input Walker result Visual
$X$Y math X + plain Y *X*Y
$5$6 text 5 (number reject) + plain 6 56 (works by accident)
$5 and $6 math 5 and + plain 6 *5 and*6
$5 or $6 math 5 or + plain 6 *5 or*6
$PATH=$PATH math PATH= + plain PATH *PATH=*PATH
$PATH and $HOME math PATH and + plain HOME *PATH and*HOME
key=$a : $b$ math a : + plain b key=*a :*b
$5 = $6 math 5 = + plain 6 *5 =*6
cost $5 and a$6 text 5 and a (rejected) plain ✓ (works by accident)

The classifier's letter-space-letter heuristic ([A-Za-z]\s+[A-Za-z]) catches some cases accidentally — $5 and a$6 works because d space a matches. But most cases in the table are broken because the heuristic only fires when the prose between the tokens has a letter-space-letter sequence, which isn't true for 5 or, 5 and, 5 =, etc.

Suggested fix (in findDollarClose): only form a math pair if the content between the $s has a strong math signal (any of: \command, ^, _, |, {, }, named math function). If the content is ambiguous, look for a later $ on the same line. If none works, bail out (return -1) and let the whole span be plain text.

A sketch:

function findDollarClose(s: string, start: number): number {
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\n") return -1;
    if (s[i] === "\\") { i += 1; continue; }
    if (s[i] === "$" && !isMathEscaped(s, i)) {
      const content = s.slice(start, i);
      // If the content has no strong math signal, this $ might not be
      // a math closer. Look for a later $ on the same line.
      if (!hasStrongMathSignal(content) && s.indexOf("$", i + 1) !== -1) {
        continue;
      }
      return i;
    }
  }
  return -1;
}

function hasStrongMathSignal(content: string): boolean {
  return /\\[A-Za-z]+\b/.test(content)      // \alpha, \frac, ...
      || /[\^_{}|]/.test(content)           // x^2, x_i, |x|, {a,b}
      || /\b(?:alpha|beta|gamma|sum|int|prod|lim|infty|sqrt|frac|sin|cos|tan|log|ln|max|min|partial|nabla|left|right)\b/.test(content);
}

For inputs without strong math signals, the walker looks for a later $ on the same line. If it finds one, it tries the longer span. If not, it bails. The net effect: $X…$Y patterns where the inner content doesn't look like math are left as plain text, and only clean math pairs ($E=mc^2$, $\alpha$, $x^2$) get the math treatment.

The alternative is to delegate to remark-math's built-in $...$ parser, which already handles these cases correctly with a battle-tested currency guard and span rules. That's a bigger change but eliminates the custom-walker class of bugs entirely.

2. The classifier is too eager (smaller, but real)

In isLikelyInlineMath (Markdown.tsx), the final return true means any $...$ span that doesn't match an explicit reject pattern is treated as math. This causes regressions for paired-dollar non-math content:

  • $PATH$ → math italic (env var)
  • $USER$, $HOME$ → math italic
  • $v1$, $v2$ → math italic (version)
  • $PATH=$PATH$ → first half math, second half plain (the walker bug above compounds this)

In the old code these were left as plain text.

A simple "flip the default to false" is not the right fix — that would break the legitimate $x$ → math case. The classifier needs an explicit short-math-identifier allow-list to catch single-letter variables like x, y, n, Re, det, etc.

Suggested fix: make isLikelyInlineMath follow the same structure as isInlineMathLike in CodeOrMath.tsx (which the same PR gets right — explicit math signals, explicit code denylist, conservative default):

const SHORT_MATH_IDS = new Set([
  "x","y","z","n","m","k","p","q","r","s","t","u","v","w",
  "a","b","c","d","e","f","g","h","i","j","l","o",
  "Re","Im","det","tr","arg","dim","ker","vec","mat",
]);

const NON_MATH_PATTERNS = [
  /^[A-Z]{2,}$/,              // ALL_CAPS env vars
  /^v?\d+(\.\d+)*[a-z]?$/i,  // version strings
  /^[A-Z]:[/\\]/,              // Windows paths
  /^(?:npm|git|docker|ssh|curl|make|cmake)\s/i, // shell commands
];

function isLikelyInlineMath(math: string): boolean {
  if (!math || math !== math.trim() || math.includes("\n")) return false;
  if (math.includes("://") || math.includes("](")) return false;
  if (/^\$?\d+(?:\.\d+)?%?$/.test(math)) return false;
  for (const re of NON_MATH_PATTERNS) if (re.test(math)) return false;
  if (/\\[A-Za-z]+\b/.test(math)) return true;        // \alpha, \frac, ...
  if (/[\^_{}|]/.test(math)) return true;            // x^2, x_i, |x|, {a,b}
  if (/\b(?:alpha|beta|gamma|sum|int|prod|lim|infty|sqrt|frac|sin|cos|tan|log|ln|max|min|partial|nabla|left|right)\b/.test(math)) return true;
  if (SHORT_MATH_IDS.has(math)) return true;          // x, y, n, Re, det, ...
  if (/[A-Za-z]\s+[A-Za-z]/.test(math)) return false; // prose with spaces
  return false;                                        // default: not math
}

This is the same fix structure as isInlineMathLike (which the PR already gets right). The two classifiers should arguably share a single classifyMathContent function — they answer the same question with different rules, and that's a maintenance liability.

Input Current After fix
$x$ math ✓ math ✓
$E=mc^2$ math ✓ math ✓
$\alpha$ math ✓ math ✓
$PATH$ math ✗ text ✓
$USER$, $HOME$ math ✗ text ✓
$v1$, $v2$ math ✗ text ✓
$5$ text ✓ text ✓

3. Add an end-to-end visual regression test

The current test file (src/__tests__/math-golden.test.ts) only checks the strings produced by latexNormalizeForKatex and the boolean classifiers. It does not run those strings through katex.renderToString to verify they actually produce valid output. This means:

  • A future change to latexNormalizeForKatex could silently break the KaTeX render and every test would still pass.
  • The original \| → "italic vert" bug would not be caught by the new test suite.

Please add at least one test that runs the canonical chiral decomposition through the full pipeline:

test("chiral decomposition renders end-to-end", () => {
  const src = String.raw`U(3)_L \times U(3)_R \;=\; \underbrace{U(1)_V}_{\text{baryon #}}\times \underbrace{SU(3)_V}_{\text{isospin, SU(3) flavor}} \;\times\; \underbrace{U(1)_A}_{\text{axial}}\times \underbrace{SU(3)_A}_{\text{chiral }SU(3)}`;
  const html = katex.renderToString(latexNormalizeForKatex(src), {
    throwOnError: true,
    displayMode: true,
  });
  assert(!html.includes("katex-error"));
  for (const label of ["baryon", "isospin", "axial", "chiral"]) {
    assert(html.includes(label), `expected label "${label}" in rendered output`);
  }
});

test("\\| is preserved as double-bar norm", () => {
  const html = katex.renderToString(latexNormalizeForKatex(String.raw`\|x\|`), {
    throwOnError: true,
  });
  const text = html.replace(/<[^>]+>/g, "").trim();
  assert(text.includes("∥"), "expected double-bar norm glyph in output");
  assert(!text.toLowerCase().includes("vert"), "rendered output should not contain literal 'vert'");
});

Both bugs you fixed would have been caught by tests like these.

Smaller items (not blockers)

4. latexNormalizeForKatex always emits a trailing space after \vert

if (source[i] === "|") { out += "\\vert "; i += 1; continue; }

For A | B this becomes \vert B (double space). Visually identical, but ugly in the source. One-line fix: make the space conditional on the next character not being whitespace, or drop it entirely.

5. CSS adds 66 lines of .inline-math__source / .block-math styling

The hover-source popup and copy button are nice UX features, but they're a feature add bundled with a bug fix. Consider splitting into a separate PR so the bug fix is easier to backport and review.

6. The two commits could be squashed

The first commit (f4f10e8) introduces a small bug — inlineMathNode and blockMathNode create hProperties: {} but the components read value from the props, so the first commit's components receive undefined as source. The second commit (f9f4eb6) fixes it by passing { value } in hProperties. Consider squashing so git log doesn't carry a known-broken intermediate state.

Confirmed working (no action needed)

  • Both reported bugs are fixed end-to-end.
  • stripMathDelimiters handles all four delimiter pairs.
  • latexNormalizeForKatex correctly preserves \text{already\_escaped} and \tfrac{a}{b} (nested braces) without double-escaping.
  • isInlineMathLike in CodeOrMath.tsx is well-designed; false-positive risk for backtick-wrapped content is low.
  • The bundle gets smaller (drops rehype-katex and remark-math from prod deps).
  • looksLikeFormula is conservative for multi-line content and correctly requires an explicit LaTeX command for the single-line case.
  • The custom components (InlineMath, BlockMath) enable nice UX features that the old rehype-katex pipeline couldn't (hover-to-show-source, copy buttons, custom error fallback).

Summary

The two must-fixes are genuinely important — #1 (the walker bug) is a visible regression on a whole class of inputs that previously worked, and #2 (the classifier default) is the same class of bug, smaller in scope. Item #3 (the missing test) means a future refactor could silently re-introduce the original bugs.

If you push back on any of these I'd be curious to hear the reasoning. In particular, if you think the custom walker is the right design and you want to keep it, the fix in #1 is the minimum needed. If you'd rather not maintain a custom walker at all, the alternative is to keep remark-math + rehype-katex and just fix the two bugs with a small latexNormalize.ts helper (which is roughly what my fix did in ~80 lines).


That's the focused review. Three things to address before merge, in priority order: the walker bug (item 1, biggest impact), the classifier default (item 2, narrower), the missing end-to-end test (item 3, prevents future regressions). The rest is polish.

If you want, I can also write up a minimal counter-proposal — a latexNormalize.ts + tests only, ~80 lines, that fixes both reported bugs without rewiring the walker or replacing remark-math. The developer can pick between that and the PR based on their priorities.

@esengine

esengine commented Jun 6, 2026

Copy link
Copy Markdown
Owner

Thanks for this, and for the careful math-rendering work (the golden tests are nice). Since this PR was opened, main-v2 has landed math rendering directly via the remark-math + rehype-katex plugins in Markdown.tsx, so the core feature is already in place. The custom BlockMath/InlineMath/latexNormalize approach in this PR now overlaps with and competes against that. Closing as superseded. If there is a concrete gap, for example support for non-standard (...) / [...] delimiters that remark-math does not handle by default, please open a small focused PR that augments the existing plugin pipeline. Thank you!

@esengine esengine closed this Jun 6, 2026
@lightfront

lightfront commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

I believe this is a misunderstanding. This PR IS for main-v2. The current rendering through the remark-math + rehype-katex plugins in Markdown.tsx in main-v2 has too many edge cases inappropriately treated or failed to be rendered at all. This PR vastly improved the treatment of these edge cases and thus is absolutely necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v2 Go rewrite (1.x) — main-v2 branch, active development

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants