fix(desktop): render LLM math in chat messages#3251
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f4f10e815e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Review of PR #3251Both reported bugs are fixed — I verified end-to-end with the chiral-decomposition example and the I have three substantive concerns, one of them significant. Then a few smaller items. 1. The greedy walker breaks for any
|
| Input | Walker result | Visual |
|---|---|---|
$X$Y |
math X + plain Y |
*X*Y ✗ |
$5$6 |
text 5 (number reject) + plain 6 |
56 (works by accident) |
$5 and $6 |
math 5 and + plain 6 |
*5 and*6 ✗ |
$5 or $6 |
math 5 or + plain 6 |
*5 or*6 ✗ |
$PATH=$PATH |
math PATH= + plain PATH |
*PATH=*PATH ✗ |
$PATH and $HOME |
math PATH and + plain HOME |
*PATH and*HOME ✗ |
key=$a : $b$ |
math a : + plain b |
key=*a :*b ✗ |
$5 = $6 |
math 5 = + plain 6 |
*5 =*6 ✗ |
cost $5 and a$6 |
text 5 and a (rejected) |
plain ✓ (works by accident) |
The classifier's letter-space-letter heuristic ([A-Za-z]\s+[A-Za-z]) catches some cases accidentally — $5 and a$6 works because d space a matches. But most cases in the table are broken because the heuristic only fires when the prose between the tokens has a letter-space-letter sequence, which isn't true for 5 or, 5 and, 5 =, etc.
Suggested fix (in findDollarClose): only form a math pair if the content between the $s has a strong math signal (any of: \command, ^, _, |, {, }, named math function). If the content is ambiguous, look for a later $ on the same line. If none works, bail out (return -1) and let the whole span be plain text.
A sketch:
function findDollarClose(s: string, start: number): number {
for (let i = start; i < s.length; i++) {
if (s[i] === "\n") return -1;
if (s[i] === "\\") { i += 1; continue; }
if (s[i] === "$" && !isMathEscaped(s, i)) {
const content = s.slice(start, i);
// If the content has no strong math signal, this $ might not be
// a math closer. Look for a later $ on the same line.
if (!hasStrongMathSignal(content) && s.indexOf("$", i + 1) !== -1) {
continue;
}
return i;
}
}
return -1;
}
function hasStrongMathSignal(content: string): boolean {
return /\\[A-Za-z]+\b/.test(content) // \alpha, \frac, ...
|| /[\^_{}|]/.test(content) // x^2, x_i, |x|, {a,b}
|| /\b(?:alpha|beta|gamma|sum|int|prod|lim|infty|sqrt|frac|sin|cos|tan|log|ln|max|min|partial|nabla|left|right)\b/.test(content);
}For inputs without strong math signals, the walker looks for a later $ on the same line. If it finds one, it tries the longer span. If not, it bails. The net effect: $X…$Y patterns where the inner content doesn't look like math are left as plain text, and only clean math pairs ($E=mc^2$, $\alpha$, $x^2$) get the math treatment.
The alternative is to delegate to remark-math's built-in $...$ parser, which already handles these cases correctly with a battle-tested currency guard and span rules. That's a bigger change but eliminates the custom-walker class of bugs entirely.
2. The classifier is too eager (smaller, but real)
In isLikelyInlineMath (Markdown.tsx), the final return true means any $...$ span that doesn't match an explicit reject pattern is treated as math. This causes regressions for paired-dollar non-math content:
$PATH$→ math italic (env var)$USER$,$HOME$→ math italic$v1$,$v2$→ math italic (version)$PATH=$PATH$→ first half math, second half plain (the walker bug above compounds this)
In the old code these were left as plain text.
A simple "flip the default to false" is not the right fix — that would break the legitimate $x$ → math case. The classifier needs an explicit short-math-identifier allow-list to catch single-letter variables like x, y, n, Re, det, etc.
Suggested fix: make isLikelyInlineMath follow the same structure as isInlineMathLike in CodeOrMath.tsx (which the same PR gets right — explicit math signals, explicit code denylist, conservative default):
const SHORT_MATH_IDS = new Set([
"x","y","z","n","m","k","p","q","r","s","t","u","v","w",
"a","b","c","d","e","f","g","h","i","j","l","o",
"Re","Im","det","tr","arg","dim","ker","vec","mat",
]);
const NON_MATH_PATTERNS = [
/^[A-Z]{2,}$/, // ALL_CAPS env vars
/^v?\d+(\.\d+)*[a-z]?$/i, // version strings
/^[A-Z]:[/\\]/, // Windows paths
/^(?:npm|git|docker|ssh|curl|make|cmake)\s/i, // shell commands
];
function isLikelyInlineMath(math: string): boolean {
if (!math || math !== math.trim() || math.includes("\n")) return false;
if (math.includes("://") || math.includes("](")) return false;
if (/^\$?\d+(?:\.\d+)?%?$/.test(math)) return false;
for (const re of NON_MATH_PATTERNS) if (re.test(math)) return false;
if (/\\[A-Za-z]+\b/.test(math)) return true; // \alpha, \frac, ...
if (/[\^_{}|]/.test(math)) return true; // x^2, x_i, |x|, {a,b}
if (/\b(?:alpha|beta|gamma|sum|int|prod|lim|infty|sqrt|frac|sin|cos|tan|log|ln|max|min|partial|nabla|left|right)\b/.test(math)) return true;
if (SHORT_MATH_IDS.has(math)) return true; // x, y, n, Re, det, ...
if (/[A-Za-z]\s+[A-Za-z]/.test(math)) return false; // prose with spaces
return false; // default: not math
}This is the same fix structure as isInlineMathLike (which the PR already gets right). The two classifiers should arguably share a single classifyMathContent function — they answer the same question with different rules, and that's a maintenance liability.
| Input | Current | After fix |
|---|---|---|
$x$ |
math ✓ | math ✓ |
$E=mc^2$ |
math ✓ | math ✓ |
$\alpha$ |
math ✓ | math ✓ |
$PATH$ |
math ✗ | text ✓ |
$USER$, $HOME$ |
math ✗ | text ✓ |
$v1$, $v2$ |
math ✗ | text ✓ |
$5$ |
text ✓ | text ✓ |
3. Add an end-to-end visual regression test
The current test file (src/__tests__/math-golden.test.ts) only checks the strings produced by latexNormalizeForKatex and the boolean classifiers. It does not run those strings through katex.renderToString to verify they actually produce valid output. This means:
- A future change to
latexNormalizeForKatexcould silently break the KaTeX render and every test would still pass. - The original
\|→ "italic vert" bug would not be caught by the new test suite.
Please add at least one test that runs the canonical chiral decomposition through the full pipeline:
test("chiral decomposition renders end-to-end", () => {
const src = String.raw`U(3)_L \times U(3)_R \;=\; \underbrace{U(1)_V}_{\text{baryon #}}\times \underbrace{SU(3)_V}_{\text{isospin, SU(3) flavor}} \;\times\; \underbrace{U(1)_A}_{\text{axial}}\times \underbrace{SU(3)_A}_{\text{chiral }SU(3)}`;
const html = katex.renderToString(latexNormalizeForKatex(src), {
throwOnError: true,
displayMode: true,
});
assert(!html.includes("katex-error"));
for (const label of ["baryon", "isospin", "axial", "chiral"]) {
assert(html.includes(label), `expected label "${label}" in rendered output`);
}
});
test("\\| is preserved as double-bar norm", () => {
const html = katex.renderToString(latexNormalizeForKatex(String.raw`\|x\|`), {
throwOnError: true,
});
const text = html.replace(/<[^>]+>/g, "").trim();
assert(text.includes("∥"), "expected double-bar norm glyph in output");
assert(!text.toLowerCase().includes("vert"), "rendered output should not contain literal 'vert'");
});Both bugs you fixed would have been caught by tests like these.
Smaller items (not blockers)
4. latexNormalizeForKatex always emits a trailing space after \vert
if (source[i] === "|") { out += "\\vert "; i += 1; continue; }For A | B this becomes \vert B (double space). Visually identical, but ugly in the source. One-line fix: make the space conditional on the next character not being whitespace, or drop it entirely.
5. CSS adds 66 lines of .inline-math__source / .block-math styling
The hover-source popup and copy button are nice UX features, but they're a feature add bundled with a bug fix. Consider splitting into a separate PR so the bug fix is easier to backport and review.
6. The two commits could be squashed
The first commit (f4f10e8) introduces a small bug — inlineMathNode and blockMathNode create hProperties: {} but the components read value from the props, so the first commit's components receive undefined as source. The second commit (f9f4eb6) fixes it by passing { value } in hProperties. Consider squashing so git log doesn't carry a known-broken intermediate state.
Confirmed working (no action needed)
- Both reported bugs are fixed end-to-end.
stripMathDelimitershandles all four delimiter pairs.latexNormalizeForKatexcorrectly preserves\text{already\_escaped}and\tfrac{a}{b}(nested braces) without double-escaping.isInlineMathLikeinCodeOrMath.tsxis well-designed; false-positive risk for backtick-wrapped content is low.- The bundle gets smaller (drops
rehype-katexandremark-mathfrom prod deps). looksLikeFormulais conservative for multi-line content and correctly requires an explicit LaTeX command for the single-line case.- The custom components (InlineMath, BlockMath) enable nice UX features that the old rehype-katex pipeline couldn't (hover-to-show-source, copy buttons, custom error fallback).
Summary
The two must-fixes are genuinely important — #1 (the walker bug) is a visible regression on a whole class of inputs that previously worked, and #2 (the classifier default) is the same class of bug, smaller in scope. Item #3 (the missing test) means a future refactor could silently re-introduce the original bugs.
If you push back on any of these I'd be curious to hear the reasoning. In particular, if you think the custom walker is the right design and you want to keep it, the fix in #1 is the minimum needed. If you'd rather not maintain a custom walker at all, the alternative is to keep remark-math + rehype-katex and just fix the two bugs with a small latexNormalize.ts helper (which is roughly what my fix did in ~80 lines).
That's the focused review. Three things to address before merge, in priority order: the walker bug (item 1, biggest impact), the classifier default (item 2, narrower), the missing end-to-end test (item 3, prevents future regressions). The rest is polish.
If you want, I can also write up a minimal counter-proposal — a latexNormalize.ts + tests only, ~80 lines, that fixes both reported bugs without rewiring the walker or replacing remark-math. The developer can pick between that and the PR based on their priorities.
|
Thanks for this, and for the careful math-rendering work (the golden tests are nice). Since this PR was opened, main-v2 has landed math rendering directly via the remark-math + rehype-katex plugins in Markdown.tsx, so the core feature is already in place. The custom BlockMath/InlineMath/latexNormalize approach in this PR now overlaps with and competes against that. Closing as superseded. If there is a concrete gap, for example support for non-standard (...) / [...] delimiters that remark-math does not handle by default, please open a small focused PR that augments the existing plugin pipeline. Thank you! |
|
I believe this is a misunderstanding. This PR IS for main-v2. The current rendering through the remark-math + rehype-katex plugins in Markdown.tsx in main-v2 has too many edge cases inappropriately treated or failed to be rendered at all. This PR vastly improved the treatment of these edge cases and thus is absolutely necessary. |
Summary
Testing
pnpm run buildpnpm run test:allpnpm run test:todo-visibilityNotes
remark-math/rehype-katexdependencies from the desktop frontend.main-v2deferred Markdown rendering behavior viauseDeferredValue.