Add `SymbolElem` to improve math character styling and fix `$"a"$` vs. `$a$` by wrzian · Pull Request #5738 · typst/typst

wrzian · 2025-01-23T03:49:48Z

This PR is a compromise to close #274 without doing a full overhaul of how Typst handles math text. That overhaul is still needed, but will happen after a design is ready, which will take some time. This path was agreed on in the call that closed #4638 so that we can get some pragmatic fixes in even if they're temporary.

However this still required many changes, as SymbolElem splits all textual content into the two options of itself or TextElem. Getting all the tests to pass was rough, but they now do!

I've taken care to separate the changes into discrete chunks, and the test suite should pass on each of the commits.

(random fact: this PR also puts us over 2500 tests in the test suite 🎉)

Commits

1. MathText

Adds the new syntaxkind for math text. I've chosen to use an enum to represent the difference between numbers and characters in the AST.

2. SymbolElem

I added the initial SymbolElem with small hacks so it converts into a TextElem before any other realization (which are reverted in the two layout commits).

I kept SymbolElem separate from the Symbol value to make this PR easier, this required putting it in the outer symbols.rs file to avoid duplicate names (Repr). I don't think this is really the right location, but it's the easiest one for now.

I'm not sure what we want for the documentation of SymbolElem, I feel like it should be more of a hidden element that isn't exposed to the user for now, and the value type Symbol should be the main visible API.

3. Using SymbolElem

Here we make use of symbol elem in all the places it should be and I also update the casting rules for char so that symbols can be used in expected locations, such as matrix delimiters.

Notably, the evaluation rules for shorthands and escapes now generate content instead of values.

4. Realization and Markup layout

Here we undo the realization hack and start to layout symbol elems in markup mode.

This also includes a change to textual show rules: if you show a SymbolElem, it will act like a regex rule, but the synthesized text element that we pass into regex rule recipes has to be updated. If we just pass a TextElem to recipes for all regex rules, then show rules on characters in math will lose their italic styling.

Example:

#show regex("x"): set text(fill: red)
This $"x"$ should be upright, but this $x$ should be italic.

We now pass either a SymbolElem or a TextElem to textual show rule recipes based on these criteria:

If the selector was a symbol, pass a SymbolElem
If a text or regex selector matched a lone SymbolElem, pass a SymbolElem
Otherwise, pass a TextElem

I think this is the most intuitive solution, but it does mean that show rules on text can no longer rely on comparing against the text element function. The content-fields-complex test exposes this difference and needed to be updated for this.

5. Math layout

Here we no longer apply auto-italics to TextElems, but still do for SymbolElem. That should be the only real change, but I also refactored most of the functions in text.rs in an attempt to improve readability. I'm not sure that I really succeeded, let me know if you have suggestions.

One particular edge case was that the stretch_fragment layout function in stretch.rs no longer works on character strings, because they now generate FrameFragments instead of GlyphFragments. I spent a while trying to find a solution. But I couldn't make it work without changing some aspect of the layout.

Luckily the workaround is easy: create a SymbolElem by calling symbol("X"), writing a character escape, or writing the character raw in math. This is the reason for the changes in the math-lr-color and math-stretch-shorthand tests.

Review

I've done a first pass review myself, but still left a few questions in the code as TODO comments that I would appreciate feedback on. I'll try to be responsive the next few days so we can get this merged.

Let's put #274 to rest!

laurmaedje · 2025-01-23T15:53:35Z

Thank you for implementing this! 🎉

Notes

I chose the review style of refactoring your PR myself to see what I think should be kept/changed. You can view the end result of it in this branch. (It's all squashed since I'm no jj voodoo master 😅.) Here are the things I think should be changed:

In non-math realizations, the SymbolElem should be eagerly transformed into TextElem in the else branch of visit_math_rules. This way, we
- can skip much of the regex matching complexity
- don't need any special handling of symbols in inline layout
- can remove SymbolElem from the grouping rules
I'm not really comfortable with adding a symbol selector. But I think we don't actually need it! If we transform SymbolElem to TextElem eagerly in non-math as mentioned above, the only way to regex match a SymbolElem is from find_regex_match_in_text directly on a SymbolElem. For that, your lone symbol check should suffice. And we can also remove the TextualKind stuff. On top, I suggest that we call the field of SymbolElem text so that we have minimal breakage in existing text show rules that will now get a symbol. Since SymbolElem is temporary, having the cleanest name doesn't matter much.
I'm not comfortable with a generic Content -> char cast, since it is a fundamentally lossy operation. The existing thing in AccentElem is a hack and I would prefer for it to stay isolated to it. If we just let escapes and shorthands stay Symbol instead of SymbolElem (keeping the return type of the Evals at Value rather than Content), from my testing, we don't need this change to fix Single letter strings get parsed as italicized in math mode #274.
You had to update issue-math-realize-scripting because of missing spaces. The reason for that is that joining content and symbol still turned the symbol into a TextElem. Can be fixed by updating fn join in foundations/ops.rs.
The SymbolElem can go into foundations/symbol.rs and we just spell out crate::foundations::Repr in the impl.
In two places &c.to_string() can be replaced by c.encode_utf8(&mut [0; 4]) to skip an allocation.
The refactor in layout/math/text.rs looks fine!
I think the intent of the content-fields-complex isn't really kept since it tries to evaluate a normal-looking equation, but now there are various [..] thingies. If we switch the field of SymbolElem to text, we can replace func == text with elem.has("text") to fix it completely within let compute (alongside two of your changes)

As said, the branch linked above contains the result I ended with, containing all these changes. If you disagree with some of my proposed changes, then I'm curious to hear your thoughts. If not, I don't mind whether one of us pushes the squashed changes to this PR or whether you incorporate them into your cleaner commit history. I assume the individual commits were mostly for review convenience, so maybe it's unnecessary work.

Responses

I'm not sure what we want for the documentation of SymbolElem, I feel like it should be more of a hidden element that isn't exposed to the user for now, and the value type Symbol should be the main visible API.

No documentation is necessary. Even though SymbolElem is observable in a few cases, let's consider it internal for now. (SequenceElem is way more important and also not yet documented.)

TODO: Can we get away with not checking the RTL/LTR direction?

Not relevant if we transform to TextElem eagerly.

TODO: Update traversal with std::ops::ControlFlow to exit early.

Seems like a nice change in general, but unrelated to the PR. :)

TODO: Should we try to improve these error messages? [In cast! for char]

I'm not sure what change you are envisioning, but also probably not tied to this PR.

TODO: Not sure if this should change. [In Repr impl]

Having SymbolElem's repr be just like TextElem's is good for now.

wrzian · 2025-01-23T21:41:43Z

Thanks! I've incorporated all of your changes 👍

Can be fixed by updating fn join in foundations/ops.rs.

Aha!

&c.to_string() can be replaced by c.encode_utf8(&mut [0; 4]) to skip an allocation.

I thought there should be a way to do this! Couldn't come up with the right incantation 😁

I rebased onto main and used jj squash <path> --to <commit> to put your changes into the right commits (after making a new commit matching your files; jj new, jj restore --from <commit>).

I did make some changes, but only to comments! You can see them on this commit in my repository (which matches the rebased update exactly). I mostly added more context or fixed typos.

If those look good, I'm ready to merge.

…. `$a$` (#5738)

laurmaedje · 2025-01-23T22:13:40Z

Fast-forward merged locally to keep the good commits. :)

Thank you!!

gasche · 2025-01-24T05:23:18Z

As a user affected by #274: very nice, thanks!

Typst v0.13 slightly reduced the widths of single alphabetic characters in attachments and fractions. In older versions of Typst the width would change between a single character and multiple characters. The current behavior is therefore more consistent! The other cases have a slightly reduced padding between the base unit and the exponent. This is however barely noticeable compared to the first change. I think these changes are related to typst/typst#274 and typst/typst#5738. While the output will be slightly different in older versions of Typst, this is overall not a huge issue.

qguv · 2025-05-06T15:17:41Z

Hi @laurmaedje @wrzian! The description of this PR hints at an upcoming overhaul of how Typst handles math text. I couldn't find another obvious place where ideas are being collected for that overhaul, so I'll include a small note here. If there's a more appropriate place to put this, could you direct me to it? Thanks :)

The current math mode behavior of removing spaces between consecutive variables makes typesetting expressions difficult in contexts where juxtaposition indicates some common operation, e.g. function application in lambda calculus or logic, or multiplication in arithmetic. The current workarounds are quite cumbersome, so it would be really nice if the math text overhaul allowed this space-removal between consecutive variables to be turned on or off with a #set rule. Alternatively, if the new math mode is able to provide a type for variables in math mode (e.g. the type of $a$ and $b$ in the expression $a b$) which could be targeted by a #show rule, then the style could be adjusted, solving the problem. I think it would be wise to think about this already at the design stage, as it might be more difficult to fix if the new design calcifies the current behavior.

laurmaedje · 2025-05-06T15:37:35Z

If there's a more appropriate place to put this, could you direct me to it?

Much of the development discussions happen on Discord. There's a dedicated thread for math there.

gasche · 2025-05-06T15:47:46Z

(I think it's fairly unfortunate that many of the design discussions around Typst are enclosed on Discord. Various people cannot access it easily, and it is not easily searchable by web engines, does not have public archives, etc. This trove of content is going to be lost in a few years when Discord enshittifies in response to the next popular chat app, and the early history of Typst is going to be basically lost.)

wrzian mentioned this pull request Jan 23, 2025

Update traversal with std::ops::ControlFlow to exit early. #5743

Closed

wrzian added 5 commits January 23, 2025 16:28

Just add MathText SyntaxKind

0b8b7d0

Basic SymbolElem addition

c47b71b

Use SymbolElem in more places and add char cast for content

fecdc39

Add SymbolElem to realization

7838da0

Update math TextElem layout to separate out SymbolElem

6fe1e20

wrzian force-pushed the symbolelem branch from 18bcb9d to 6fe1e20 Compare January 23, 2025 21:30

laurmaedje added this pull request to the merge queue Jan 23, 2025

github-merge-queue bot pushed a commit that referenced this pull request Jan 23, 2025

Add SymbolElem to improve math character styling and fix $"a"$ vs…

57c86fb

…. `$a$` (#5738)

laurmaedje removed this pull request from the merge queue due to a manual request Jan 23, 2025

laurmaedje merged commit 6fe1e20 into typst:main Jan 23, 2025
6 checks passed

wrzian deleted the symbolelem branch January 24, 2025 00:40

laurmaedje mentioned this pull request Feb 22, 2025

Can no longer attach label to symbols #5930

Closed

1 task

YDX-2147483647 mentioned this pull request Feb 23, 2025

Differentials is broken Leedehai/typst-physics#45

Closed

laurmaedje mentioned this pull request Mar 6, 2025

Hotfix for labels on symbols #6015

Merged

MDLC01 mentioned this pull request Mar 23, 2025

math.lr function has no effect when passed a string #6085

Closed

1 task

wrzian mentioned this pull request May 22, 2025

Script-style is not applied to numbers consistently in math #6318

Closed

1 task

mkorje mentioned this pull request Jun 3, 2025

Strikethrough doesn't apply to single characters in math equation #1263

Closed

1 task

ParrotsByte mentioned this pull request Oct 30, 2025

Single-letter strings inside math mode are not sensitive to spaces around them #7270

Closed

1 task

isuffix mentioned this pull request Oct 31, 2025

Always treat TextElem fragments as spaced in math #7276

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add `SymbolElem` to improve math character styling and fix $"a"$ vs. $a$ #5738

Add `SymbolElem` to improve math character styling and fix $"a"$ vs. $a$ #5738
laurmaedje merged 5 commits intotypst:mainfrom
wrzian:symbolelem

wrzian commented Jan 23, 2025 •

edited

Loading

Uh oh!

laurmaedje commented Jan 23, 2025

Uh oh!

wrzian commented Jan 23, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

laurmaedje commented Jan 23, 2025

Uh oh!

gasche commented Jan 24, 2025

Uh oh!

qguv commented May 6, 2025

Uh oh!

laurmaedje commented May 6, 2025

Uh oh!

gasche commented May 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

wrzian commented Jan 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Commits

1. MathText

2. SymbolElem

3. Using SymbolElem

4. Realization and Markup layout

5. Math layout

Review

Uh oh!

laurmaedje commented Jan 23, 2025

Notes

Responses

Uh oh!

wrzian commented Jan 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

laurmaedje commented Jan 23, 2025

Uh oh!

gasche commented Jan 24, 2025

Uh oh!

qguv commented May 6, 2025

Uh oh!

laurmaedje commented May 6, 2025

Uh oh!

gasche commented May 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

wrzian commented Jan 23, 2025 •

edited

Loading

wrzian commented Jan 23, 2025 •

edited

Loading