Allow equation and tight lists/enums to be part of paragraphs#7931

Open
mkorje wants to merge 4 commits into typst:main from mkorje:par-math

Conversation

@mkorje
Collaborator

@mkorje mkorje commented Mar 2, 2026

Closes #3206.

This PR adds a new internal field inlinable to block(), which allows blocks that are next to paragraph-level content (and other inlinable blocks) to become part of the paragraph. Block equations and tight lists/enums now have this property.

The field is being kept internal for now as the design is still undecided and subject to change (if it is to be exposed at all).

The implementation is almost complete and is working, but there are some things that need discussion still - to be updated soon™.

More details

  • The inlinable field is internal.
  • An inlinable block together with paragraph-level content joins the paragraph.
  • A single inlinable block stays block-level, whereas multiple inlinable blocks together become a paragraph.
  • The rules for tight ListElems and EnumElems now construct BlockElems with inlinable: true.
  • The rule for a block EquationElem constructs a BlockElem with inlinable: true.
  • With the way tight lists/enums work you can end up with some perhaps odd/unexpected behavior:
    Blah paragraph:
    - A
    - B
    - C
    
    // This isn't a new list and par, but makes the whole list not tight!
    // Not tight list = a paragraph, a block, and another paragraph.
    - A
    - B
    - C
    New paragraph?
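
The tight-versus-wide behavior in the last bullet can be sketched roughly as follows. This is only an illustration over raw markup lines, not Typst's parser: `list_is_tight` is a hypothetical helper, and it assumes that every line starting with `- ` is an item of the same list and that all other non-blank lines (text, comments) can be ignored.

```rust
/// Hypothetical helper: returns `true` if no blank line separates two
/// list items. A single blank line between any two items makes the
/// whole list wide (not tight).
fn list_is_tight(lines: &[&str]) -> bool {
    let mut seen_item = false;
    let mut blank_since_item = false;
    for line in lines {
        let line = line.trim();
        if line.is_empty() {
            // Blank lines only matter once the list has started.
            blank_since_item = seen_item;
        } else if line.starts_with("- ") {
            if seen_item && blank_since_item {
                return false;
            }
            seen_item = true;
            blank_since_item = false;
        }
        // Assumption: other lines (text, comments) are ignored here.
    }
    true
}
```

Under these assumptions, the first three items of the example alone form a tight list, whereas the full example (with the blank line before the second `- A`) is one wide list, which is why it lays out as a paragraph, a block, and another paragraph.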

Todos

  • Verify that pdftags test changes are okay
  • Fix issue-6242-tight-list-attach-spacing test regression
  • Add tests
  • Sort out widow/orphan need calculation in typst_layout::flow::distribute::Collector::par_children

@laurmaedje added the math and model labels and removed the math label on Mar 2, 2026
@SeSodesa
Contributor

SeSodesa commented Mar 2, 2026

Without actually reviewing the code, was the edge case of a block equation being in the middle of a paragraph handled? A block equation might end a paragraph, but the text block following a block equation might also be a part of the same paragraph. Does the single proposed new field allow differentiating between these 2 cases, or is this simply handled by the usual empty lines in markup mode? What if such a paragraph is created in code mode?

A block equation starting a paragraph is never appropriate. That case does not need to be considered as a valid structure.

@Enivex
Collaborator

Enivex commented Mar 2, 2026

A block equation starting a paragraph is never appropriate

I wouldn't make such categorical statements. There's a lot of diversity out there.

Edit: A simple example would be $ ... $ is known as the (...) equation. Or a "paragraph" consisting of a single equation and nothing more.

@mkorje
Collaborator Author

mkorje commented Mar 2, 2026

Here's a summary of the behaviour, using its hypothetical HTML equivalent to make things clear.

Note that having an equation on its own in a paragraph is not covered by this. It's just a single block then.

Typst:
    #lorem(20)

    $ x + y $

    #lorem(20)
HTML:
    <p>Lorem...</p>
    <math>...</math>
    <p>Lorem...</p>

Typst:
    #lorem(20)
    $ x + y $

    #lorem(20)
HTML:
    <p>
      Lorem...
      <math>...</math>
    </p>
    <p>Lorem...</p>

Typst:
    #lorem(20)

    $ x + y $
    #lorem(20)
HTML:
    <p>Lorem...</p>
    <p>
      <math>...</math>
      Lorem...
    </p>

Typst:
    #lorem(20)
    $ x + y $
    #lorem(20)
HTML:
    <p>
      Lorem...
      <math>...</math>
      Lorem...
    </p>

Typst:
    $ x + y $
    $ y + z $
HTML:
    <p>
      <math>...</math>
      <math>...</math>
    </p>

As for creating paragraphs in code, I think it should work the same as whatever works currently. (I need to check, as I'm not entirely sure how the paragraph grouping works in code mode...)
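
The grouping behaviour summarized above can be modelled with a small sketch. The types `Child` and `Group` and the function `group` are hypothetical names made up for this illustration; they are not the types used in the PR, and spaces, tags, and styles are left out entirely.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Child {
    Inline,    // paragraph-level content such as text
    Inlinable, // a block with `inlinable: true` (block equation, tight list)
    Block,     // any other block
    Parbreak,  // blank line in markup
}

#[derive(Debug, PartialEq)]
enum Group {
    Par(Vec<Child>),
    Block(Child),
}

/// Groups a flat sequence of children into paragraphs and blocks.
fn group(children: &[Child]) -> Vec<Group> {
    let mut out = Vec::new();
    let mut run = Vec::new();
    for &child in children {
        match child {
            Child::Inline | Child::Inlinable => run.push(child),
            Child::Parbreak => flush(&mut run, &mut out),
            Child::Block => {
                flush(&mut run, &mut out);
                out.push(Group::Block(Child::Block));
            }
        }
    }
    flush(&mut run, &mut out);
    out
}

/// Ends the current run: a lone inlinable block stays block-level,
/// anything else becomes a paragraph.
fn flush(run: &mut Vec<Child>, out: &mut Vec<Group>) {
    let items = std::mem::take(run);
    if items.is_empty() {
        return;
    }
    if items.len() == 1 && items[0] == Child::Inlinable {
        out.push(Group::Block(Child::Inlinable));
    } else {
        out.push(Group::Par(items));
    }
}
```

Feeding it `[Inline, Parbreak, Inlinable, Parbreak, Inline]` gives a paragraph, a block, and a paragraph, while `[Inlinable, Inlinable]` gives a single paragraph, matching the first and last cases above.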

@fabianbosshard

A block equation starting a paragraph is never appropriate. That case does not need to be considered as a valid structure.

In most cases that may be true, but I think it’s important to leave the choice to the user.
As a student who needs to take notes, write summaries, etc., I find myself quite often in the scenario where I start with an equation and then continue with e.g. “where A is the area of…” or something like that.

@SeSodesa
Contributor

SeSodesa commented Mar 3, 2026

As a student who needs to take notes, write summaries, etc., I find myself quite often in the scenario where I start with an equation and then continue with e.g. “where A is the area of…” or something like that.

But you then go on to re-read your notes and fix your language, especially if you intend to send them for somebody else to read, right? I would not expect that personal quick notes necessarily conform to good writing practices, so what Typst does to the tag structure there does not really matter. But if I was your supervisor or a reviewer at a journal and you handed me your thesis or an article with an equation starting a paragraph, it would be something you would have to fix in the next revision.

A better argument for allowing an equation to start a paragraph would be that the accessible PDF standards allow it (do they?), and therefore we should not apply additional restrictions.

@SeSodesa
Contributor

SeSodesa commented Mar 3, 2026

Note that having an equation on its own in a paragraph is not covered by this. It's just a single block then.

Yeah, this is semantically the correct thing to do. A paragraph by definition implies prose, possibly combined with other content. It would be misleading for screen readers if something that is not written discourse (which an equation by itself cannot be) would be marked as such.

Actually, because of the last point, even 2 equations side by side should really not generate a paragraph in my humble opinion. They would generate 2 separate block equations. Unless of course there is a "proper paragraph" attached to them.

@laurmaedje added the waiting-on-review label on Mar 3, 2026
@fabianbosshard

But you then go on to re-read your notes and fix your language, especially if you intend to send them for somebody else to read, right? I would not expect that personal quick notes necessarily conform to good writing practices, so what Typst does to the tag structure there does not really matter. But if I was your supervisor or a reviewer at a journal and you handed me your thesis or an article with an equation starting a paragraph, it would be something you would have to fix in the next revision.

@SeSodesa I agree that in most cases it doesn’t make sense to start a paragraph with an equation. However, that doesn’t mean this should be enforced by the typesetting system itself. Otherwise, we could just as well argue that integral signs should be disallowed in inline math because they’re often considered poor style.

Correct me if I'm wrong, but this seems to be not merely a question of “good” style, but also one about (future) functionality. A typesetting system should not hard-code a specific editorial philosophy or writing convention.

Moreover, as @Enivex pointed out, there are legitimate cases where an equation can start a paragraph. Style requirements can still be enforced by supervisors or journals where needed. But the tool itself shouldn’t artificially prohibit structurally valid constructs.

I actually encountered this issue while experimenting with the interaction between text and display equations. In LaTeX, display equations are placed tightly relative to surrounding text when no blank lines are present: \abovedisplayskip or \abovedisplayshortskip before the equation (depending on the preceding line length), and \belowdisplayskip or \belowdisplayshortskip after it. Only when there is a blank line, an additional \parskip is added on the corresponding side.

I imagine that the variable under discussion could be used to implement a similar spacing mechanism in Typst, making the spacing more responsive to the semantics of the surrounding content. If a paragraph begins with a display equation and is followed directly by text (without a blank line), no additional paragraph spacing (parskip) should be inserted. Instead, it should use the same vertical spacing as a display equation that is surrounded by text on both sides.

@fabianbosshard

fabianbosshard commented Mar 4, 2026

I just did some further experimenting and it seems my previous comment oversimplifies the exact mechanism in LaTeX. There is some other small space inserted in some cases, but to be honest I don't understand the exact rules applied here. Here is the experiment:

\documentclass[12pt]{article}
\begin{document}
\pagestyle{empty}



\setlength{\parindent}{0pt}
\setlength{\parskip}{0pt plus 0pt minus 0pt} 

\setlength{\abovedisplayskip}{0pt plus 0pt minus 0pt}
\setlength{\abovedisplayshortskip}{0pt plus 0pt minus 0pt}
\setlength{\belowdisplayskip}{0pt plus 0pt minus 0pt}
\setlength{\belowdisplayshortskip}{0pt plus 0pt minus 0pt}



\setlength{\topsep}{0pt plus 0pt minus 0pt}
\setlength{\partopsep}{0pt plus 0pt minus 0pt}

\setlength{\baselineskip}{0pt plus 0pt minus 0pt}


% ---------------------------------

\setlength{\parskip}{20pt plus 0pt minus 0pt}

A long long long long long long long long long line
\[
E=mc^2
\]

\vspace{1em}

\textbf{is (as expected) \emph{not} the same as}

\vspace{1em}


A long long long long long long long long long line

\[
E=mc^2
\]

\setlength{\parskip}{0pt plus 0pt minus 0pt}


\vspace{4em}
-----------------------------------
\vspace{4em}




\setlength{\baselineskip}{12pt plus 0pt minus 0pt}

A long long long long long long long long long line
\[
E=mc^2
\]

\vspace{1em}

\textbf{is also \emph{not exactly} the same as}

\vspace{1em}


A long long long long long long long long long line

\[
E=mc^2
\]



\vspace{4em}
-----------------------------------
\vspace{4em}



\setlength{\baselineskip}{11pt plus 0pt minus 0pt}


A long long long long long long long long long line
\[
E=mc^2
\]

\vspace{1em}

\textbf{now it \emph{is} the same as}

\vspace{1em}


A long long long long long long long long long line

\[
E=mc^2
\]



\end{document}

In any case, the general idea still holds: the spacing around display equations in LaTeX depends on the surrounding structure (paragraph boundaries, line lengths, spacing, etc.), which allows the layout to adapt to context. A similar mechanism in Typst - where the spacing of a display equation depends on whether it appears within a paragraph, at a paragraph boundary, or as a standalone block - could make equation spacing more semantically appropriate. Additionally, if the preceding line is short relative to the width of the equation, the spacing above the display could be reduced further, similar to how TeX uses \abovedisplayshortskip in that situation. As others (#2438) have pointed out, that would not only look nicer, but also increase available space on pages with many equations.
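
A context-dependent spacing rule of this kind could look roughly like the sketch below. It is a deliberately simplified model, not TeX's exact algorithm (which, as noted above, has further cases) and not anything Typst implements: `Skip`, `Spacing`, `choose_skip`, and `space_above` are hypothetical names, the formula is assumed to be centered, and the two-em clearance is an approximation of TeX's check against the preceding line.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Skip {
    Normal, // like \abovedisplayskip
    Short,  // like \abovedisplayshortskip
}

/// Hypothetical spacing parameters, all in the same length unit.
struct Spacing {
    above_display: f64,
    above_display_short: f64,
    par_skip: f64,
}

/// Picks the short skip when the preceding line ends (plus a clearance
/// of two ems) to the left of where the centered formula starts.
fn choose_skip(prev_line_width: f64, formula_width: f64, line_width: f64, em: f64) -> Skip {
    let formula_start = (line_width - formula_width) / 2.0;
    if prev_line_width + 2.0 * em < formula_start {
        Skip::Short
    } else {
        Skip::Normal
    }
}

/// Space above the equation: the chosen display skip, plus the paragraph
/// skip only if a blank line separates the equation from the text above.
fn space_above(spacing: &Spacing, skip: Skip, blank_line_above: bool) -> f64 {
    let display = match skip {
        Skip::Normal => spacing.above_display,
        Skip::Short => spacing.above_display_short,
    };
    if blank_line_above {
        display + spacing.par_skip
    } else {
        display
    }
}
```

With a 400-unit line, a 100-unit formula, and a 10-unit em, a preceding line of width 50 would select the short skip, while one of width 300 would select the normal skip.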

@uwni
Contributor

uwni commented Mar 19, 2026

I haven't reviewed the implementation yet. May I ask whether this field is added to all block elements? I think it would be helpful if any block could be part of a paragraph, for example when you want to prevent the paragraph following a block from getting a first-line indent (which means the block is semantically part of one complete paragraph).

Collaborator Author

@mkorje mkorje Mar 31, 2026

#5510 (fixed by #6242) has regressed.

Comment on lines +518 to +524
Item::Block(elem, styles) => {
    // We need owned values as the `ParChild::Block` created
    // will outlive inline layout.
    block_elem = Some((*elem).clone());
    block_styles = Some(styles.to_map());
    block_idx = Some(idx);
}
Collaborator Author

Will StyleChain::to_map() have a big performance impact?
We need to get an owned version of the StyleChain onto the ParChild::Block enum variant.
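
To make the cost question concrete, here is a generic sketch of what flattening a borrowed chain into an owned map involves. `Chain` and `to_owned_map` are hypothetical stand-ins written for this illustration; they are not Typst's `StyleChain` or its `to_map()`, whose internals may differ considerably.

```rust
use std::collections::HashMap;

/// Hypothetical borrowed chain: a linked list of borrowed maps,
/// innermost link first.
struct Chain<'a> {
    head: &'a HashMap<String, String>,
    tail: Option<&'a Chain<'a>>,
}

impl<'a> Chain<'a> {
    /// Walks every link and clones each entry into one owned map.
    /// Entries from inner links take precedence over outer ones.
    /// The cost grows with the total number of entries in the chain.
    fn to_owned_map(&self) -> HashMap<String, String> {
        let mut out = HashMap::new();
        let mut link = Some(self);
        while let Some(chain) = link {
            for (key, value) in chain.head {
                out.entry(key.clone()).or_insert_with(|| value.clone());
            }
            link = chain.tail;
        }
        out
    }
}
```

In a model like this, the borrowed chain is cheap to pass around but cannot outlive the maps it points to, whereas the owned map can, at the price of one walk and one clone per entry each time it is created.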

Comment on lines +198 to +207
// TODO: collect frames for widow/orphan prevention calculation. Should
// frames only (not blocks) participate in widow/orphan prevention?
// Regardless, the code here is a little weird.
let frames: Vec<_> = children
    .iter()
    .filter_map(|c| match c {
        ParChild::Frame(f) => Some(f),
        ParChild::Block { .. } => None,
    })
    .collect();
Collaborator Author

This is very wrong currently (but does appear to be benign).
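
For context, a widow/orphan "need" calculation over line frames only might look like the sketch below. It is a simplified illustration under my own assumptions, not the code in `par_children`: `line_needs` is a hypothetical function, heights and leading are plain numbers, and embedded blocks are assumed to have been filtered out beforehand, as in the snippet above.

```rust
/// For each line, the height that must still be available in the region
/// before the line may be placed there. The first line also reserves room
/// for the second (orphan prevention), and the second-to-last line also
/// reserves room for the last (widow prevention).
fn line_needs(heights: &[f64], leading: f64) -> Vec<f64> {
    let len = heights.len();
    (0..len)
        .map(|i| {
            if len == 3 && i == 0 {
                // With three lines, any break would leave a single line
                // on one side, so keep all three together.
                heights.iter().sum::<f64>() + 2.0 * leading
            } else if len >= 2 && i == 0 {
                heights[0] + leading + heights[1]
            } else if i >= 2 && i + 2 == len {
                heights[len - 2] + leading + heights[len - 1]
            } else {
                heights[i]
            }
        })
        .collect()
}
```

How blocks inside the paragraph should interact with these needs (for example, whether a block between the first two lines counts as a line) is exactly the open question raised in the TODO.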

@saecki
Member

saecki commented Mar 31, 2026

Most of the changes regarding PDF tags are fine. The paragraphs missing from some list bodies aren't an issue. What I'd consider a regression is the nesting of lists inside the list body (the enum-number-override-nested and list-tags-complex tests). Section 4.2.5.1 of this best practice guide says that nested lists should be represented as direct children of the parent list: https://pdfa.org/download-area/publications/Tagged-PDF-Best-Practice-Guide.pdf
typst-pdf actually has an extra pass that rearranges the tags for this:

pub fn push_body(&mut self, groups: &mut Groups, list: GroupId, body: GroupId) {

This only seems to happen if there are list items that are tightly spaced and then a widely spaced item after that:

- a
  - Indented

- b // Parbreak forces the list to be wide.

Here's the debug output of my debug-pdf-tags branch comparing main to the PR. Diffing is sadly a little fuzzy because of the location hashes (and because the output is very verbose), but the main difference seems to be that on main there is a paragraph that groups only the text inside the first list item, whereas in this PR the paragraph spans the nested list too.

main
=== frames ===
[crates/typst-pdf/src/tags/tree/build.rs:201:9] &page.frame = Frame [
    Start("item", Location(17212)),
    Start("list", Location(37669)),
    Start("pdf-marker-tag", Location(59320)),
    Text("•"),
    End(Location(59320)),
    Group Frame [
        Start("pdf-marker-tag", Location(58334)),
        Start("par", Location(29122)),
        Text("a"),
        End(Location(29122)),
        Start("item", Location(54946)),
        Start("list", Location(63932)),
        Group Frame [
            Start("pdf-marker-tag", Location(20721)),
            Text("‣"),
            End(Location(20721)),
            Start("pdf-marker-tag", Location(32160)),
            Text("Indented"),
            End(Location(32160)),
        ],
        End(Location(63932)),
        End(Location(54946)),
        End(Location(58334)),
    ],
    Group Frame [
        Start("pdf-marker-tag", Location(3627)),
        Text("•"),
        End(Location(3627)),
        Start("pdf-marker-tag", Location(63598)),
        Start("par", Location(27817)),
        Text("b"),
        End(Location(27817)),
        End(Location(63598)),
    ],
    End(Location(37669)),
    End(Location(17212)),
    Start("item", Location(51167)),
    End(Location(51167)),
]

=== tags ===
START 0 item
  START 1 list
    START 2 pdf-marker-tag ListItemLabel
text
    END   2 pdf-marker-tag
> group
    START 3 pdf-marker-tag ListItemBody
      START 4 par
text
      END   4 par
      START 5 item
        START 6 list
> group
          START 7 pdf-marker-tag ListItemLabel
text
          END   7 pdf-marker-tag
          START 8 pdf-marker-tag ListItemBody
text
          END   8 pdf-marker-tag
< group
        END   6 list
      END   5 item
    END   3 pdf-marker-tag
< group
> group
    START 9 pdf-marker-tag ListItemLabel
text
    END   9 pdf-marker-tag
    START 10 pdf-marker-tag ListItemBody
      START 11 par
text
      END   11 par
    END   10 pdf-marker-tag
< group
  END   1 list
END   0 item
START 12 item
END   12 item

=== tree ===
group Root Id::<Group>(0)
  group List Id::<Group>(1)
    group Standard Id::<Group>(11)
      group ListItemLabel Id::<Group>(2)
        leaf
      group ListItemBody Id::<Group>(3)
        group Par Id::<Group>(4) (weak)
          leaf
    group List Id::<Group>(5)
      group Standard Id::<Group>(12)
        group ListItemLabel Id::<Group>(6)
          leaf
        group ListItemBody Id::<Group>(7)
          leaf
    group Standard Id::<Group>(13)
      group ListItemLabel Id::<Group>(8)
        leaf
      group ListItemBody Id::<Group>(9)
        group Par Id::<Group>(10) (weak)
          leaf
PR
=== frames ===
[crates/typst-pdf/src/tags/tree/build.rs:201:9] &page.frame = Frame [
    Start("item", Location(16966)),
    Start("list", Location(27103)),
    Start("pdf-marker-tag", Location(24678)),
    Text("•"),
    End(Location(24678)),
    Group Frame [
        Start("pdf-marker-tag", Location(40426)),
        Start("par", Location(45515)),
        Text("a"),
        Start("item", Location(39568)),
        Start("list", Location(13436)),
        Group Frame [
            Start("pdf-marker-tag", Location(21467)),
            Text("‣"),
            End(Location(21467)),
            Start("pdf-marker-tag", Location(11517)),
            Text("Indented"),
            End(Location(11517)),
            End(Location(13436)),
            End(Location(39568)),
        ],
        End(Location(45515)),
        End(Location(40426)),
    ],
    Group Frame [
        Start("pdf-marker-tag", Location(19885)),
        Text("•"),
        End(Location(19885)),
        Start("pdf-marker-tag", Location(16079)),
        Start("par", Location(26183)),
        Text("b"),
        End(Location(26183)),
        End(Location(16079)),
    ],
    End(Location(27103)),
    End(Location(16966)),
    Start("item", Location(4100)),
    End(Location(4100)),
]

=== tags ===
START 0 item
  START 1 list
    START 2 pdf-marker-tag ListItemLabel
text
    END   2 pdf-marker-tag
> group
    START 3 pdf-marker-tag ListItemBody
      START 4 par
text
        START 5 item
          START 6 list
> group
            START 7 pdf-marker-tag ListItemLabel
text
            END   7 pdf-marker-tag
            START 8 pdf-marker-tag ListItemBody
text
            END   8 pdf-marker-tag
          END   6 list
        END   5 item
< group
      END   4 par
    END   3 pdf-marker-tag
< group
> group
    START 9 pdf-marker-tag ListItemLabel
text
    END   9 pdf-marker-tag
    START 10 pdf-marker-tag ListItemBody
      START 11 par
text
      END   11 par
    END   10 pdf-marker-tag
< group
  END   1 list
END   0 item
START 12 item
END   12 item

=== tree ===
group Root Id::<Group>(0)
  group List Id::<Group>(1)
    group Standard Id::<Group>(11)
      group ListItemLabel Id::<Group>(2)
        leaf
      group ListItemBody Id::<Group>(3)
        group Par Id::<Group>(4) (weak)
          leaf
          group List Id::<Group>(5)
            group Standard Id::<Group>(12)
              group ListItemLabel Id::<Group>(6)
                leaf
              group ListItemBody Id::<Group>(7)
                leaf
    group Standard Id::<Group>(13)
      group ListItemLabel Id::<Group>(8)
        leaf
      group ListItemBody Id::<Group>(9)
        group Par Id::<Group>(10) (weak)
          leaf
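
The difference between the two trees can be reproduced with a toy model. The sketch below is not typst-pdf's `push_body`: `Node` and `hoist` are hypothetical, item labels are omitted, and the pass is assumed to look only at direct children of an item body, which is one plausible reading of why a list wrapped in a paragraph is no longer moved.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Node {
    List(Vec<Node>),
    Item(Vec<Node>), // item body only; the label is omitted
    Par(Vec<Node>),
    Leaf,
}

/// Moves lists that are direct children of an item body up into the
/// parent list, right after the item they came from.
fn hoist(list_children: Vec<Node>) -> Vec<Node> {
    let mut out = Vec::new();
    for child in list_children {
        match child {
            Node::Item(body) => {
                let (nested, rest): (Vec<Node>, Vec<Node>) = body
                    .into_iter()
                    .partition(|node| matches!(node, Node::List(_)));
                out.push(Node::Item(rest));
                out.extend(nested);
            }
            other => out.push(other),
        }
    }
    out
}
```

When the nested list sits next to the paragraph in the item body (as on main), it ends up as a direct child of the parent list. When it sits inside the paragraph (as in the PR), this kind of pass does not see it and the list stays nested in the paragraph.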

Labels

model: Related to model category, which is all about structure and semantics.
waiting-on-review: This PR is waiting to be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Paragraph should be able to contain tight lists and block-level equations

7 participants