Semantic paragraphs #5746

Merged
laurmaedje merged 12 commits into main from semantic-paragraphs
Jan 24, 2025

laurmaedje commented Jan 24, 2025

What is this about?

When you add inline-level content to your document, Typst will automatically wrap it in paragraphs. However, a typical document also contains some text that shouldn't semantically be part of a paragraph, for example in a heading or caption. To express this difference, this pull request introduces a few rules that let Typst distinguish between "just text" and a semantic paragraph.

The new rules for when Typst wraps inline-level content in a paragraph are as follows:

  • All text at the root of a document is wrapped in paragraphs.

  • Text in a container (like a block) is only wrapped in a paragraph if the container holds any block-level content. If all of the contents are inline-level, no paragraph is created.
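
As a rough illustration of the two rules above, the following Typst sketch shows the three cases side by side. The sample text is made up for this example, and the comments restate the rules rather than describe verified output.

```typst
// Root level: this text is wrapped in a paragraph.
Some text at the root of the document.

// Container with only inline-level content: no paragraph is created.
#block[Just text inside a block.]

// Container that also holds block-level content (here a nested block):
// the surrounding text is wrapped in a paragraph.
#block[
  This text sits next to block-level content.
  #block[Nested block with inline-only content, so no paragraph in here.]
]
```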

In the laid-out document, it's not immediately visible whether text became part of a paragraph. However, it is still important for various reasons:

  • Certain paragraph styling like first-line-indent will only apply to proper paragraphs, not to arbitrary text. Similarly, par show rules only trigger on paragraphs.

  • A proper distinction between paragraphs and other text helps people who rely on assistive technologies (such as screen readers) navigate and understand the document properly. Currently, this only applies to HTML export since Typst does not yet output accessible PDFs, but support for this is planned for the near future.

  • HTML export will generate a <p> tag only for paragraphs.
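
A minimal sketch of the first point, assuming the default first-line-indent behavior at the time of this PR, where only a paragraph that follows another paragraph is indented. The sample text is illustrative.

```typst
#set par(first-line-indent: 1em)

The first paragraph. By default, the indent only applies to paragraphs
that follow another paragraph.

This second paragraph is a proper paragraph, so it is indented.

// Inline-only content in a container is "just text": no indent.
#block[This text is not a paragraph, so the indent does not apply.]
```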

Going forward, when creating custom reusable components, users can and should take charge of whether Typst creates paragraphs. By wrapping text in a block instead of just adding paragraph breaks around it, one can force the absence of a paragraph. Conversely, by adding a parbreak after some content in a container, one can force it to become a paragraph even if it's just one word. This is, for example, what non-tight lists do to force their items to become paragraphs.
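
For a custom component, this could look roughly as follows. The function names `tag-line` and `note` are hypothetical and exist only for this sketch, as do the styling arguments.

```typst
// Hypothetical component whose text should never be a paragraph:
// wrapping the body in a block keeps it "just text".
#let tag-line(body) = block(body)

// Hypothetical component whose body should always be a paragraph:
// the trailing parbreak forces one, even for a single word.
#let note(body) = block(inset: 8pt, stroke: 0.5pt, body + parbreak())

#tag-line[Draft]

#note[Remember]
```

Both components render their text either way. The difference only shows up in paragraph styling, par show rules, and the exported structure.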

Changes

Concretely, this PR makes the following changes:

  • The ParElem becomes a properly constructible element. Previously, calling #par[..] was sort of fake and internally just added parbreaks and applied the relevant styles. In particular, it could result in multiple paragraphs. With this PR, #par[..] creates a real paragraph. If its body happens to contain block-level content, Typst will ignore it and emit a warning (as is the usual way for handling elements that cannot be processed, same as with html.elem in layout mode).

  • Paragraphs resulting from realization also become ParElems. To do that, the already realized [(&Content, StyleChain)] pairs are repacked into sequences and styled elements, yielding body: Content for the paragraph.

  • When the paragraph is laid out during flow layout, its body is realized again with a new paragraph realization mode. If the body is repacked (previously realized) content (the common case), this mostly just restores the flat list (we pay a small performance price for this duplicated work). However, even in this case, the realization can have an effect due to par show rules that might have been applied in the meantime.

  • Fragment (i.e. non-top-level) layout and HTML fragment generation change as follows: During fragment realization, we observe whether any block-level content occurred. If not, we don't terminate a potentially opened paragraph grouping and instead directly emit the inline content. We also let the caller know whether there was any block-level content. If there was, layout proceeds as usual. If not, flow layout is put into inline mode, where it expects inline elements. These go through normal inline layout, but with the knowledge that they do not form a semantic paragraph. In a follow-up PR, I plan to support a new first-line-indent mode that activates even on non-consecutive paragraphs.

  • Non-paragraph inline layout now ignores hanging-indent and first-line-indent. However, not all paragraph properties are ignored. We respect leading (after all, we need some value if the text breaks over multiple lines). We also respect justify, linebreaks, and par.line. I'm not 100% sure how to deal with these, but if I disable them now, there'd be no way at all to get them, so they are untouched for now. And I'm not certain they truly should be tied to semantic paragraphs. For the indents, the situation is more clear-cut. We'll need to see.

  • The default show rules of some built-in elements like lists, quotes, etc. are adjusted to ensure they produce/don't produce paragraphs as appropriate. (This was my motivation for the outline work -- outline entries needed to become blocks. Side track much.)

  • The paragraph documentation was expanded to explain the new distinction.
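
To make the first change more concrete, here is a small sketch of the explicit constructor as described above. The comments restate the behavior claimed in this description and have not been checked against the compiler's exact warning text.

```typst
// Creates a single, real paragraph element.
#par[An explicit paragraph.]

// Per the description above, block-level content in the body
// is ignored and Typst emits a warning.
#par[Some text #block[a block inside a paragraph] and more text.]
```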

Notes

  • In principle, we can make ParElem locatable/queryable now. However, I'm holding off on this for now because it will put a lot of strain on the introspector and we should first evaluate the performance impact more thoroughly. Down the road, I plan to optimize the introspection data structures to make it possible to make many more things (everything?) locatable. Note that, for now, it's still possible to count semantic paragraphs with a par show rule that dispatches a counter update.
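
The counting approach mentioned in the note could be sketched like this. The counter key "semantic-par" and the variable name `par-count` are arbitrary choices for this example, and the snippet is an untested illustration rather than a recommended pattern.

```typst
// "semantic-par" is an arbitrary key chosen for this sketch.
#let par-count = counter("semantic-par")

// Step the counter once for every semantic paragraph.
#show par: it => par-count.step() + it

First paragraph.

Second paragraph.

// Inline-only content in a block is not a paragraph, so it is not counted.
#block[Not a paragraph.]

// This line is itself a root-level paragraph and is therefore counted too.
Paragraphs in this document: #context par-count.final().first()
```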

laurmaedje enabled auto-merge January 24, 2025 12:07
laurmaedje added this pull request to the merge queue Jan 24, 2025
Merged via the queue into main with commit 26e65bf Jan 24, 2025
12 checks passed
laurmaedje deleted the semantic-paragraphs branch January 24, 2025 12:23
quachpas added a commit to typst-community/glossarium that referenced this pull request Feb 8, 2025
stelzo pushed a commit to stelzo/typst that referenced this pull request Nov 21, 2025