Add support for unselectable text #7295

Open
andylizi wants to merge 2 commits into typst:main from andylizi:selectable-text

Conversation

@andylizi
Contributor

@andylizi andylizi commented Nov 3, 2025

Add a new parameter set text(selectable: false) that resolves #2249.

This particular design was previously proposed in #5789 and #5938 (comment).
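
As a rough sketch of how the parameter might be used for a watermark (this assumes the `selectable` parameter proposed here, which is not part of any released Typst version; the text, size, and color are illustrative and not taken from the PR's tests):

```typ
// Assumes the `selectable` parameter proposed in this PR.
// The watermark glyphs would be drawn as outlines, so a PDF viewer
// cannot select or copy them.
#set page(background: rotate(-30deg, text(
  size: 60pt,
  fill: luma(220),
  selectable: false,
)[DRAFT]))

Body text remains selectable as usual.
```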

This is my first PR in Typst and I'm not very familiar with the codebase. Please let me know if there's anything I missed!

Testing

Manually check the PDFs.

cargo testit --pdf --exact text-unselectable image-svg-text-unselectable

Example Use Cases

Limitations

@andylizi andylizi marked this pull request as ready for review November 4, 2025 00:02
@LaurenzV
Collaborator

LaurenzV commented Nov 4, 2025

Seems like a good addition overall, but I'm wondering whether we should just disallow that option for PDF/UA-1? Since it's really not encouraged to do this normally.

@andylizi
Contributor Author

andylizi commented Nov 4, 2025

I'm wondering whether we should just disallow that option for PDF/UA-1? Since it's really not encouraged to do this normally.

I'm not very familiar with accessibility best practices, but the two use cases I've seen (code block line numbers, watermarks) might both be considered artifacts. Perhaps we could automatically put unselectable text in pdf.artifact()?

However, are all use cases of unselectable text artifacts? I can't think of any exceptions at the moment, but presumably they could exist?...

@laurmaedje
Member

Regarding accessibility: cc @saecki @reknih

@laurmaedje laurmaedje added the pdf, text, and interface labels Nov 5, 2025
@saecki
Member

saecki commented Nov 5, 2025

I'm not very familiar with accessibility best practices, but the two use cases I've seen (code block line numbers, watermarks) might both be considered artifacts. Perhaps we could automatically put unselectable text in pdf.artifact()?

However, are all use cases of unselectable text artifacts? I can't think of any exceptions at the moment, but presumably they could exist?...

Considering that the text is inaccessible when drawn as plain glyphs, it should always be considered an artifact. Failing to mark it as an artifact would imply that it is somehow real content, but since it isn't accessible it wouldn't comply with the PDF/UA-1 spec.

I'm wondering whether we should just disallow that option for PDF/UA-1? Since it's really not encouraged to do this normally.

Yeah it feels a little too easy to degrade text this way. I'm wondering whether it may be a better idea to make this a semantic element, to make the consequences more obvious. Something like a more generalized pdf.artifact that has an unselectable parameter maybe?

@andylizi
Contributor Author

andylizi commented Nov 5, 2025

I'm wondering whether it may be a better idea to make this a semantic element, to make the consequences more obvious. Something like a more generalized pdf.artifact that has an unselectable parameter maybe?

Just to throw some ideas out there, how about a new element under the Visualize section, along with line, rect, and friends. Here's what the docs say about Visualize:

All shapes and paths drawn by Typst are automatically marked as artifacts to make them invisible to Assistive Technology (AT) during PDF export. However, their contents (if any) remain accessible.

Seems to fit pretty well in our case. We could call it text-path or outlined-text or some other bikesheddable name.

A potential drawback is that we'd lose the most natural way to handle text in SVG images, show image: set text(selectable: false). Although it's not clear to me what additional use cases this would support, except for consistency with regular text, so maybe it's alright not to support it.
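
For reference, the show-set form mentioned above would look roughly like this in a document. It is a sketch that assumes the proposed `selectable` parameter, and `chart.svg` is a placeholder file name:

```typ
// Assumes the `selectable` parameter proposed in this PR.
// "chart.svg" is a hypothetical file containing <text> elements.
#show image: set text(selectable: false)

#image("chart.svg")
```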


I'm wondering whether we should just disallow that option for PDF/UA-1? Since it's really not encouraged to do this normally.

The current documentation for pdf.artifact explicitly mentions watermarks as one example of artifact usage, so it would seem strange if we couldn't actually create such a (well-functioning) watermark in PDF/UA-1.

@laurmaedje laurmaedje added the waiting-on-decision label Nov 17, 2025
@reknih
Member

reknih commented Dec 4, 2025

My two cents would be to leave the function as-is and force the user to manually wrap it in an artifact.
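
Under that approach, user code might look roughly like the following. This is only a sketch: it assumes the `selectable` parameter from this PR, while `pdf.artifact` is the existing function; the styling is illustrative.

```typ
// Assumes the `selectable` parameter proposed in this PR.
// The explicit `pdf.artifact` wrapper marks the outlined text as an
// artifact, so the author has to opt in to hiding it from AT.
#pdf.artifact(
  text(fill: luma(150), selectable: false)[CONFIDENTIAL],
)
```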

Furthermore, I am not even sure that it would be allowed in UA either way. It goes against the PDF 2.0 spirit of having the artifacts in tree and being able to tag their insides, should the AT user choose to inspect the artifact.

Finally, a document using this feature will for sure fail WCAG 2.2 Success Criterion 1.4.5 (Images of Text), and documents using the function will thus fail to achieve WCAG AA conformance as well as conformance with standards based on it (e.g. EN 301 549). So that must be documented.

New parameter: `text(selectable: true)`
@andylizi
Contributor Author

andylizi commented Dec 5, 2025

force the user to manually wrap it in an artifact

Okay, I've implemented the check. There are still a few design decisions left to resolve.

  1. Should pdf.artifact be required even when the export target is not PDF or when using --no-pdf-tags? Right now unselectable text only works for PDFs, and the only implementation method also makes the text inaccessible, hence pdf.artifact. But in the future, if HTML export can use the user-select: none CSS property to create unselectable text that remains accessible, it won't make much sense to mandate pdf.artifact in that case.
    • Furthermore, there is a note to consider adding aria-hidden="true" to pdf.artifact semantics in the future, which would make even less sense for unselectable text in HTML.
  2. How should this requirement interact with embedded text in SVG images? Right now we have to wrap the entire image in an artifact tag. Not sure how desirable this would be in practice.
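
To make the HTML case in point 1 concrete, here is a purely hypothetical sketch of what user-level code could do today with Typst's experimental HTML export. Nothing in this PR emits such markup; the helper name `unselectable` is made up for illustration, and the sketch assumes the HTML feature is enabled so that `target()` and `html.elem` are available.

```typ
// Hypothetical helper, not part of this PR: in HTML export, wrap the body
// in a span styled with `user-select: none`; in paged export, leave the
// body unchanged.
#let unselectable(body) = context {
  if target() == "html" {
    html.elem("span", attrs: (style: "user-select: none"), body)
  } else {
    body
  }
}

#unselectable[1] First line of a listing
```

Text styled this way stays in the DOM and therefore remains available to screen readers, which is why mandating pdf.artifact would not fit this target.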

The current wording of the diagnostics could probably be improved as well.


Furthermore, I am not even sure that it would be allowed in UA either way. It goes against the PDF 2.0 spirit of having the artifacts in tree and being able to tag their insides, should the AT user choose to inspect the artifact.

I found this specification on Well-Tagged PDF in PDF 2.0 on the PDF Association website, and it seems to suggest watermarks are a valid use of the artifact tag:

A watermark is tagged with an Artifact structure element with a Subtype attribute with a value of Watermark.

Adobe guidelines also talk about watermarks not being part of the tag tree:

You can add a watermark to a tagged PDF without adding it to the tag tree. Not having a watermark appear in the tag tree is helpful for people who are using screen readers, because they won’t hear the watermark read as a document content.

The best way to add a watermark that doesn’t interfere with screen readers is to insert an untagged PDF of the watermark into a tagged PDF.

@reknih
Member

reknih commented Dec 11, 2025

On 1: I would not require it if tagging was disabled.

I found this specification on Well-Tagged PDF in PDF 2.0 on the PDF Association website, and it seems to suggest watermarks are a valid use of the artifact tag:

I am aware that watermarks are a valid use of artifacts, my remark was on outlining the text within: This takes away the user's possibility to inspect the artifact in PDF 2.0. Put differently, what purpose does a "confidential" watermark serve when blind users can prove in court that they could not have been aware of it?

@laurmaedje
Member

The motivating examples here are line numbers and watermarks. Both of these should be emitted as normal, accessible text with an artifact tag and a subtype. There is a Watermark subtype in PDF 1.7 (see Table 330 of PDF 32000-1:2008) and a LineNum subtype in PDF 2.0 (see Table 363 in ISO/DIS 32000-2). Both of these appear along with the top-level artifact type Pagination.

Thus, it should be the responsibility of the viewer to make the text unselectable in these cases, rather than ours to outline it. This also makes sense insofar as both line numbers and watermark content can be relevant information for people with vision impairment.

Currently, these subtypes are not exposed on pdf.artifact. There was work in this direction in LaurenzV/krilla#273 though it looks a little stalled.

In any case, this leaves the actual set text(selectable: false) with somewhat thin motivation (there might be other cases?).

@w1th0utnam3

In any case, this leaves the actual set text(selectable: false) with somewhat thin motivation (there might be other cases?).

I was looking for a feature like this when I was using icon fonts such as FontAwesome which result in random Unicode sequences when copy & pasting. Using fonts for this (instead of embedding the SVGs) is very convenient because you can more easily adapt size, color and alignment to the document text. However, a better solution than non-selectable text would be to turn it into a path and even add alt-text to it, so I guess it's also not a very strong use case.

@andylizi
Contributor Author

andylizi commented Mar 5, 2026

I can think of other use cases where text is intended as graphical elements rather than readable content. For example, ASCII diagrams:

                    ┌───────────────────────┐
                    │   Is it supposed to   │
                    │        move?          │
                    └───────────┬───────────┘
                                │
                ┌───────────────┴───────────────┐
                │                               │
              YES                              NO
                │                               │
    ┌───────────▼───────────┐       ┌───────────▼───────────┐
    │     Is it moving?     │       │     Is it moving?     │
    └───────────┬───────────┘       └───────────┬───────────┘
                │                               │
        ┌───────┴───────┐               ┌───────┴───────┐
        │               │               │               │
       YES              NO              YES              NO
        │               │               │               │
        │     ┌─────────▼─────────┐     │     ┌─────────▼─────────┐
        │     │     Use WD-40     │     │     │      You're       │
        │     └───────────────────┘     │     │       fine        │
        │                               │     └───────────────────┘
        │                               │
        │                     ┌─────────▼─────────┐
        │                     │   Use Duct Tape   │
        │                     └───────────────────┘
        │
        └──────────────►  You're fine

Here, it might be desirable to make Unicode box-drawing characters unselectable without affecting the labels inside.

There is also the treet package, which generates tree lists like this:

root_folder
├─ sub-folder
│ ├─ 1-1
│ │ └─ 1.1.1
│ └─ 1.2
│ ├─ 1.2.1
│ └─ 1.2.2
└─ 2
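
For lists and diagrams like these, a regex show rule could in principle target only the box-drawing characters. The following is a sketch, assuming the `selectable` parameter from this PR and ignoring the pdf.artifact requirement discussed earlier:

```typ
// Assumes the `selectable` parameter proposed in this PR.
// The character class covers the Unicode Box Drawing block
// (U+2500 to U+257F) only. Arrowheads such as ▼ live in other blocks
// and would have to be added to the class separately.
#show regex("[\u{2500}-\u{257F}]+"): set text(selectable: false)

root \
├─ sub-folder \
│ └─ 1.1 \
└─ 2
```

The labels (`root`, `sub-folder`, and so on) do not match the regex, so they would stay selectable.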

Another more niche example is the "glitch text" effect.

H̶͉̆e̸̠̚r̸̤̒ẻ̵̩'̸͕͝s̸̝̑ ̷̞̂à̸̹n̸̛̺ ̵̭͆e̷̞̕x̷̩̏ā̵̞m̵̞̃p̷̲̕l̶͇̽e̵̡̓.̵̙̍ ̷̟̀I̸̱͝'̶̹͋m̸̥̅ ̵͚̋s̵̱̕ȗ̷͈r̵͎̊ȅ̵̡ ̷͒ͅy̶͈̐o̶̢̽ṷ̸̓'̶͖͆v̸̫͐è̸͕ ̴͚̚s̵̛̳ë̶̬́e̴̢͌n̶̞̄ ̴̳͝t̴͉͛h̵͕̽i̵̜̇s̴̮̄ ̷̦̍ơ̶ͅn̶͉̽ ̸̛͙t̷͌ͅh̸̦͘ë̷̳́ ̷̬̇í̸͉ǹ̸̼t̴͙͑ë̵̝r̸͕͘n̸̠͝e̶̳̎t̸̝̾ ̵̡̛a̵̫͌t̷̙͛ ̴̳̀s̵̢̊ó̴̜m̷̭̂e̷͔̿ ̴̼̆p̵͍̐ő̵͕ǐ̶̢ň̴͕t̸̙́.̷̗̎

I saw this used in some web novels to add flavor to scenes, but it always makes me wince because of how terrible it is for screen readers. But with unselectable text, maybe it'd be possible to keep the visual presentation without affecting accessibility.

@reknih
Member

reknih commented Mar 5, 2026

I can think of other use cases where text is intended as graphical elements rather than readable content. For example, ASCII diagrams:

You can actually make these accessible by adding an alternative description using the figure function:

#figure(
  alt: "Here's an example I'm sure you've seen on the internet at some point rendered with a lot of accents and decorators so it looks spooky",
)[
  H̶͉̆e̸̠̚r̸̤̒ẻ̵̩'̸͕͝s̸̝̑ ̷̞̂à̸̹n̸̛̺ ̵̭͆e̷̞̕x̷̩̏ā̵̞m̵̞̃p̷̲̕l̶͇̽e̵̡̓.̵̙̍ ̷̟̀I̸̱͝'̶̹͋m̸̥̅ ̵͚̋s̵̱̕ȗ̷͈r̵͎̊ȅ̵̡ ̷͒ͅy̶͈̐o̶̢̽ṷ̸̓'̶͖͆v̸̫͐è̸͕ ̴͚̚s̵̛̳ë̶̬́e̴̢͌n̶̞̄ ̴̳͝t̴͉͛h̵͕̽i̵̜̇s̴̮̄ ̷̦̍ơ̶ͅn̶͉̽ ̸̛͙t̷͌ͅh̸̦͘ë̷̳́ ̷̬̇í̸͉ǹ̸̼t̴͙͑ë̵̝r̸͕͘n̸̠͝e̶̳̎t̸̝̾ ̵̡̛a̵̫͌t̷̙͛ ̴̳̀s̵̢̊ó̴̜m̷̭̂e̷͔̿ ̴̼̆p̵͍̐ő̵͕ǐ̶̢ň̴͕t̸̙́.̷̗̎
]

AT will treat this element as a leaf, i.e. it will read the alt text and, by default, not expose the capability to drill down into the contents. However, the original text stays selectable and can be copied from the PDF.

A limitation is that due to #7001, this pattern cannot be used for inline content. IMO, we might need a function to provide an alternative description that is not tightly coupled to figures.

This will satisfy WCAG 2.2 Success Criterion 1.1.1 Non-text Content where non-text content is defined to include ASCII art. Also see F72: Failure of Success Criterion 1.1.1 due to using ASCII art without providing a text alternative.

@w1th0utnam3

A limitation is that due to #7001, this pattern cannot be used for inline content. IMO, we might need a function to provide an alternative description that is not tightly coupled to figures.

I also encountered this problem in my icon font use case.

@reknih
Member

reknih commented Mar 5, 2026

I was looking for a feature like this when I was using icon fonts such as FontAwesome which result in random Unicode sequences when copy & pasting.

For this use case, I'd say that we have a multi-layered problem:

  1. For icon font glyphs that match existing Unicode codepoints semantically, the icon font should map them to the appropriate code point. E.g., fa-flag should be mapped to U+1F3F3 Waving White Flag or U+1F3F4 Waving Black Flag so it is natively announced correctly and can be copy-pasted even if the icon font is not available at the target. Glyphs that cannot be assigned to a well-known codepoint have to be assigned to codepoints in the Unicode Private Use Area (PUA) and be made accessible using the second technique.

  2. For icons that cannot be represented using Unicode codepoints outside the PUA, an alternative description is required. For block-level content, it can be provided using the above technique with a figure. For inline icon use, we need #7001 ("Default show-set align rule prevents inlining figure with box show rule") fixed or a dedicated alt text function that allows us to attach alternative descriptions to any content.

  3. There are two cases where we may need to override what codepoint the icon font matches a glyph to:

    1. The font matches its icons that are representable in non-PUA Unicode codepoints to the PUA. In that case, we'd want to override the codepoint for an icon glyph with its well-known codepoint.

      See Section 8.5.2 of PDF/UA-2:

      As no pre-defined meaning is associated with Unicode values in the Private Use Area (PUA), PUA values in content streams shall be used only if no other valid Unicode value is available.

    2. The font matches some or all of its icons that are not representable in non-PUA Unicode codepoints to inappropriate codepoints outside of the PUA. In that case, we'd want to map the codepoint back to the PUA. In addition, an alternative description is required, as described above.

    For the first of these problems, we can use the PDF capability /ActualText to override the codepoint for one or multiple glyphs locally. This matches the guidance in PDF/UA-2, Section 8.5.3:

    In all cases where real content maps to PUA values an ActualText or Alt entry shall be present.

    (in the spec, real content refers to content not wrapped in an artifact)

    If ActualText is used, readers will copy and paste and expose to AT the contents of ActualText instead of the underlying code point. If the ActualText codepoint is not in the PUA, the icon is now accessible. ActualText is not currently exposed, but could be with a pdf.actual-text()[] function. To my knowledge, there is no corresponding mechanism in HTML or SVG. However, reader support for ActualText is spotty. Given that in almost all cases, we want to map a single glyph to the same codepoint across all of its uses, we may want to rewrite the font's ToUnicode cmap instead, a global lookup table that governs what Unicode codepoint is assigned to a particular glyph. The API design for this would be more complex, but we could generate an HTML webfont that incorporates the changes, so it would not be limited to PDF.

    Under PDF/UA-2 Section 8.7, ActualText and Alt cannot contain PUA codepoints. Since, at the same time, we may need to reassign codepoints to the PUA, we must have a mechanism to modify the ToUnicode cmap (see case 3.2 above) to accommodate this use case.
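
To check which of the cases above applies to a given icon, it helps to know whether its codepoint lies in the PUA. A small sketch of such a check in Typst follows; the helper name `is-pua` is made up for illustration, and the ranges are the three Private Use Areas defined by Unicode:

```typ
// Hypothetical helper: returns true if the single-character string `c`
// lies in one of the three Unicode Private Use Areas.
#let is-pua(c) = {
  let cp = str.to-unicode(c)
  (
    (0xE000 <= cp and cp <= 0xF8FF)
      or (0xF0000 <= cp and cp <= 0xFFFFD)
      or (0x100000 <= cp and cp <= 0x10FFFD)
  )
}

// A codepoint in the Basic Multilingual Plane PUA, the kind of slot
// icon fonts commonly use.
#assert(is-pua("\u{F024}"))
// U+1F3F3 Waving White Flag is a regular, well-known codepoint.
#assert(not is-pua("\u{1F3F3}"))
```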

TL;DR: Don't outline icon fonts, instead, you need alt text and/or code point reassignment through ActualText and/or the ToUnicode cmap.


Labels

interface: PRs that add to or change Typst's user-facing interface as opposed to internals or docs changes.
pdf: Related to PDF export or PDF embedding.
text: Related to the text category, which is all about text handling, shaping, etc.
waiting-on-decision: A decision must be made to proceed.

Development

Successfully merging this pull request may close these issues.

Add a way to make text non-selectable

6 participants