The inferred writing script is not passed to HarfBuzz, making locl ineffective #7396

@YDX-2147483647

Description

The inferred writing script (e.g., latn/hani) is not passed to rustybuzz::UnicodeBuffer, making the OpenType locl (Localized forms) feature ineffective.

Outcome of this issue

With certain fonts and languages, the position of a period mark (U+3002 IDEOGRAPHIC FULL STOP) placed between pieces of generated text is not stable.

In the example below, the first period is placed at the center of the square space, but the second period is placed at the lower left corner.

#set text(lang: "zh", region: "TW", font: "Noto Serif CJK SC")
#set heading(numbering: "1")
= Heading <a>

句號。@a。@a 何故?
Image

The problem occurs when the glyph for 。 should be a non-default localized form provided by the locl feature.

Background and explanation

Typographical rules may differ by region. For example, when Chinese is written horizontally, the period mark is placed at the lower left corner of the square space in mainland China, but at the center in Taiwan and Hong Kong.

Some fonts handle these differences with the locl feature. For instance, the language-specific version of Noto Sans/Serif CJK (also the version shipped on typst.app) contains all variants of the period mark, and the layout engine can select the glyph needed for each language from the font.

Note that locl requires the text to be properly tagged with both the language system and the script. If not, the default glyph will be used.

In the GSUB table of Noto CJK:

  1. There is one default script (DFLT) and 6 non-default scripts (cyrl, grek, hang, hani, kana, latn).
  2. The DFLT script contains only the default language system, while each non-default script additionally contains 5 non-default language systems (JAN, KOR, ZHH, ZHS, ZHT).
  3. The locl feature is only available for non-default scripts and language systems.
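The three points above can be condensed into a small lookup. This is only a sketch of the rule as described in this issue, not of how HarfBuzz or rustybuzz walk the GSUB table; the helper name `locl_available` is hypothetical, and the tags are written as in the list above (real OpenType tags are padded to four bytes).

```rust
/// Non-default scripts in the Noto CJK GSUB table, as listed above.
const SCRIPTS: [&str; 6] = ["cyrl", "grek", "hang", "hani", "kana", "latn"];

/// Non-default language systems under each non-default script, as listed above.
const LANG_SYSTEMS: [&str; 5] = ["JAN", "KOR", "ZHH", "ZHS", "ZHT"];

/// Hypothetical helper: returns true when `locl` can apply, i.e. when both
/// the script and the language system are non-default ones known to the font.
/// `lang_sys` is `None` when only the default language system is selected.
fn locl_available(script: &str, lang_sys: Option<&str>) -> bool {
    let script_ok = SCRIPTS.iter().any(|s| *s == script);
    let lang_ok = match lang_sys {
        Some(tag) => LANG_SYSTEMS.iter().any(|l| *l == tag),
        None => false,
    };
    script_ok && lang_ok
}
```

Under this model, `locl_available("hani", Some("ZHT"))` holds, while `locl_available("DFLT", Some("ZHT"))` does not, which is the failing case in this issue.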

In Typst:

  • The locl feature is somewhat supported and enabled by default.

  • The language system can be tagged by applying set text(lang: "ja", region: "JP") (JAN), ko-KR (KOR), zh-HK (ZHH), zh-CN (ZHS), or zh-TW (ZHT).

  • The script can be manually tagged by applying set text(script: "hani") etc.

    However, ideally, the default set text(script: auto) should just work and Typst should infer the script from the used characters.
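For reference, the language/region pairs named above correspond to language-system tags roughly as follows. This is an illustrative table built only from the pairs listed in this issue; `lang_sys_tag` is a hypothetical name and does not reflect Typst's actual mapping code.

```rust
/// Hypothetical helper: maps a (lang, region) pair to the language-system tag
/// named in this issue. Returns `None` for any pair the issue does not list.
fn lang_sys_tag(lang: &str, region: &str) -> Option<&'static str> {
    match (lang, region) {
        ("ja", "JP") => Some("JAN"),
        ("ko", "KR") => Some("KOR"),
        ("zh", "HK") => Some("ZHH"),
        ("zh", "CN") => Some("ZHS"),
        ("zh", "TW") => Some("ZHT"),
        _ => None,
    }
}
```

Note that this covers only the language system; the script still has to be tagged separately for locl to take effect.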

Typst has problems inferring the script from characters. Specifically, the Unicode script property of 。 is Common, which means it may be used with multiple scripts.

  • When 。 is shaped as part of 句號。, HarfBuzz can infer from the context that the script is hani, so locl is applicable and the correct localized glyph is used.
  • But if 。 is shaped alone, like the 。 in @a。@a, HarfBuzz cannot determine the script and falls back to the DFLT script, so locl is not available and the default glyph is used.

In the latter case, Typst is responsible for telling HarfBuzz that the script is hani. However, Typst v0.14.0 only does so if the document author applies set text(script: "hani") manually. This should be improved.

if let Some(script) = ctx.styles.get(TextElem::script).custom().and_then(|script| {
    rustybuzz::Script::from_iso15924_tag(Tag::from_bytes(script.as_bytes()))
}) {
    buffer.set_script(script)
}
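One possible direction, sketched here with the standard library only, is to guess a script per text run and let runs made up solely of Common characters inherit the script of neighboring runs. This is not Typst's implementation or a proposed patch: the `Script` enum, the character ranges (Han is limited to the CJK Unified Ideographs block), and the names `script_of`, `guess_script`, and `resolve_scripts` are simplifying assumptions made for illustration.

```rust
/// Simplified stand-in for the Unicode Script property (assumption: three
/// classes only; a real implementation would use full Unicode data).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Script {
    Common,
    Han,
    Latin,
}

/// Classifies a single character under the simplified model above.
fn script_of(c: char) -> Script {
    match c {
        'A'..='Z' | 'a'..='z' => Script::Latin,
        '\u{4E00}'..='\u{9FFF}' => Script::Han,
        _ => Script::Common,
    }
}

/// Returns the first non-Common script in a run, or `None` when the run
/// consists only of Common characters (such as a lone 。).
fn guess_script(run: &str) -> Option<Script> {
    run.chars().map(script_of).find(|s| *s != Script::Common)
}

/// Hypothetical resolver: guesses a script per run, then fills Common-only
/// runs from the nearest earlier run with a script, falling back to the
/// nearest later one. Runs stay `None` only if no run has a script at all.
fn resolve_scripts(runs: &[&str]) -> Vec<Option<Script>> {
    let mut resolved: Vec<Option<Script>> =
        runs.iter().map(|run| guess_script(run)).collect();

    // Forward pass: carry the last known script into following unknown runs.
    let mut last = None;
    for slot in resolved.iter_mut() {
        match *slot {
            Some(script) => last = Some(script),
            None => *slot = last,
        }
    }

    // Backward pass: leading unknown runs take the first known script after them.
    let mut next = None;
    for slot in resolved.iter_mut().rev() {
        match *slot {
            Some(script) => next = Some(script),
            None => *slot = next,
        }
    }

    resolved
}
```

If the first example is split into the runs 句號。, 1, 。, 1, and 何故?, every run resolves to Han under this model, so the standalone 。 would be shaped with the hani script. Plain inheritance is only a heuristic and may pick the wrong script in mixed-script text.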

Further examples

Period marks between strong, emph, highlight, ref, or colored texts are all affected, as shown below.

#set text(lang: "zh", region: "CN", font: "Noto Serif CJK TC")
桑之未落,其叶沃若。于嗟鸠兮,无食桑葚!\
桑之未落,#text(red)[其叶沃若]。#text(blue)[于嗟鸠兮],无食桑葚!\
桑之未落,#highlight[其叶沃若]。#highlight[于嗟鸠兮],无食桑葚!\
#show ref: _ => [于嗟鸠兮]
桑之未落,*其叶沃若*@key,无食桑葚!
Image

Characters that are not specific to a single script are affected, including 、,。.?!:;≤≥≮≯.

Workaround

Use other versions of Noto CJK that set your desired glyphs as the defaults, or tag the script manually:

#show "。": set text(script: "hani")

Notes

This issue was originally reported on 2025-08-14. Nothing was actionable until we tested the behavior of HarfBuzz (with the hb-view CLI), Firefox, and XeLaTeX / LuaLaTeX, and contacted an expert on typography by email.

Previous discussions were recorded in typst-doc-cn/clreq#41 (in Chinese, but with many screenshots).


Relates to #5474.

Reproduction URL

No response

Operating system

Web app, Windows, Linux

Typst version

  • I am using the latest version of Typst

Labels

  • bug: Something isn't working
  • text: Related to the text category, which is all about text handling, shaping, etc.
