Also select font via localized family name #7468

@YDX-2147483647

Description

Some fonts provide their names in multiple languages.
For example, each Source Han Sans/Serif font has both an English name and a localized name, as shown in the following table extracted from Adobe's font readme.

It would be good if Typst supported selecting a font via any of its names.
Relates-to: #2098

Image

Note

It has to be Source Han Sans/Serif released by Adobe, not Noto Sans/Serif CJK released by Google.
Google also releases fonts with localized name table strings, but the actual strings are identical to the English-language ones.

Use Case

At present, Typst supports selecting a font by only a single name. Sometimes it's the English name, and sometimes it's the localized name. It is difficult to predict which one it is.

As an example of this chaos:

  • Source Han Sans SC (思源黑体) can only be selected by "Source Han Sans SC".
  • Source Han Sans TC (思源黑體) can only be selected by "思源黑體". If you specify "Source Han Sans TC", you'll get a font-not-found warning.

The following is a screenshot of the output of typst fonts, shared by Runge on 2025-11-14. TC stands out: every font is listed under its English name except TC.

Image

Cause of the inconsistency

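The original Chinese version (中文原版), translated

Source Han Sans (思源黑体), released by Adobe, comes in Simplified Chinese (SC), Traditional Chinese (TC), and other versions. In Typst, the former must be written as "Source Han Sans SC", while the latter must be written as "思源黑體" rather than "Source Han Sans TC".
This is because Adobe provides two names in each font, one English and one Chinese (either Traditional or Simplified), and Typst picks whichever comes first. So why does the English name come first in SC, while the Chinese name comes first in TC?
According to the OpenType specification, names must be sorted by language ID. For many fonts, including Source Han, this language ID is Microsoft's Windows Language Code Identifier (LCID).
An LCID represents a language as a numeric code, with the low bits roughly corresponding to the language and the high bits to the region. Thus zh-TW = 0x0404, en-US = 0x0409, and zh-CN = 0x0804. But sorting still weighs the high bits first, so Traditional Chinese comes before English, and English comes before Simplified Chinese...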

Adobe provides both an English and a Chinese (either Traditional or Simplified) name in each font, and Typst picks the first matching one using find_map.

/// Try to find and decode the name with the given id.
pub(super) fn find_name(ttf: &ttf_parser::Face, name_id: u16) -> Option<String> {
    ttf.names().into_iter().find_map(|entry| {
        if entry.name_id == name_id {
            if let Some(string) = entry.to_string() {
                return Some(string);
            }
            if entry.platform_id == PlatformId::Macintosh && entry.encoding_id == 0 {
                return Some(decode_mac_roman(entry.name));
            }
        }
        None
    })
}
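
To make the effect of first-match lookup concrete, here is a minimal, self-contained sketch. It does not use ttf_parser: NameRecord is a hypothetical stand-in for a name table record, and first_family and all_families are illustrative helpers, not Typst functions. The two records model the TC font as described in this issue, with the zh-TW entry ordered before the en-US one.

```rust
/// Simplified stand-in for a `name` table record (hypothetical type for
/// illustration; not the struct used by ttf_parser).
#[derive(Debug, Clone)]
struct NameRecord {
    name_id: u16,
    language_id: u16,
    value: &'static str,
}

/// OpenType name ID 1 is the font family name.
const FAMILY: u16 = 1;

/// Mirrors the first-match lookup: return the earliest family record.
fn first_family(records: &[NameRecord]) -> Option<&'static str> {
    records
        .iter()
        .find_map(|r| (r.name_id == FAMILY).then_some(r.value))
}

/// Possible alternative (an assumption, not an existing API): gather every
/// distinct family name so that any of them could be used for selection.
fn all_families(records: &[NameRecord]) -> Vec<&'static str> {
    let mut names = Vec::new();
    for r in records.iter().filter(|r| r.name_id == FAMILY) {
        if !names.contains(&r.value) {
            names.push(r.value);
        }
    }
    names
}

fn main() {
    // Modeled after the TC font: zh-TW (0x0404) sorts before en-US (0x0409).
    let tc = vec![
        NameRecord { name_id: FAMILY, language_id: 0x0404, value: "思源黑體" },
        NameRecord { name_id: FAMILY, language_id: 0x0409, value: "Source Han Sans TC" },
    ];
    // Sanity check that the modeled records are ordered by language ID.
    assert!(tc.windows(2).all(|w| w[0].language_id <= w[1].language_id));

    println!("first match: {:?}", first_family(&tc));
    println!("all names:   {:?}", all_families(&tc));
}
```

With a lookup along the lines of all_families, both "思源黑體" and "Source Han Sans TC" would be candidates for matching, which is the behavior this issue asks for.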

So why does the English name come first in SC, but the Chinese name come first in TC?

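According to the OpenType specification, name records are sorted by platform ID, then encoding ID, then language ID, so for records on the same platform the language ID determines the order. For many fonts, including Source Han Sans, this language ID is Microsoft's Windows Language Code Identifier (LCID).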

LCID represents a language with a numeric code, with the lower bits roughly corresponding to the language and the higher bits to the region. So zh-TW = 0x0404, en-US = 0x0409, zh-CN = 0x0804.

Image

However, the IDs are sorted as plain numbers, so the higher bits matter more. As a result, Traditional Chinese (TC, zh-TW) comes before English (en-US), and English comes before Simplified Chinese (SC, zh-CN)...
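
The ordering can be checked with a few lines of arithmetic. The sketch below splits each of the three language IDs into its primary-language part (low 10 bits) and sublanguage part (the bits above), then sorts them numerically. The function names split_lang_id and sort_by_lang_id are made up for this illustration.

```rust
/// Split a Windows language ID into (primary language, sublanguage).
/// The low 10 bits hold the primary language; the remaining high bits
/// hold the sublanguage, which roughly corresponds to the region.
fn split_lang_id(id: u16) -> (u16, u16) {
    (id & 0x03ff, id >> 10)
}

/// Sort (language ID, tag) pairs as plain numbers, which is how the
/// language ID contributes to the order of name records.
fn sort_by_lang_id(mut ids: Vec<(u16, &'static str)>) -> Vec<(u16, &'static str)> {
    ids.sort_by_key(|&(id, _)| id);
    ids
}

fn main() {
    let ids = vec![(0x0804, "zh-CN"), (0x0409, "en-US"), (0x0404, "zh-TW")];
    for (id, tag) in sort_by_lang_id(ids) {
        let (primary, sub) = split_lang_id(id);
        println!("{tag}: {id:#06x} primary={primary:#04x} sublanguage={sub}");
    }
}
```

Both Chinese variants share the primary language 0x04, but zh-TW has sublanguage 1 and zh-CN has sublanguage 2, while en-US is primary language 0x09 with sublanguage 1. Because the sublanguage sits in the high bits, en-US lands between the two Chinese IDs.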

Why this was discovered so late

Most people use only one version of Source Han, so they never encounter this inconsistency.

I noticed it while I was debugging #7396. I shared it in the Chinese QQ chat group and everyone was surprised. But after testing, we found that the behavior is reproducible.

Reference implementations

MS Paint, Word, etc. only show localized names.

Image

Labels

cjk, feature request, text