Language-aware line breakpoints #1009

@peng1999

Description

The Requirements for Chinese Text Layout define Prohibition Rules for Line Start and Line End, and Japanese has similar rules. These are not implemented correctly in current typst, which applies vanilla UAX#14 directly via xi-unicode.

UAX#14 can be used as-is for most Latin-script languages, but it is not sufficient for CJK. From UAX#14 §8 Customization:

A real-world line breaking algorithm has to be tailorable to some degree to meet user or document requirements. …
In Japanese, for example, tighter and looser specifications of prohibited line breaks may be used.

The most noticeable gap between vanilla UAX#14 and Chinese/Japanese conventions is the handling of quotation marks. Here is an example:

An example
#set page(width: 5em + 2em, margin: (x: 1em))
#set text(font: "Noto Serif CJK SC")
测试文本,“测”

Current line breaking:

Desired line breaking:

ICU has tailored line breaking rules for Chinese and Japanese, but unfortunately it is a C library and cannot be used in typst.
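To make the two prohibition rules concrete, here is a minimal Rust sketch of a pairwise check. It is only an illustration under stated assumptions, not how typst, xi-unicode, or ICU implement line breaking: the names `NO_LINE_START`, `NO_LINE_END`, `break_allowed`, and `allowed_breaks` are hypothetical, the character sets are a small hand-picked subset rather than the full tables from the Requirements for Chinese Text Layout, and every other position is assumed breakable, which only holds for runs of CJK ideographs.

```rust
// Characters that must not begin a line.
// Assumption: illustrative subset, not the complete CLREQ table.
const NO_LINE_START: &[char] = &[
    '，', '。', '、', '；', '：', '？', '！', ',', '”', '’', '」', '』', '）', '》',
];

// Characters that must not end a line.
// Assumption: illustrative subset, not the complete CLREQ table.
const NO_LINE_END: &[char] = &['“', '‘', '「', '『', '（', '《'];

/// Returns true if a break between `prev` and `next` violates neither rule.
fn break_allowed(prev: char, next: char) -> bool {
    !NO_LINE_END.contains(&prev) && !NO_LINE_START.contains(&next)
}

/// Byte offsets in `text` where a break is permitted by the two rules alone.
/// Assumption: any position not excluded by the rules is breakable,
/// which is only reasonable for text made of CJK ideographs and punctuation.
fn allowed_breaks(text: &str) -> Vec<usize> {
    let mut out = Vec::new();
    let mut chars = text.char_indices().peekable();
    while let Some((_, prev)) = chars.next() {
        // `idx` is the byte offset of `next`, i.e. the candidate break position.
        if let Some(&(idx, next)) = chars.peek() {
            if break_allowed(prev, next) {
                out.push(idx);
            }
        }
    }
    out
}
```

Applied to the example text above, this sketch permits a break between the comma and the opening quotation mark, but not between the opening quotation mark and the following character, nor before the closing quotation mark. A real tailoring would have to be expressed in terms of UAX#14 line breaking classes rather than a flat character list.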

Related: #276

Reproduction URL

No response

Operating system

No response

Typst version

  • I am using the latest version of Typst

Labels: bug, text
