Use icu4x for linebreaking algorithm#1355

Merged: laurmaedje merged 5 commits into typst:main from peng1999:icu on May 30, 2023

Conversation

@peng1999
Contributor

Remove xi-unicode, which is discontinued and hard to tailor.

Fix #1009, #335

Currently I use the LSTM model rather than the dictionary for East Asian language support. From icu_segmenter's documentation:

The LSTM, or Long Term Short Memory, is a machine learning model. It is smaller than the full dictionary but more expensive during segmentation (inference).

This also makes #1164 easier to solve.

@laurmaedje merged commit e2bf232 into typst:main on May 30, 2023
@laurmaedje
Member

Thanks!

Development

Successfully merging this pull request may close these issues.

Language-aware line breakpoints