Automatically add spacing between CJK and Latin characters #2334

Merged: laurmaedje merged 10 commits into typst:main from peng1999:cjk-latin-spacing on Oct 17, 2023
@peng1999 peng1999 commented Oct 9, 2023

Related: #276

This feature follows the Requirements for Chinese Text Layout: it adds 1/4 em of spacing between a Han character and an adjacent Western character.

A cjk-latin-spacing option is added so that users can disable this behavior. In the future, this option could accept a dimension to customize the spacing.

@laurmaedje
Member

Currently, there is space processing in the preparation and in the line construction. Do both essentially do the same thing, but we need to redo it for the line because of reshaping?

@peng1999
Contributor Author

> Do both essentially do the same thing, but we need to redo it for the line because of reshaping?

No, they are doing different things. The diff is a little hard to reason about because it mixes some adjustments that we already have with the newly added ones.

Currently we have the following CJK-related adjustments:

  • a. The shrinkability of punctuation marks is added in preparation.
  • b. The width of two consecutive punctuation marks is adjusted in preparation.
  • c. Punctuation at the beginning and end of a line needs additional adjustment. We do that in line construction.

In this PR, two more adjustments are added:

  • d. Spacing is added between adjacent CJK and Latin characters in preparation.
  • e. But we do not want to add spacing at the beginning or end of a line, and we do not know whether that is the case until line construction. So we undo the above adjustment for the line end in line construction.

Note that adjustments c. and e. happen in the same place in the par::line function. I added some comments for adjustment c., though it was actually added by a previous PR, #954.

@laurmaedje
Member

Doesn't that mean that all adjustments that are only done in preparation are potentially lost if we need to reshape a segment containing such adjustments?

@peng1999
Contributor Author

peng1999 commented Oct 10, 2023

> Doesn't that mean that all adjustments that are only done in preparation are potentially lost if we need to reshape a segment containing such adjustments?

No. There are no ligatures in CJK scripts, so AFAIK it's always safe to break around a CJK character. The glyphs will therefore always be reused, and no adjustment will be lost.

* "CJK" has multiple meanings. Here (and in Typst), I am referring to the Han, Hiragana, and Katakana scripts, not Hangul.

@laurmaedje laurmaedje merged commit e4d9db8 into typst:main Oct 17, 2023
@laurmaedje
Member

Thank you!
