Configurable character protrusion for both margins #7782

Open
sicikh wants to merge 4 commits into typst:main from sicikh:more-overhang

@sicikh
Contributor

sicikh commented Jan 30, 2026

Resolves #261, resolves #7231, and closes #6582.

This PR changes the behaviour of the text.overhang property so that:

  1. Both the left and the right margin can hold overhanging characters, in LTR and RTL text alike.

  2. Users can now specify where characters overhang:

    • everywhere or nowhere (true or false),
    • by side (left or right),
    • by direction (start or end),
    • per character (keys are currently limited to a single character, but this restriction is to be lifted later in this PR).

  3. The old values overhang: false and overhang: true remain, but true changes its meaning from "enable default overhang punctuation only for the end margin" to "enable default protrusions for both margins".

    The default value changes from true to end, which behaves like the previous true.

  4. The most beneficial result is that overhanging punctuation is now considered while a paragraph is being laid out, which leads to better line breaks and hyphenation. Documents that use the default values (and continue to do so) benefit from this PR without any changes on their side.
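
A minimal migration sketch, assuming the API lands exactly as described in the list above (the `end` value only exists with this PR): documents that rely on the default need no change, while documents that set `overhang: true` explicitly and want to keep end-margin-only overhang would switch to `end`.

```typst
// Before this PR, this meant "overhang punctuation on the end margin":
// #set text(overhang: true)

// With this PR, `true` would enable protrusion on both margins, so the
// previous behaviour is spelled with the new default value instead:
#set text(overhang: end)
```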

For the results see below:

image
Code
#set page(paper: "a4")

// Wiki suggested values + Hàn Thế Thành’s original settings
//
// - https://en.wikipedia.org/wiki/Optical_margin_alignment#
// - https://ftp.snt.utwente.nl/pub/software/tex/macros/latex/contrib/microtype/microtype-code.pdf, p. 155
#let protrusion-table = (
  "A": (20%, 20%),
  "F": (0%, 5%),
  "J": (5%, 0%),
  "K": (0%, 5%),
  "L": (0%, 5%),
  "C": (10%, 0%),
  "O": (10%, 10%),
  "T": (20%, 20%),
  "V": (20%, 20%),
  "W": (20%, 20%),
  "X": (5%, 5%),
  "Y": (20%, 20%),
  "k": (0%, 5%),
  "r": (0%, 5%),
  "t": (0%, 10%),
  "v": (5%, 5%),
  "w": (5%, 5%),
  "x": (5%, 5%),
  "y": (10%, 10%),
  "c": (10%, 10%),
  "o": (10%, 10%),
  ".": (0%, 100%),
  ",": (0%, 100%),
  ":": (0%, 100%),
  ";": (0%, 100%),
  "!": (0%, 20%),
  "?": (0%, 20%),
  "(": (20%, 0%),
  ")": (0%, 20%),
  "-": (0%, 75%),
  "\u{ad}": (0%, 75%),
  "": (0%, 50%),
  "": (0%, 25%),
  "": (100%, 0%),
  "": (0%, 100%),
)
#let italic-protrusion-table = (
  "p": (20%, 0%),
)
#let use-another-table-for-emph(body) = {
  show emph: set text(overhang: (map: italic-protrusion-table, default: false))
  body
} 

#let example(name) = [
  #place(top, float: true, scope: "parent", strong(name))
  
Shortly after this, when the sexton
came to pay them a visit, the father
broke out to him, and told him what
a bad hand his youngest son was
at everything: he knew nothing and
learned nothing. “Only think! when
I asked him how he purposed gaining a livelihood, he actually asked
to be taught to shudder.” “If that’s
all he wants,” said the sexton, “I can
teach him that; just you send him to
me, I’ll soon polish him up.” The father was quite pleased with the proposal, because he thought: “It will
be a good discipline for the youth.”
And so the sexton took him into his
house, and his duty was to toll the
bell. After a few days he woke him
at midnight, and bade him rise and
climb into the tower and toll. “Now,
my friend, I’ll teach you to shudder,”
thought he. He stole forth secretly
in front, and when the youth was
up above, and had turned round to
grasp the bell-rope, he saw, standing _opposite_ the hole of the belfry,
a white figure. “Who’s there?” he
called out, but the figure gave no answer, and neither stirred nor moved.
“Answer,” cried the youth, “or begone; you have no business here
at this hour of the night.” But the
sexton remained motionless, so that
the youth might think that it was a
ghost. The youth called out the second time: “What do you want here?
Speak if you are an honest fellow, or
]

#set page(width: 500pt, height: 21em, margin: 15pt)
#set par(justify: true)
#set text(size: 0.8em)

#grid(
  columns: (1fr, 1fr, 1fr, 1fr),
  gutter: 10pt,
  {
    set text(overhang: false)
    example[No protruding]
  },
  {
    set text(overhang: (map: protrusion-table, default: false))
    show: use-another-table-for-emph
    example[Character protruding]
  },
  {
    set text(overhang: false)
    set par(justification-limits: (tracking: (max: 0.02em, min: -0.02em)))
    example[Character level justification]
  },
  {
    set text(overhang: (map: protrusion-table, default: false))
    set par(justification-limits: (tracking: (max: 0.02em, min: -0.02em)))
    show: use-another-table-for-emph
    example[Protruding + justification]
  }
)

API

// Disable overhang completely.
#set text(overhang: false)

// Enable overhang for both margins with default protrusions.
#set text(overhang: true)

// Enable overhang only for one margin:
#set text(overhang: left)
#set text(overhang: right)
#set text(overhang: start) // will be left margin for LTR text
#set text(overhang: end) // will be right margin for LTR text, also this is the new default value

// Enable overhang _per glyph_.

#let protrusion-table = (
  // (left, right) protrusions as `ratios-to-width`% + `relative-to-font`em.
  "«": (100%, 100%),
  "»": (100%, 100%),
  "": (100%, 100%),
  "": (100%, 100%),
  "A": (10%, 0%),
  "C": (1% + 0.001em, 0%),
  // If we would like to customize only one side,
  // and rely on default algorithm (more on that later) for another,
  // `auto` value is used.
  "-": (100%, auto),
)

// Note that such a protrusion table can be shipped by a third-party package
// for a specific font and style (this is what the LaTeX package "microtype" does).

// `text.overhang` can also receive a dictionary with two optional entries:
// `map` - the table from characters to protrusions for both sides,
// `default` - the behaviour for characters that are not in the table.
//             Accepts the same values as `overhang` itself (false, true, left, right, ...).
//             If omitted, it is `false`
//             (protrusions are read only from the specified table).
#set text(overhang: (map: protrusion-table, default: right))

Rationale

As written in #7231 (comment), the main problem for this functionality is the API. #7231 (comment) proposes some kind of automatic calculation of the amount of protrusion per character.

The main argument against removing the ability to customize protrusions is that such an algorithm, if it is implemented in the future, may yield suboptimal results. In some circumstances users may want to customize the behaviour per character or per margin (see #7231 (comment) and #6582).

Thus, the API should allow for such a default algorithm being implemented later, while also letting users adjust values directly.

Another option considered was to receive a function (char, side) => protrusion instead, but it has some technical limitations. Also, dictionaries are easier to manipulate from the user side than black-box functions (think of a possible "microtype" package for Typst that ships such protrusion tables).
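
To illustrate why dictionaries are convenient to manipulate, here is a small sketch in plain Typst scripting. The names `package-table` and `my-table` are hypothetical, and the final `set` rule is commented out because it assumes the API proposed in this PR.

```typst
// Hypothetical table, as a third-party package might ship it for one font.
#let package-table = (
  ".": (0%, 100%),
  "-": (0%, 75%),
)

// User overrides: adding dictionaries keeps all keys, and on a clash
// the right-hand operand wins.
#let my-table = package-table + (
  "-": (0%, 50%),
  "A": (10%, 0%),
)

#assert.eq(my-table.at("."), (0%, 100%))
#assert.eq(my-table.at("-"), (0%, 50%))
#assert.eq(my-table.at("A"), (10%, 0%))

// With the API proposed in this PR, the merged table would be passed on:
// #set text(overhang: (map: my-table, default: end))
```

A black-box function could not be inspected or extended in this way without wrapping it in another function.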

TODO

@terefang

terefang commented Jan 31, 2026

i would propose it differently:

let's keep an option that makes simple things simple but still allows complex behavior:

#set text(overhang: bool | string | dict)

for "bool" case

  • false – simply disables overhang behavior and should be the default
  • true – enables overhang behavior with whatever internal default can be agreed upon

this is actually a weak case and could be dropped entirely by using the case below.

for the string/enum case

simple "left", "right", "both", "default", and "none" overhang layout with whatever characters agreed as the default.

for the dict case

this follows the proposal of the OP but more refined:

#set text(overhang: (
  map: (
    "«": (100%, 100%),
    "»": (100%, 100%),
    "”": (100%, 100%),
    "“": (100%, 100%),
    "A": (10%, false),
    // customize protrusion only for the left margin
    "-": (100%, true),
  ),
  // regex match against characters.
  match: (
    "regex1": (100%, 100%),
    "regex2": (10%, false),
  ),
  ),
  // use built-in values for other characters 
  // (could be "left", "right", "both", "default", and "none") 
  rest: "right",
  // if omitted, results in `none`
))

what do you think ?

@sicikh
Contributor Author

sicikh commented Jan 31, 2026

Thanks for the suggestions! My thoughts on this:

  1. Why would we need a "default" value? I don't think that is the right path. It would be more correct to make one of the other values the default (in particular, right/"right" or true/"both").

  2. After removing "default" from your design, the string case only has "left", "right", "both" and "none". Initially I was also thinking about this, but then I arrived at the following logic:

    • Users may already have overhang: true or overhang: false in their code. Thus, the new overhang must accept bool.

    • If overhang accepts bool, what do these values mean? false, obviously, should just disable overhang completely (your "none"). Then true should mean "enable (full) overhang", which is your "both".

    • Then only "left" and "right" remain for the string case. Why not just use the alignment type, as is done in str.trim(at: start/end)?

    • As true changes its meaning from "enable (right) overhang" to "enable (full) overhang", the default value may become right.

  3. The proposal for the dict case is quite interesting; I also had some thoughts in this direction. But what use cases do you see for the regex matching?

  4. After a day, I think that individual protrusions should accept either auto (use the default algorithm/table, whichever it is) or ratios. Then there are no longer two ways to disable overhang for a character (0% and false); only 0% remains.

With this, the API for the dict case may look like the following:

#set text(overhang: (
  map: (
    "«": (100%, 100%),
    "»": (100%, 100%),
    "": (100%, 100%),
    "": (100%, 100%),
    // Enable overhang only for the left margin.
    "A": (10%, 0%),
    // Customize protrusion only for the left margin,
    // use default values for the right one.
    "-": (100%, auto),
  ),
  // Use built-in values for other characters on the right margin
  // (could be `left`, `right`, `true` or `false`).
  // If omitted, results in `false`.
  rest: right,
))

@laurmaedje
Member

I'm not yet ready to comment on this at large, but still wanted to add three points to the discussion:

  • We don't necessarily have to have true / false in the primary API. They could remain as deprecated aliases for something like none / auto.
  • Stringly-typed values should be avoided if better values exist (like left, right, none, or auto)
  • Rather than supporting some complex ad-hoc dictionary structure, we might want to just support passing a function that receives the char.

@eltos
Copy link
Copy Markdown
Contributor

eltos commented Jan 31, 2026

You could use the alignment type, i.e. left + right instead of "both"/true, and start/end/start + end as a text-direction adaptive setting.

@laurmaedje
Member

unfortunately, adding left + right does not work. you can only add alignments for different axes. the same thing came up for block.sticky where there was discussion of having sticky: (top, bottom) to have it sticky on both sides.
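
A short sketch of the restriction mentioned above, using only the existing alignment type (the name `corner` is just for illustration):

```typst
// Alignments on different axes combine into a 2D alignment.
#let corner = left + top
#assert.eq(corner.x, left)
#assert.eq(corner.y, top)

// Two alignments on the same axis cannot be added; uncommenting the
// next line makes compilation fail.
// #let both = left + right
```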

@sicikh
Contributor Author

sicikh commented Feb 1, 2026

  • We don't necessarily have to have true / false in the primary API. They could remain as deprecated aliases for something like none / auto.

I think that we should not deprecate bool value in the overhang option: overhang: false disables overhang, overhang: true enables overhang for left and right margins.

  • Rather than supporting some complex ad-hoc dictionary structure, we might want to just support passing a function that receives the char.

That is quite an interesting idea! The question now is what signature this function should have. I see two options:

  1. The function receives the character and the margin side as an alignment, and returns a ratio or auto:

    #let protrusion-table = (
      "«": (100%, 100%),
      "»": (100%, 100%),
      "": (100%, 100%),
      "": (100%, 100%),
      "A": (10%, 0%),
      "-": (100%, auto),
    )
    
    // text.overhang: bool | alignment | (str, alignment) => (auto | ratio)  
    #set text(overhang: (char, side) => 
      protrusion-table
        // enable default algorithm for the right margin, disable overhang for the left one 
        .at(char, default: (0%, auto))
        .at(if side == left { 0 } else { 1 })
    )

  2. The function receives only the character and returns an array of two protrusions (ratio or auto):

    // text.overhang: bool | alignment | str => [auto | ratio; 2]  
    #set text(overhang: char => 
      protrusion-table
        .at(char, default: (0%, auto))
    )

Currently, I'm leaning more towards the first option.
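
The lookup from the first option can be tried out as an ordinary function, without the proposed `text.overhang` API. This is only a sketch: the helper name `protrusion` and the shortened table are assumptions made for illustration.

```typst
// Shortened table in the same (left, right) format as above.
#let protrusion-table = (
  "«": (100%, 100%),
  "A": (10%, 0%),
  "-": (100%, auto),
)

// Returns the protrusion of `char` on `side` (`left` or `right`).
// Characters missing from the table fall back to `(0%, auto)`:
// no overhang on the left, default algorithm on the right.
#let protrusion(char, side) = {
  let pair = protrusion-table.at(char, default: (0%, auto))
  pair.at(if side == left { 0 } else { 1 })
}

#assert.eq(protrusion("A", left), 10%)
#assert.eq(protrusion("-", right), auto)
#assert.eq(protrusion("z", left), 0%)
#assert.eq(protrusion("z", right), auto)
```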

sicikh force-pushed the more-overhang branch 2 times, most recently from 5ea7ed5 to 02cb067 (February 1, 2026 18:17)
@sicikh
Contributor Author

sicikh commented Feb 1, 2026

The situation is as follows.

  1. overhang now receives a function instead of a dictionary, which is the better approach. Currently, its signature is (str, alignment) => (auto | ratio), but it can easily be changed. The alignment here is either left or right.

  2. The previous changes clearly broke the tests for RTL languages. The previous behaviour was to overhang punctuation on the end margin, not only on the right one.

    So overhang cannot accept just left or right, because that cannot replicate the previous behaviour: it must also accept start and end. Now the default value is overhang: end and all tests pass.

    I'm wondering whether we really need the left and right values under these circumstances...
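
The difference between the logical and the physical values can be sketched with the existing direction type. The helper `physical-side` is hypothetical and not part of the PR; it only shows why `end` covers both LTR and RTL text while `right` does not.

```typst
// Maps a logical side (`start`/`end`) to a physical one for the given
// text direction; physical sides (`left`/`right`) pass through unchanged.
#let physical-side(side, dir) = {
  if side == start { dir.start() }
  else if side == end { dir.end() }
  else { side }
}

#assert.eq(physical-side(end, ltr), right)
#assert.eq(physical-side(end, rtl), left)
#assert.eq(physical-side(start, rtl), right)
#assert.eq(physical-side(left, rtl), left)
```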

@sicikh
Contributor Author

sicikh commented Feb 1, 2026

As I read in Hàn Thế Thành's dissertation, this PR implements so-called "level 1 character protruding" (pp. 41-43), where "margin kerning" is applied after the lines have already been laid out. "Level 2", on the other hand, incorporates margin kerning into the layout algorithm.

Since very good results were obtained in #6161 when character-level justification was included in the layout algorithm without many changes, I think we can do the same here. I will study this in more detail.

See the comparison between level 1 and level 2 character protruding from Hàn Thế Thành's dissertation (p. 47) below:

image

@sicikh
Contributor Author

sicikh commented Feb 2, 2026

I think the width that overhanging characters add to a line is now taken into account by the layout algorithm. The results show exactly this, although the improvement is smaller than the one shown in the dissertation. I'll investigate it further. But the result is great nonetheless.

image

Secondly, the idea of passing a function to the overhang parameter was interesting, but it has one significant technical limitation: calling a function requires a mutable reference to the Engine, while the layout algorithm only has a shared one. So I reverted the changes and kept the "ad-hoc" dictionary with two entries: the table itself (the "map" field) and the default behaviour (the "default" field).

I also changed the protrusion value from Ratio to Rel. Donald Knuth commented on Hàn Thế Thành's dissertation that a more correct option would be to specify protrusion not as a ratio of the glyph width, but in em. Fortunately, we can support both options.
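
As a small illustration of what a relative value carries, here is a sketch using only the existing `relative` type; the concrete numbers are made up and are not taken from any protrusion table.

```typst
// A ratio of the glyph width plus a font-relative em amount.
#let amount = 10% + 0.05em

#assert.eq(type(amount), relative)
#assert.eq(amount.ratio, 10%)
#assert.eq(amount.length, 0.05em)

// A pure ratio or a pure em length are just special cases:
#let only-ratio = 10% + 0em
#let only-em = 0% + 0.05em
#assert.eq(only-ratio.length, 0em)
#assert.eq(only-em.ratio, 0%)
```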

My next steps will be to make sure that the implementation is correct (currently I have some doubts), maybe make overhang foldable, and improve the error messages.

…ert `overhang` value from function to dictionary
@YDX-2147483647
Contributor

YDX-2147483647 commented Feb 2, 2026

Hi! I see that text.overhang.map does not accept multi-character punctuation marks like —— (CJK two-em dash, two consecutive U+2014).

error: protrusion table keys must be single characters

Is that an essential limitation of the current design, or merely a superficial restriction of the interface (or data types)?
If the latter, I hope the restriction can be removed. It would provide a workaround for #6735.


Besides, I suggest updating the PR description or at least adding a note. The current version is outdated and misleading.

@sicikh
Contributor Author

sicikh commented Feb 2, 2026

Is that an essential limitation of the current design, or merely a superficial restriction of the interface (or data types)? If the latter, I wish the restriction can be removed. It will provide a workaround for #6735.

Hi! Yes, this is a limitation of the current interface, not a technical one. I just haven't had time to think about how to describe protrusions for several characters while still disallowing, for example, "abcdef": (9999%, 9999%).
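
For illustration only, one conceivable way to tell such keys apart in Typst scripting is to count grapheme clusters. This is a hypothetical check, not the validation the PR actually performs, and `is-single-char` is an invented name.

```typst
// True if `key` consists of exactly one grapheme cluster.
#let is-single-char(key) = key.clusters().len() == 1

#assert(is-single-char("«"))
// Two consecutive U+2014, the CJK two-em dash mentioned above.
#assert(not is-single-char("\u{2014}\u{2014}"))
#assert(not is-single-char("abcdef"))
```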

Besides, I suggest updating the PR description or at least adding a note. The current version is outdated and misleading.

Also thought of this, will change it in a minute. UPD: done.

sicikh changed the title from "Make overhang configurable, add overhanging to the left margin" to "Configurable character protrusion for both margins" on Feb 2, 2026
laurmaedje added the waiting-on-review, layout, text, and interface labels on Feb 5, 2026

Labels

  • interface: PRs that add to or change Typst's user-facing interface as opposed to internals or docs changes.
  • layout: Related to the layout category, which is about composing, positioning, etc.
  • text: Related to the text category, which is all about text handling, shaping, etc.
  • waiting-on-review: This PR is waiting to be reviewed.

Development

Successfully merging this pull request may close these issues.

  • Support hanging-punctuation in the left margin
  • Chinese punctuation overhang
  • More configurable text overhang

5 participants