Skip to content

MAINT: Further refactoring of the Font and FontDescriptor classes and Core Font Metrics#3606

Merged
stefan6419846 merged 25 commits intopy-pdf:mainfrom
PJBrs:fontwork
Jan 29, 2026
Merged

MAINT: Further refactoring of the Font and FontDescriptor classes and Core Font Metrics#3606
stefan6419846 merged 25 commits intopy-pdf:mainfrom
PJBrs:fontwork

Conversation

@PJBrs
Copy link
Contributor

@PJBrs PJBrs commented Jan 19, 2026

This PR is one more refactoring of the Font and FontDescriptor classes and Core Font Metrics, after previously having converged on one shared Font class. The current PR:

  1. Removes character_widths from the FontDescriptor class, putting all associated code in the Font class. Conceptually, this is also where character widths belong; the PDF specification does not include character widths in the Font Descriptor PDF object either.
  2. Reorganises the Core Font Metrics in a NamedTuple of (FontDescriptor, character_widths)
  3. Moves all code that instantiates a Font class in the Font.from_font_resource method.
  4. Better deals with Type3 fonts that do not specify a Font Descriptor

There is a limited net gain in lines of code, but this only pertains to the automatically generated core font metrics. The file pydpf/_font.py has shrunk by 33 lines and is now easier to read.

Perhaps I should have done this in the previous PR. I had hoped that the diff would be easier to read, but unfortunately it is not. Still, a lot of the diffstat consists of moving four methods - _parse_font_descriptor, _collect_tt_t1_character_widths, _collect_cid_character_widths and _add_default_width - from the FontDescriptor class to the Font class, with hardly any modification, so the main changes for review would be:

  1. The new NamedTuple for CoreFontMetrics and separating out the character_widths from the FontDescriptor
  2. The integrated Font.from_font_resource method

EDIT
With regard to the necessity of this PR - I think that having a Font class that can conceptually represent a Font resource, and having a FontDescriptor class that conceptually represents a Font Descriptor resource, help converting a Font class to a new PDF font resource. This I presume is a prerequisite for fixing the bug where arabic text is not displayed correctly when creating a new appearance stream.

@PJBrs PJBrs force-pushed the fontwork branch 2 times, most recently from db9d49c to dc0eaf0 Compare January 19, 2026 19:29
@codecov
Copy link

codecov bot commented Jan 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.39%. Comparing base (8f1857d) to head (6042aa8).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3606   +/-   ##
=======================================
  Coverage   97.38%   97.39%           
=======================================
  Files          55       55           
  Lines        9845     9851    +6     
  Branches     1799     1800    +1     
=======================================
+ Hits         9588     9594    +6     
  Misses        151      151           
  Partials      106      106           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@PJBrs
Copy link
Contributor Author

PJBrs commented Jan 22, 2026

I just noticed #3602 , which looks really nice and useful. It conflicts a little bit with this one, but it would be very easy for me to rebase anyway. So let's wait until #3602 is merged. I'll mark this a draft in the meantime.

@PJBrs PJBrs marked this pull request as draft January 22, 2026 08:43
@stefan6419846
Copy link
Collaborator

In my opinion, delaying a possibly ready PR just to take care of another PR which needs further changes usually is not necessary.

@PJBrs
Copy link
Contributor Author

PJBrs commented Jan 25, 2026

In my opinion, delaying a possibly ready PR just to take care of another PR which needs further changes usually is not necessary.

Hmmm... Well, I noticed your review, and it seemed to me that there were only three very minor changes that needed to happen before it could be merged... I'd almost fetch those patches myself and fix them, but that does not sound very courteous to me.

In any case, I can rebase this and push, no problem.

@PJBrs PJBrs marked this pull request as ready for review January 25, 2026 14:56
@PJBrs
Copy link
Contributor Author

PJBrs commented Jan 25, 2026

Rebased. #3602 does not apply cleanly to my branch, because I changed the imports and I removed character_widths from the FontDescriptor class. Other than that, though, no changes ought to be necessary.

@PJBrs
Copy link
Contributor Author

PJBrs commented Jan 27, 2026

Rebased and rewrote CoreFontMetrics as a dataclass.

Should I add one more patch that takes into account FixedPitch fonts when setting default width based on space width? I already implemented this in the core font metrics.

@PJBrs PJBrs force-pushed the fontwork branch 2 times, most recently from a5fdfe7 to 39792ea Compare January 27, 2026 18:40
@PJBrs
Copy link
Contributor Author

PJBrs commented Jan 27, 2026

I added the font flags to constants.py as class FontFlags, and then I used that in both aft_to_dataclass.py and in _font.py.

Also, while checking that fixed fonts actually result in default width equalling space width, I noticed that some fonts set default width to 0. I added a patch to override that.

Finally, there's one test that sets fixed pitch and has space width that differs from non-zero default width, also both set by the font itself. I guess there are limits to what we can accommodate! Other than that, I hope that the above is fine.

EDIT: This is the test:
tests/test_cmap.py::test_text_extraction_of_specific_pages[iss1379]
It's got two character widths: {' ': 260, 'default': 533}, and font flags are 5 (i.e., Fixed Pitch and Script). Some files just don't make sense, haha!

PJBrs added 13 commits January 27, 2026 20:53
Change this script to produce core font metrics in terms of
its own class, consisting of a font descriptor and character
widths.
Previously, this resource collected FontDescriptor classes
for the afm files of the 14 Adobe core fonts. Now it collects
Font classes.
Previously, default width was set in the FontDescriptor class.
This patch moves that step to the Font class.
Some Type3 fonts do not have a Font Descriptor object, but they
do define FontBBox. This patch makes sure that we also define a
font descriptor for such fonts.
PJBrs added 6 commits January 27, 2026 20:53
For a font's default character width, if not defined in the font
descriptor, we use the width of two spaces as a default. However,
this does not work for fixed width fonts such as courier. This
patch reads the font flags to see if a font is FixedPitch (bit 1
of the font flags) and sets default width accordingly.

This patch applies the same logic to space width, when space width
is not defined or set to 0.
While testing setting default width and space width for
fixed pitch fonts, I accidentally discovered that some
fonts actually set a default width of 0 in their font
descriptor pdf resource object. This cannot work for
either text extraction or making an appearance stream.
Override it instead.
@stefan6419846 stefan6419846 changed the title MAINT: Final refactoring of the Font and FontDescriptor classes and Core Font Metrics MAINT: Further refactoring of the Font and FontDescriptor classes and Core Font Metrics Jan 29, 2026
Copy link
Collaborator

@stefan6419846 stefan6419846 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@stefan6419846 stefan6419846 merged commit f6f8427 into py-pdf:main Jan 29, 2026
19 checks passed
@PJBrs
Copy link
Contributor Author

PJBrs commented Jan 29, 2026

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants