MAINT: Further refactoring of the Font and FontDescriptor classes and Core Font Metrics#3606
MAINT: Further refactoring of the Font and FontDescriptor classes and Core Font Metrics#3606stefan6419846 merged 25 commits intopy-pdf:mainfrom
Conversation
db9d49c to
dc0eaf0
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3606 +/- ##
=======================================
Coverage 97.38% 97.39%
=======================================
Files 55 55
Lines 9845 9851 +6
Branches 1799 1800 +1
=======================================
+ Hits 9588 9594 +6
Misses 151 151
Partials 106 106 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
In my opinion, delaying a possibly ready PR just to take care of another PR which needs further changes usually is not necessary. |
Hmmm... Well, I noticed your review, and it seemed to me that there were only three very minor changes that needed to happen before it could be merged... I'd almost fetch those patches myself and fix them, but that does not sound very courteous to me. In any case, I can rebase this and push, no problem. |
|
Rebased. #3602 does not apply cleanly to my branch, because I changed the imports and I removed character_widths from the FontDescriptor class. Other than that, though, no changes ought to be necessary. |
|
Rebased and rewrote CoreFontMetrics as a dataclass. Should I add one more patch that takes into account FixedPitch fonts when setting default width based on space width? I already implemented this in the core font metrics. |
a5fdfe7 to
39792ea
Compare
|
I added the font flags to constants.py as class FontFlags, and then I used that in both aft_to_dataclass.py and in _font.py. Also, while checking that fixed fonts actually result in default width equalling space width, I noticed that some fonts set default width to 0. I added a patch to override that. Finally, there's one test that sets fixed pitch and has space width that differs from non-zero default width, also both set by the font itself. I guess there are limits to what we can accommodate! Other than that, I hope that the above is fine. EDIT: This is the test: |
Change this script to produce core font metrics in terms of its own class, consisting of a font descriptor and character widths. Previously, this resource collected FontDescriptor classes for the afm files of the 14 Adobe core fonts. Now it collects Font classes.
Previously, default width was set in the FontDescriptor class. This patch moves that step to the Font class.
Some Type3 fonts do not have a Font Descriptor object, but they do define FontBBox. This patch makes sure that we also define a font descriptor for such fonts.
For a font's default character width, if not defined in the font descriptor, we use the width of two spaces as a default. However, this does not work for fixed width fonts such as courier. This patch reads the font flags to see if a font is FixedPitch (bit 1 of the font flags) and sets default width accordingly. This patch applies the same logic to space width, when space width is not defined or set to 0.
While testing setting default width and space width for fixed pitch fonts, I accidentally discovered that some fonts actually set a default width of 0 in their font descriptor pdf resource object. This cannot work for either text extraction or making an appearance stream. Override it instead.
|
Thank you! |
This PR is one more refactoring of the Font and FontDescriptor classes and Core Font Metrics, after previously having converged on one shared Font class. The current PR:
There is a limited net gain in lines of code, but this only pertains to the automatically generated core font metrics. The file pydpf/_font.py has shrunk by 33 lines and is now easier to read.
Perhaps I should have done this in the previous PR. I had hoped that the diff would be easier to read, but unfortunately it is not. Still, a lot of the diffstat consists of moving four methods - _parse_font_descriptor, _collect_tt_t1_character_widths, _collect_cid_character_widths and _add_default_width - from the FontDescriptor class to the Font class, with hardly any modification, so the main changes for review would be:
EDIT
With regard to the necessity of this PR - I think that having a Font class that can conceptually represent a Font resource, and having a FontDescriptor class that conceptually represents a Font Descriptor resource, help converting a Font class to a new PDF font resource. This I presume is a prerequisite for fixing the bug where arabic text is not displayed correctly when creating a new appearance stream.