PageObject._get_fonts() returns embedded as unembedded. #2192
Closed
Description
import pypdf
reader = pypdf.PdfReader(r'...')
reader.pages[3]._get_fonts()
This call reports an embedded font as unembedded, even though the page carries embedded font info like this:
{'/BaseFont': '/O9-PK748464-Identity-H',
'/DescendantFonts': # [IndirectObject(448, 0, 2596729619856)]
{'/BaseFont': '/O9-PK748464',
'/CIDSystemInfo': {'/Ordering': 'PKUO1',
'/Registry': 'Founder',
'/Supplement': 0},
'/DW': 480,
'/FontDescriptor': # IndirectObject(914, 0, 2596729619856)
{'/Ascent': 709,
'/CapHeight': 674,
'/Descent': -241,
'/Flags': 32,
'/FontBBox': [-115.218, -115.218, 345.65499999999997, 345.65499999999997],
'/FontFile3': {'/Subtype': '/CIDFontType0C', '/Filter': ['/FlateDecode']},
'/FontName': '/O9-PK748464',
'/ItalicAngle': 0,
'/StemV': 91,
'/Type': '/FontDescriptor'},
'/Subtype': '/CIDFontType0',
'/Type': '/Font'},
'/Encoding': '/Identity-H',
'/Subtype': '/Type0',
'/ToUnicode': IndirectObject(449, 0, 2596729619856),
'/Type': '/Font'}
This PDF is protected (copy and paste is disabled), and although it has a '/ToUnicode' entry, that entry is incorrect and incomplete. The font should therefore be considered embedded.
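To make the "should be considered embedded" claim concrete, here is a minimal sketch that checks a font dictionary for an embedded font program. It works on plain Python dicts shaped like the dump above, not on pypdf objects, and the helper name has_embedded_program is hypothetical. It assumes that a font counts as embedded when its font descriptor, or that of any descendant font, has a /FontFile, /FontFile2, or /FontFile3 entry.

```python
# Keys in a font descriptor that point to an embedded font program.
FONT_FILE_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")


def has_embedded_program(font: dict) -> bool:
    """Hypothetical helper: True if the font or a descendant has a font file."""
    descriptor = font.get("/FontDescriptor", {})
    if any(key in descriptor for key in FONT_FILE_KEYS):
        return True
    # Type0 fonts keep the descriptor on the descendant CIDFont.
    return any(
        has_embedded_program(child) for child in font.get("/DescendantFonts", [])
    )


# Trimmed-down copy of the font dictionary from this report.
type0_font = {
    "/BaseFont": "/O9-PK748464-Identity-H",
    "/Subtype": "/Type0",
    "/DescendantFonts": [
        {
            "/BaseFont": "/O9-PK748464",
            "/Subtype": "/CIDFontType0",
            "/FontDescriptor": {
                "/FontName": "/O9-PK748464",
                "/FontFile3": {"/Subtype": "/CIDFontType0C"},
            },
        }
    ],
}

print(has_embedded_program(type0_font))  # True
```

With real pypdf objects the indirect references would have to be resolved first; that step is left out here.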
However, the code at e51141d, unembedded = fonts - embedded, does not give the right result for this case.
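One plausible explanation, sketched below, is a name mismatch: the Type0 font's /BaseFont is '/O9-PK748464-Identity-H', while the descriptor that holds the font file is named '/O9-PK748464'. This is a simplified model and not pypdf's actual implementation. It assumes that all /BaseFont values go into fonts and that /FontName values go into embedded only when the same dictionary has a font file key; collect_names is a hypothetical helper.

```python
FONT_FILE_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")


def collect_names(obj, fonts: set, embedded: set) -> None:
    """Hypothetical walker: gather font names under the stated assumptions."""
    if isinstance(obj, dict):
        if "/BaseFont" in obj:
            fonts.add(obj["/BaseFont"])
        if "/FontName" in obj and any(key in obj for key in FONT_FILE_KEYS):
            embedded.add(obj["/FontName"])
        for value in obj.values():
            collect_names(value, fonts, embedded)
    elif isinstance(obj, list):
        for item in obj:
            collect_names(item, fonts, embedded)


# Trimmed-down copy of the font dictionary from this report.
type0_font = {
    "/BaseFont": "/O9-PK748464-Identity-H",
    "/Subtype": "/Type0",
    "/DescendantFonts": [
        {
            "/BaseFont": "/O9-PK748464",
            "/Subtype": "/CIDFontType0",
            "/FontDescriptor": {
                "/FontName": "/O9-PK748464",
                "/FontFile3": {"/Subtype": "/CIDFontType0C"},
            },
        }
    ],
}

fonts, embedded = set(), set()
collect_names(type0_font, fonts, embedded)
unembedded = fonts - embedded

print(sorted(embedded))    # ['/O9-PK748464']
print(sorted(unembedded))  # ['/O9-PK748464-Identity-H']
```

Under these assumptions the set difference leaves the Type0 name in unembedded, because it never matches the descendant's /FontName exactly, even though its font program is embedded.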
Environment
Windows-10-10.0.17134-SP0
pypdf==3.16.0, crypt_provider=('cryptography', '38.0.4'), PIL=9.4.0