"/Pages" might be undefined #2516

@stefan6419846

Description

I am trying to get the pages of a PDF file whose root object has no /Pages entry. Instead, the root object is itself a /Pages node:

self.root_object == {'/Type': '/Pages', '/Kids': [IndirectObject(3, 0, 140441093721392)], '/Count': 1}

and

catalog["/Kids"][0].get_object().get_data() == b'q 595.20 0 0 840.96 0.00 0.00 cm 1 g /Im1 Do Q\r'

pypdf does not seem to support this layout properly at the moment.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.14.21-150400.24.100-default-x86_64-with-glibc2.3.4

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==latest Git main, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=8.4.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader


reader = PdfReader('file.pdf')
for page in reader.pages:
    print(page)

I cannot share the PDF file here because the referenced image contains sensitive information, but it should be possible to generate a basic example PDF file from the values provided above.

Traceback

This is the complete traceback I see:

Traceback (most recent call last):
  File "run.py", line 5, in <module>
    for page in reader.pages:
  File "/home/stefan/temp/pypdf/pypdf/_page.py", line 2275, in __iter__
    for i in range(len(self)):
  File "/home/stefan/pypdf/pypdf/_page.py", line 2206, in __len__
    return self.length_function()
  File "/home/stefan/pypdf/pypdf/_reader.py", line 463, in _get_num_pages
    self._flatten()
  File "/home/stefan/pypdf/pypdf/_reader.py", line 1146, in _flatten
    pages = cast(DictionaryObject, catalog["/Pages"].get_object())
  File "/home/stefan/pypdf/pypdf/generic/_data_structures.py", line 385, in __getitem__
    return dict.__getitem__(self, key).get_object()
KeyError: '/Pages'
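
The traceback shows `_flatten` indexing `catalog["/Pages"]` unconditionally. One possible way to tolerate this file would be to fall back to the root object when it is itself a /Pages node. The sketch below illustrates that idea with plain dicts standing in for pypdf's DictionaryObject; `find_page_tree_root` is a hypothetical helper, not an existing pypdf function, and this is not necessarily how the library should fix it.

```python
def find_page_tree_root(root: dict) -> dict:
    """Return the page tree root for a document root dictionary.

    Hypothetical helper: plain dicts stand in for pypdf's DictionaryObject.
    """
    # Normal case: the root is a catalog with a /Pages entry.
    if "/Pages" in root:
        return root["/Pages"]
    # Malformed case from this report: the root is itself the /Pages node.
    if root.get("/Type") == "/Pages":
        return root
    raise ValueError("document root has no /Pages entry and is not a /Pages node")


# Well-formed catalog.
pages = {"/Type": "/Pages", "/Kids": [], "/Count": 0}
assert find_page_tree_root({"/Type": "/Catalog", "/Pages": pages}) is pages

# Root object as reported here (the kid is replaced by a placeholder string).
broken_root = {"/Type": "/Pages", "/Kids": ["<kid>"], "/Count": 1}
assert find_page_tree_root(broken_root) is broken_root
```

Even with such a fallback, the kid in this file is a content stream rather than a page dictionary, so further handling would probably be needed before a usable page comes out.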

Metadata

Labels

PdfReader: The PdfReader component is affected
key-error: Could be a bug, but also a robustness issue