AttributeError: 'NoneType' object has no attribute 'get_object' #1689

@Brich40

Hello,

I'm using pypdf to get pages from a PDF file, which works fine for most files.
For one specific file, however, I get the exception below:

  File "/home/obr01/python-venv/opencapture/lib/python3.10/site-packages/pypdf/_page.py", line 2342, in __getitem__
    len_self = len(self)
  File "/home/obr01/python-venv/opencapture/lib/python3.10/site-packages/pypdf/_page.py", line 2333, in __len__
    return self.length_function()
  File "/home/obr01/python-venv/opencapture/lib/python3.10/site-packages/pypdf/_reader.py", line 452, in _get_num_pages
    self._flatten()
  File "/home/obr01/python-venv/opencapture/lib/python3.10/site-packages/pypdf/_reader.py", line 1185, in _flatten
    pages = catalog["/Pages"].get_object()  # type: ignore
AttributeError: 'NoneType' object has no attribute 'get_object'

Apparently this comes from the "/Pages" entry of the catalog ("/Root") in the reader's trailer, which resolves to None for this file:

import pypdf

pdf_reader = pypdf.PdfReader(file_path, strict=False)  # file_path: path to the affected PDF
print(pdf_reader.trailer['/Root']['/Pages'])

Output:

Object 2178 0 not defined.
None

Is there any way to handle this case?

Thanks,
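
One possible way to guard against this on the caller's side is to check the "/Pages" entry before touching `reader.pages`. The sketch below is only an illustration, not pypdf's own fix: `get_pages_root`, `safe_page_count`, and `count_pages` are hypothetical helper names, and it assumes (as in the output above) that a reference to an undefined object resolves to `None` when `strict=False`; other pypdf versions may behave differently.

```python
from typing import Any, Optional


def get_pages_root(reader: Any) -> Optional[Any]:
    """Return the resolved /Pages entry of the catalog, or None if unavailable.

    `reader` is assumed to behave like pypdf.PdfReader, i.e. it exposes a
    `trailer` mapping whose "/Root" entry is the document catalog.
    """
    try:
        catalog = reader.trailer["/Root"]
        return catalog["/Pages"]
    except (KeyError, TypeError):
        # Missing "/Root" or "/Pages" key, or "/Root" itself resolved to None.
        return None


def safe_page_count(reader: Any) -> Optional[int]:
    """Return the page count, or None when the page tree cannot be read."""
    if get_pages_root(reader) is None:
        # Case from this report: "/Pages" points at an undefined object.
        return None
    try:
        return len(reader.pages)
    except (AttributeError, KeyError, TypeError):
        # Defensive fallback for other damage deeper in the page tree.
        return None


def count_pages(path: str) -> Optional[int]:
    """Hypothetical convenience wrapper around the two helpers above."""
    import pypdf  # imported lazily so the helpers have no hard dependency

    reader = pypdf.PdfReader(path, strict=False)
    return safe_page_count(reader)
```

With this approach, a file like the one described here would yield `None` from `count_pages(file_path)` instead of raising `AttributeError`, so the caller can skip or log the file. It does not recover the pages themselves.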

Labels: is-robustness-issue (From a user's perspective, this is about robustness)