NullObject in old_page "/Annots" prevents appending pages #3656

@tigger0jk

Description

While trying to merge PDFs, certain PDFs throw this exception:

  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/generic/_link.py", line 83, in extract_links
    old_links = [_build_link(link, old_page) for link in old_page.get("/Annots", [])]
TypeError: 'NullObject' object is not iterable

I believe this may be the same issue as #3110. This comment, #3110 (comment), has notes on a potential code fix and on another warning that comes up. It also suggests testing a merge of this problematic document with other documents that contain annotations, to ensure they are combined cleanly without purging the annotations.

I have that potential fix on a branch here: main...tigger0jk:pypdf:main. As noted, it is not thoroughly tested and may be missing other changes that should be made.
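To illustrate why the `[]` default in `old_page.get("/Annots", [])` does not help here, below is a minimal sketch that does not import pypdf. The `NullObject` class is a hypothetical stand-in for pypdf's class of the same name, the pages are plain dicts, and `iter_annots` is an illustrative helper, not the code on the branch above or in pypdf itself. The point is only that a dict default applies when the key is missing, not when the key is present with a null value.

```python
class NullObject:
    """Hypothetical stand-in for pypdf's NullObject (the PDF `null` value)."""


def iter_annots(page):
    """Return the /Annots entries of a page as a list.

    A missing key and an explicit null value are both treated as "no annotations".
    """
    annots = page.get("/Annots")
    if annots is None or isinstance(annots, NullObject):
        return []
    return list(annots)


# Three cases: key present but null, key missing, key holding real entries.
page_with_null = {"/Annots": NullObject()}
page_without = {}
page_with_links = {"/Annots": ["link-1", "link-2"]}

# The pattern from the traceback: the [] default is never used because the
# key exists, so iteration is attempted on the null value itself.
try:
    [link for link in page_with_null.get("/Annots", [])]
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
print(iter_annots(page_with_null), iter_annots(page_without), iter_annots(page_with_links))
```

A real fix would also need to account for how pypdf resolves indirect references and for the other warning mentioned in #3110, which this sketch does not attempt to cover.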

Environment

$ uv run python -m platform
macOS-14.6.1-arm64-arm-64bit

$ uv run python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==6.7.2, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=none

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader, PdfWriter
from io import BytesIO

with open('sample.pdf', 'rb') as f:
    content = f.read()

pdf = PdfReader(BytesIO(content))
merger = PdfWriter()
merger.append(pdf)

sample.pdf

Traceback

This is the complete traceback I see:

$ uv run python pypdf_issue.py
Traceback (most recent call last):
  File "/Users/pfay/code/misc/pypdf_issue/pypdf_issue.py", line 9, in <module>
    merger.append(pdf)
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 2615, in append
    self.merge(
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 2696, in merge
    srcpages[pg.indirect_reference.idnum] = self.add_page(
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 611, in add_page
    return self._add_page(page, len(self.flattened_pages), excluded_keys)
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 543, in _add_page
    self._unresolved_links.extend(extract_links(page, page_org))
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/generic/_link.py", line 83, in extract_links
    old_links = [_build_link(link, old_page) for link in old_page.get("/Annots", [])]
TypeError: 'NullObject' object is not iterable

Labels

is-robustness-issue (From a user's perspective, this is about robustness)
workflow-merge (From a user's perspective, merging is the affected feature/workflow)
