Closed
Labels
is-robustness-issue: From a users perspective, this is about robustness
workflow-merge: From a users perspective, merging is the affected feature/workflow
Description
While trying to merge PDFs, certain PDFs throw this exception:
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/generic/_link.py", line 83, in extract_links
    old_links = [_build_link(link, old_page) for link in old_page.get("/Annots", [])]
TypeError: 'NullObject' object is not iterable
I believe this may be the same issue as #3110. This comment, #3110 (comment), has some notes about a potential code fix and about another warning that comes up. It also suggests testing a merge of this problematic document with other documents that contain annotations, to ensure they are combined cleanly without purging the annotations.
I have that potential fix on a branch here: main...tigger0jk:pypdf:main. As noted, though, it is not thoroughly tested and may be lacking other changes that should be made.
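To illustrate why the `[]` default in `old_page.get("/Annots", [])` does not help here, below is a minimal sketch that does not use pypdf at all. `FakeNull`, `annots_unguarded`, and `annots_guarded` are hypothetical names invented for this illustration; `FakeNull` only stands in for pypdf's `NullObject`, and the guard is an assumption about what a fix could look like, not necessarily what the branch above does.

```python
class FakeNull:
    """Hypothetical stand-in for pypdf's NullObject: a non-iterable placeholder."""


def annots_unguarded(page: dict) -> list:
    # Same shape as the failing line: the [] default is only used when the
    # key is absent, not when the key is present and maps to a null object.
    return [annot for annot in page.get("/Annots", [])]


def annots_guarded(page: dict) -> list:
    # Hypothetical guard: treat a null placeholder like a missing key.
    annots = page.get("/Annots", [])
    if isinstance(annots, FakeNull):
        return []
    return list(annots)


# A page whose /Annots entry exists but is null, as in the problematic PDF.
page = {"/Annots": FakeNull()}

try:
    annots_unguarded(page)
    raised = False
except TypeError:
    # 'FakeNull' object is not iterable
    raised = True
```

With the guard in place, a null `/Annots` entry behaves like a page without annotations, while pages that do carry a list of annotations are passed through unchanged.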
Environment
$ uv run python -m platform
macOS-14.6.1-arm64-arm-64bit
$ uv run python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==6.7.2, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=none
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf import PdfReader, PdfWriter
from io import BytesIO

with open('sample.pdf', 'rb') as f:
    content = f.read()

pdf = PdfReader(BytesIO(content))
merger = PdfWriter()
merger.append(pdf)
Traceback
This is the complete traceback I see:
$ uv run python pypdf_issue.py
Traceback (most recent call last):
  File "/Users/pfay/code/misc/pypdf_issue/pypdf_issue.py", line 9, in <module>
    merger.append(pdf)
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 2615, in append
    self.merge(
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 2696, in merge
    srcpages[pg.indirect_reference.idnum] = self.add_page(
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 611, in add_page
    return self._add_page(page, len(self.flattened_pages), excluded_keys)
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/_writer.py", line 543, in _add_page
    self._unresolved_links.extend(extract_links(page, page_org))
  File "/Users/pfay/code/misc/pypdf_issue/.venv/lib/python3.10/site-packages/pypdf/generic/_link.py", line 83, in extract_links
    old_links = [_build_link(link, old_page) for link in old_page.get("/Annots", [])]
TypeError: 'NullObject' object is not iterable