Closed
Labels
Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
PdfMerger: The PdfMerger component is affected
is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
Description
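A PDF exported from Google Sheets fails to merge with PdfMerger when import_bookmarks is True. With import_bookmarks set to False, the merge works.
It seems that the stream is not in the correct state for reading an object header from the PDF: the ValueError in the traceback reports b'F-1.4', which looks like a fragment of the %PDF-1.4 file header, not an object number.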
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
Linux-5.17.12-200.fc35.x86_64-x86_64-with-glibc2.34
$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.4.0
Code + PDF
This is a minimal, complete example that shows the issue:
#!/usr/bin/env python
# vi: et sw=4 fileencoding=utf-8
from PyPDF2 import PdfReader, PdfMerger
import sys
out_pdf = PdfMerger()
print("This is OK")
out_pdf.append(PdfReader(sys.argv[1]), import_bookmarks=False)
print("This crashes")
out_pdf.append(PdfReader(sys.argv[1]), import_bookmarks=True)
with open(sys.argv[2], 'wb') as out_file:
    out_pdf.write(out_file)
Sample PDF file:
Traceback
This is the complete Traceback I see:
Traceback (most recent call last):
  File "/home/hate/git/PyPDF2/sample-files/003-pdflatex-image/bug_report.py", line 18, in <module>
    out_pdf.append(PdfReader(sys.argv[1]), import_bookmarks=True)
  File "/home/hate/git/PyPDF2/PyPDF2/_merger.py", line 252, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "/home/hate/git/PyPDF2/PyPDF2/_merger.py", line 152, in merge
    outline = reader.outlines
  File "/home/hate/git/PyPDF2/PyPDF2/_reader.py", line 665, in outlines
    return self._get_outlines()
  File "/home/hate/git/PyPDF2/PyPDF2/_reader.py", line 677, in _get_outlines
    lines = cast(DictionaryObject, catalog[CO.OUTLINES])
  File "/home/hate/git/PyPDF2/PyPDF2/generic.py", line 666, in __getitem__
    return dict.__getitem__(self, key).get_object()
  File "/home/hate/git/PyPDF2/PyPDF2/generic.py", line 237, in get_object
    obj = self.pdf.get_object(self)
  File "/home/hate/git/PyPDF2/PyPDF2/_reader.py", line 1051, in get_object
    idnum, generation = self.read_object_header(self.stream)
  File "/home/hate/git/PyPDF2/PyPDF2/_reader.py", line 1133, in read_object_header
    return int(idnum), int(generation)
ValueError: invalid literal for int() with base 10: b'F-1.4'
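To illustrate how this kind of ValueError can arise, here is a minimal sketch that does not use PyPDF2 at all. The function below is a simplified, hypothetical stand-in for an object-header parser (it is not the library's implementation), and the byte offsets are chosen only for the demonstration. The offset actually involved in this bug is not known from the traceback.

```python
import io


def read_object_header(stream):
    """Simplified stand-in (not PyPDF2's code): parse b'<idnum> <generation> obj'
    starting at the stream's current position."""
    tokens = stream.read(32).split()
    idnum, generation = tokens[0], tokens[1]
    return int(idnum), int(generation)


# A tiny, incomplete PDF-like byte string, used only for this demonstration.
pdf = io.BytesIO(b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n")

# Positioned at the start of the object, the header parses cleanly.
pdf.seek(9)
print(read_object_header(pdf))

# Positioned inside the file header instead, the first token is b'F-1.4'
# and int() raises the same kind of ValueError seen in the traceback.
pdf.seek(3)
try:
    read_object_header(pdf)
except ValueError as exc:
    print(exc)
```

If the real failure follows the same pattern, the reader is looking up the outline object at an offset that points into the file header, which would be consistent with the stream or its offsets being in an unexpected state.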