fix issues with destination (#604) #821
pubpub-zz wants to merge 4 commits into py-pdf:main from pubpub-zz:iss604
#604
root cause: probably the document was produced by an extraction that did not extract the destination properly
changes:
* getDestinationPageNumber returns -1 for a NullObject
* with strict=False, a destination to the first page is returned to prevent the error (no change with strict=True)
note: a warning is generated
Test added with the sample file
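The strict / non-strict behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `build_destination`, `ReadError`, and the dictionary layout are hypothetical stand-ins, and only the set of fit types comes from the PDF specification.

```python
import warnings

# The eight destination fit types defined by the PDF specification.
KNOWN_FIT_TYPES = {
    "/XYZ", "/Fit", "/FitH", "/FitV", "/FitR", "/FitB", "/FitBH", "/FitBV",
}


class ReadError(Exception):
    """Hypothetical stand-in for PyPDF2.utils.PdfReadError."""


def build_destination(title, page, fit_type, strict, first_page=0):
    """Hypothetical helper: return a destination dict or apply the fallback."""
    if fit_type in KNOWN_FIT_TYPES:
        return {"title": title, "page": page, "type": fit_type}
    if strict:
        # Strict mode: behaviour is unchanged, the error is raised.
        raise ReadError("Unknown Destination Type: %r" % (fit_type,))
    # Non-strict mode: warn and point the destination at the first page.
    warnings.warn(
        "Unknown destination type %r for %r; using the first page instead"
        % (fit_type, title)
    )
    return {"title": title, "page": first_page, "type": "/Fit"}
```

With `strict=False`, a malformed entry yields a usable destination plus a warning; with `strict=True`, the caller still sees the exception.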
PyPDF2/pdf.py (outdated)

        :rtype: int
        """
        indirectRef = destination.page
        if type(indirectRef) is NullObject:
We might want isinstance (reasons). What do you think about it?
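To illustrate the difference the comment refers to, here is a small sketch. The classes below are stand-ins defined only for this example (they are not the PyPDF2 definitions), and `CustomNull` is a hypothetical subclass.

```python
# Stand-in classes for illustration only; NOT the PyPDF2 definitions.
class PdfObject:
    pass


class NullObject(PdfObject):
    pass


class CustomNull(NullObject):
    """Hypothetical subclass of NullObject."""


def is_null_exact(obj):
    # Exact type comparison: a subclass instance does not match.
    return type(obj) is NullObject


def is_null(obj):
    # isinstance also accepts instances of subclasses.
    return isinstance(obj, NullObject)


print(is_null_exact(CustomNull()))  # False
print(is_null(CustomNull()))        # True
```

Both checks agree for a plain `NullObject()`; they differ only when a subclass is involved, which is the usual argument for preferring `isinstance`.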
@pubpub-zz Looks good to me, except for the type <-> isinstance part. I made a lot of changes (applying the black formatter + splitting the
Codecov Report
@@            Coverage Diff             @@
##             main     #821      +/-   ##
==========================================
+ Coverage   75.35%   75.58%   +0.22%
==========================================
  Files          12       12
  Lines        3563     3571       +8
  Branches      822      824       +2
==========================================
+ Hits         2685     2699      +14
+ Misses        661      657       -4
+ Partials      217      215       -2
@MartinThoma, sure: I will propose a new PR
py-pdf#604
root cause: probably the document was produced by an extraction that did not extract the destination properly
changes:
* getDestinationPageNumber returns -1 for a NullObject
* with strict=False, a destination to the first page is returned to prevent the error (no change with strict=True)
note: a warning is generated
Test added with the sample file
(duplicate of py-pdf#821 to match refactoring)
If a destination is missing, getDestinationPageNumber now returns -1.
If `strict=False`, the first page is used as a fallback.

The code triggering the exception was

```python
from PyPDF2 import PdfFileReader

# https://github.com/mstamy2/PyPDF2/files/6045010/thyroid.pdf
with open("thyroid.pdf", "rb") as f:
    reader = PdfFileReader(f)
    bookmarks = reader.getOutlines()
    for b in bookmarks:
        print(reader.getDestinationPageNumber(b) + 1)  # page count starts from 0
```

The error message was:

    PyPDF2.utils.PdfReadError: Unknown Destination Type: 0

Closes #604
Closes #821
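Because a missing destination is now reported as -1, code that adds 1 to the result for display would show page 0. A caller may want to check for the sentinel first; a minimal sketch, where `display_page_number` is a hypothetical helper and not part of PyPDF2:

```python
def display_page_number(page_index):
    """Hypothetical helper: convert a 0-based page index to a 1-based number.

    Returns None when the index is negative, i.e. when the -1 sentinel
    signals that the destination has no resolvable page.
    """
    if page_index < 0:
        return None
    return page_index + 1


print(display_page_number(0))   # 1
print(display_page_number(-1))  # None
```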