Skip to content

fix issues with destination (#604)#821

Closed
pubpub-zz wants to merge 4 commits intopy-pdf:mainfrom
pubpub-zz:iss604
Closed

fix issues with destination (#604)#821
pubpub-zz wants to merge 4 commits intopy-pdf:mainfrom
pubpub-zz:iss604

Conversation

@pubpub-zz
Copy link
Copy Markdown
Collaborator

#604

root cause: probably extraction from a document not extracting properly destination

changes:

  • getDestinationPageNumber return -1 with NullObject
  • in case of Strict = False, return a destination to first page to prevent error (no change in case of Strict=True)
    note ; warning generated

Test added with the sample test

#604

root cause: probably extraction from a document not extracting properly destination

changes:
*  getDestinationPageNumber return -1 with NullObject
* in case of Strict = False, return a destination to first page to prevent error (no change in case of Strict=True)
note ; warning generated

Test added with the sample test
PyPDF2/pdf.py Outdated
:rtype: int
"""
indirectRef = destination.page
if type(indirectRef) is NullObject:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want isinstance (reasons). What do you think about it?

@MartinThoma
Copy link
Copy Markdown
Member

@pubpub-zz Looks good to me, except for the type <-> isinstance part.

I did a lot of changes (applying the black formatter + splitting the pdf module into many submodules). Do you want me to deal with the merge conflicts or can you do that?

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 29, 2022

Codecov Report

Merging #821 (2dd0986) into main (80f2f25) will increase coverage by 0.22%.
The diff coverage is 100.00%.

❗ Current head 2dd0986 differs from pull request most recent head 7596ce3. Consider uploading reports for the commit 7596ce3 to get more accurate results

@@            Coverage Diff             @@
##             main     #821      +/-   ##
==========================================
+ Coverage   75.35%   75.58%   +0.22%     
==========================================
  Files          12       12              
  Lines        3563     3571       +8     
  Branches      822      824       +2     
==========================================
+ Hits         2685     2699      +14     
+ Misses        661      657       -4     
+ Partials      217      215       -2     
Impacted Files Coverage Δ
PyPDF2/pdf.py 82.34% <100.00%> (+0.33%) ⬆️
PyPDF2/generic.py 68.12% <0.00%> (+0.28%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80f2f25...7596ce3. Read the comment docs.

@pubpub-zz
Copy link
Copy Markdown
Collaborator Author

@pubpub-zz Looks good to me, except for the type <-> isinstance part.

I did a lot of changes (applying the black formatter + splitting the pdf module into many submodules). Do you want me to deal with the merge conflicts or can you do that?

@MartinThoma , sure : I will propose a new PR

pubpub-zz added a commit to pubpub-zz/pypdf that referenced this pull request Apr 30, 2022
py-pdf#604 

root cause: probably extraction from a document not extracting properly destination

changes:

    getDestinationPageNumber return -1 with NullObject
    in case of Strict = False, return a destination to first page to prevent error (no change in case of Strict=True)
    note ; warning generated

Test added with the sample test

(duplicate of  py-pdf#821 to match refactoring)
MartinThoma pushed a commit that referenced this pull request Apr 30, 2022
If a destination is missing, getDestinationPageNumber now returns -1
If `strict=False`, the first page is used as a fallback.

The code triggering the exception was

```python
from PyPDF2 import PdfFileReader

# https://github.com/mstamy2/PyPDF2/files/6045010/thyroid.pdf
with open("thyroid.pdf", "rb") as f:
   reader = PdfFileReader(f)
   bookmarks = pdf.getOutlines()
   for b in bookmarks:
       print(reader.getDestinationPageNumber(b) + 1)  # page count starts from 0
```

The error message was:
    PyPDF2.utils.PdfReadError: Unknown Destination Type: 0

Closes #604 
Closes #821
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants