Skip to content

Fix 608 use null ID when encrypted but no ID given#610

Closed
richardmillson wants to merge 2 commits intopy-pdf:mainfrom
richardmillson:fix-608-decryption-fails-when-no-ID-in-trailer
Closed

Fix 608 use null ID when encrypted but no ID given#610
richardmillson wants to merge 2 commits intopy-pdf:mainfrom
richardmillson:fix-608-decryption-fails-when-no-ID-in-trailer

Conversation

@richardmillson
Copy link
Copy Markdown
Contributor

Pull request

  • Fixes Issue Cannot decrypt PDF missing 'ID' in trailer #608 where a PDF with an '/Encrypt' key but no '/ID' key in
    self.trailer throws a KeyError and is not decrypted.
  • If no '/ID' key is present in self.trailer an array of two empty bytestrings
    is used in place of an '/ID'. This is how Apache PDFBox handles this case.
  • This makes PyPDF2 more robust to malformed PDFs that
    don't follow the spec.
  • As camelot
    uses this library, this provides a branch with working behavior to use.

How Has This Been Tested?

  • Added test against sample PDF.

@MartinThoma MartinThoma added is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF workflow-encryption From a users perspective, encryption is the affected feature/workflow labels Apr 6, 2022
@MartinThoma
Copy link
Copy Markdown
Member

Thank you for your PR! It already looks good, but it has merge conflicts. Would you mind resolving them?

If you don't want to do that, I would take care of it. I would give credit via Githubs co-authored-by feature.

@MartinThoma MartinThoma added soon PRs that are almost ready to be merged, issues that get solved pretty soon needs-change The PR/issue cannot be handled as issue and needs to be improved labels Apr 16, 2022
MartinThoma added a commit that referenced this pull request Apr 24, 2022
If no '/ID' key is present in self.trailer an array of two empty bytestrings is used in place of an '/ID'. This is how Apache PDFBox handles this case.

This makes PyPDF2 more robust to malformed PDFs.

Closes #608
Closes #610

Full credit for this one to Richard Millson - Martin Thoma only fixed a merge conflict

Co-authored-by: Richard Millson <8217613+richardmillson@users.noreply.github.com>
VictorCarlquist pushed a commit to VictorCarlquist/PyPDF2 that referenced this pull request Apr 29, 2022
If no '/ID' key is present in self.trailer an array of two empty bytestrings is used in place of an '/ID'. This is how Apache PDFBox handles this case.

This makes PyPDF2 more robust to malformed PDFs.

Closes py-pdf#608
Closes py-pdf#610

Full credit for this one to Richard Millson - Martin Thoma only fixed a merge conflict

Co-authored-by: Richard Millson <8217613+richardmillson@users.noreply.github.com>
@richardmillson
Copy link
Copy Markdown
Contributor Author

@MartinThoma Didn't see your message in time; thanks for resolving the conflict and merging though!

@richardmillson richardmillson deleted the fix-608-decryption-fails-when-no-ID-in-trailer branch May 3, 2022 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF needs-change The PR/issue cannot be handled as issue and needs to be improved soon PRs that are almost ready to be merged, issues that get solved pretty soon workflow-encryption From a users perspective, encryption is the affected feature/workflow

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants