add last1K limit to readNextEndLine #439
Closed
jonnythebard wants to merge 3 commits into py-pdf:main from
Looking forward to this pull request being accepted.
272851b to bd3ae44
Member
Have you seen #642? What do you think about it?
Author
@MartinThoma Yes, it looks much better than mine, because my commit replaces a condition in the if statement instead of adding one. I didn't notice my mistake. Glad that someone is finally aware of this issue, though 😂
Codecov Report
@@           Coverage Diff           @@
##             main     #439   +/-   ##
=======================================
  Coverage   70.59%   70.59%
=======================================
  Files          10       10
  Lines        3425     3425
  Branches      798      798
=======================================
  Hits         2418     2418
  Misses        763      763
  Partials      244      244
I've been processing over 50k PDF files with PyPDF2 over the last several weeks and found that it isn't filtering out some malformed PDF files. The problem with such a file was that it had the %%EOF marker at the beginning, followed by 30 million bytes of b'\x00'. The current version of PyPDF2 tries to travel all the way through the 30 million bytes of b'\x00' to find %%EOF. Since the %%EOF marker should appear in the last 1K of the file, I thought it would make sense to add a last1K limit to the readNextEndLine function. I applied this in my application and it works fine.
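The idea of bounding the search can be sketched independently of PyPDF2. The snippet below is only an illustration under assumptions: `has_eof_in_tail` and its `limit` parameter are hypothetical names, not part of PyPDF2, and it is not the change made to readNextEndLine in this PR. It reads at most the last 1024 bytes of a stream and checks whether %%EOF occurs there, so a file with the marker at the start followed by a long run of b'\x00' is rejected without scanning the whole file.

```python
import io

EOF_MARKER = b"%%EOF"


def has_eof_in_tail(stream, limit=1024):
    """Return True if %%EOF occurs within the last `limit` bytes of `stream`.

    Hypothetical helper for illustration; not a PyPDF2 function.
    """
    # Find the total size by seeking to the end of the stream.
    stream.seek(0, io.SEEK_END)
    size = stream.tell()

    # Read only the tail; never more than `limit` bytes.
    start = max(0, size - limit)
    stream.seek(start)
    tail = stream.read(size - start)

    return EOF_MARKER in tail


# A well-formed layout: the marker sits at the end of the file.
good = io.BytesIO(b"%PDF-1.4\n" + b"x" * 5000 + b"\n%%EOF\n")

# A scaled-down version of the malformed file described above:
# the marker at the beginning, followed by a long run of null bytes.
bad = io.BytesIO(b"%%EOF" + b"\x00" * 30_000)

print(has_eof_in_tail(good))  # True
print(has_eof_in_tail(bad))   # False
```

Because only the tail is read, the cost of the check does not grow with the size of the padding, which is the property the last1K limit is meant to give.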