ROB: Ignore EOF in NumberObject.read_from_stream#1505
ROB: Ignore EOF in NumberObject.read_from_stream#1505rraval wants to merge 2 commits intopy-pdf:mainfrom
Conversation
|
Sadly I do not have a reproducing PDF that I can add to the testing corpus. This came up when exercising |
Codecov ReportBase: 91.84% // Head: 92.03% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1505 +/- ##
==========================================
+ Coverage 91.84% 92.03% +0.19%
==========================================
Files 33 32 -1
Lines 6202 5976 -226
Branches 1228 1163 -65
==========================================
- Hits 5696 5500 -196
+ Misses 326 312 -14
+ Partials 180 164 -16 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
7ec8040 to
b4cb2fa
Compare
Use `ignore_eof=True` just like `NameObject` does, which is the only other caller to `read_until_regex` in this module. This helps prevent arcane exceptions when trying to parse a number: ``` PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly ``` The motivation is essentially identical to the change that introduced `ignore_eof=True` on `NameObject`s: py-pdf@431ba70
b1d7aa7 to
c93f755
Compare
|
Hi there, I've rebased the branch off the latest |
MasterOdin
left a comment
There was a problem hiding this comment.
So now all usages of read_until_regex override the default ignore_eof value of False, right? If we're going this route, should probably then flip the default value to True and then remove using the third argument in the two places...
This was initially motivated by `NumberObject.read_from_stream`, which was calling `read_until_regex` with the default value of `ignore_eof=False` and thus raising exceptions like: ``` PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly ``` py-pdf@431ba70 demonstrates a similar fix for `NameObject.read_from_stream`. From discussion in py-pdf#1505, it was realized that the change to `NumberObject.read_from_stream` had now made ALL callers of `read_until_regex` pass `ignore_eof=True`. It's cleaner to remove the parameter entirely and change the default behaviour.
I've opened a separate PR along these lines: #1521 Feel free to pick one and close out the other. I'm not familiar enough with this codebase to judge which is "better". |
This was initially motivated by `NumberObject.read_from_stream`, which was calling `read_until_regex` with the default value of `ignore_eof=False` and thus raising exceptions like: ``` PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly ``` py-pdf@431ba70 demonstrates a similar fix for `NameObject.read_from_stream`. From discussion in py-pdf#1505, it was realized that the change to `NumberObject.read_from_stream` had now made ALL callers of `read_until_regex` pass `ignore_eof=True`. It's cleaner to remove the parameter entirely and change the default behaviour.
|
I'm closing this one as the parameter was now removed with #1521 :-) |
Use
ignore_eof=Truejust likeNameObjectdoes, which is the only other caller toread_until_regexin this module.This helps prevent arcane exceptions when trying to parse a number:
The motivation is essentially identical to the change that introduced
ignore_eof=TrueonNameObjects:431ba70