readObjectHeader: Allow extra whitespace before "obj" #567
@malthejorgensen Sorry that it took so long - the PR looks good to me. It has merge conflicts, which make it impossible for me to merge it at the moment. Would you mind resolving them (or opening a new PR, which might be simpler)? I would understand if you don't want to do this. In that case I'd add the change myself, giving you credit via GitHub's co-authored-by feature.
5b3d04f to 0184fda
The header being read has the format:
<idnum> <generation> obj
where `<idnum>` and `<generation>` are integers.
Previously, an arbitrary number of spaces was allowed between `<idnum>` and `<generation>`, but not between `<generation>` and `obj`.
We now also allow an arbitrary number of spaces between `<generation>` and `obj`.
0184fda to 74c573c
@MartinThoma No problem :) – hereby rebased.
Codecov Report
@@           Coverage Diff           @@
##             main     #567   +/-   ##
=======================================
  Coverage   69.63%   69.64%
=======================================
  Files           9        9
  Lines        3316     3317    +1
  Branches      783      783
=======================================
+ Hits         2309     2310    +1
  Misses        763      763
  Partials      244      244
Thank you very much! It was merged and will be part of the next release (some time this month).
Deprecations (DEP):
- Remove support for Python 2.6 and older (#776)

New Features (ENH):
- Extract document permissions (#320)

Bug Fixes (BUG):
- Clip by trimBox when merging pages, which would otherwise be ignored (#240)
- Add overwriteWarnings parameter PdfFileMerger (#243)
- IndexError for getPage() of decrypted file (#359)
- Handle cases where decodeParms is an ArrayObject (#405)
- Updated PDF fields don't show up when page is written (#412)
- Set Linked Form Value (#414)
- Fix zlib -5 error for corrupt files (#603)
- Fix reading more than last1K for EOF (#642)
- Accidental import

Robustness (ROB):
- Allow extra whitespace before "obj" in readObjectHeader (#567)

Documentation (DOC):
- Link to pdftoc in Sample_Code (#628)
- Working with annotations (#764)
- Structure history

Developer Experience (DEV):
- Add issue templates (#765)
- Add tool to generate changelog

Maintenance (MAINT):
- Use grouped constants instead of string literals (#745)
- Add error module (#768)
- Use decorators for @staticmethod (#775)
- Split long functions (#777)

Testing (TST):
- Run tests in CI once with -OO Flags (#770)
- Filling out forms (#771)
- Add tests for Writer (#772)
- Error cases (#773)
- Check Error messages (#769)
- Regression test for issue #88
- Regression test for issue #327

Code Style (STY):
- Make variable naming more consistent in tests

All changes: 1.27.5...1.27.6
The header being read by `readObjectHeader` has the format:
<idnum> <generation> obj
where `<idnum>` and `<generation>` are integers.
Previously, an arbitrary number of spaces was allowed between `<idnum>` and `<generation>`, but not between `<generation>` and `obj`.
With this pull request, an arbitrary number of spaces between `<generation>` and `obj` is allowed (but raises a warning, similarly to how other extraneous whitespace is handled).
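The tolerant parsing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PyPDF2's actual implementation: the function name `parse_object_header`, the regular expression, and the warning text are hypothetical, and the sketch works on a complete bytes string rather than on a file stream.

```python
import re
import warnings
from typing import Tuple

# PDF whitespace characters: NUL, tab, line feed, form feed, carriage return, space.
_WS = rb"[\x00\t\n\x0c\r ]"

# Hypothetical pattern: <idnum>, whitespace, <generation>, whitespace, "obj".
# Both whitespace runs are captured so their length can be inspected.
_HEADER_RE = re.compile(rb"(\d+)(" + _WS + rb"+)(\d+)(" + _WS + rb"+)obj")


def parse_object_header(data: bytes) -> Tuple[int, int]:
    """Parse b"<idnum> <generation> obj" and return (idnum, generation)."""
    match = _HEADER_RE.match(data)
    if match is None:
        raise ValueError("not an object header: %r" % (data,))
    idnum, gap1, generation, gap2 = match.groups()
    # A single separator is the canonical form. Longer runs are accepted,
    # but reported with a warning instead of an error.
    if len(gap1) > 1 or len(gap2) > 1:
        warnings.warn("superfluous whitespace in object header %r" % (data,))
    return int(idnum), int(generation)
```

With this sketch, `parse_object_header(b"12 0 obj")` returns `(12, 0)` silently, while `parse_object_header(b"12 0   obj")` returns the same tuple and emits one warning. Input that does not match the header format at all raises `ValueError`.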