readObjectHeader: Allow extra whitespace before "obj" by malthejorgensen · Pull Request #567 · py-pdf/pypdf

malthejorgensen · 2020-07-16T20:23:57Z

The header being read by readObjectHeader has the format:

<idnum> <generation> obj

where <idnum> and <generation> are integers.
Previously, an arbitrary number of spaces was allowed between <idnum> and <generation>, but not between <generation> and obj.

With this pull request, an arbitrary number of spaces between <generation> and obj is also allowed (a warning is raised, in the same way other extraneous whitespace is handled).
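
For illustration, the behaviour described above could be sketched roughly as follows. This is not the code from PyPDF2/pdf.py: the function name `parse_object_header`, the regular expression, and the warning text are assumptions made for the example, and it parses a bytes value instead of reading from a stream.

```python
import re
import warnings

# Hypothetical pattern: <idnum>, whitespace, <generation>, whitespace, "obj".
# In a bytes pattern, \s matches ASCII whitespace only.
_HEADER = re.compile(rb"(\d+)(\s+)(\d+)(\s+)obj")


def parse_object_header(data):
    """Return (idnum, generation) parsed from the start of ``data``."""
    match = _HEADER.match(data)
    if match is None:
        raise ValueError("not an object header: %r" % (data[:20],))
    idnum, gap1, generation, gap2 = match.groups()
    # Tolerate extra whitespace in either gap, but tell the user about it.
    if len(gap1) > 1 or len(gap2) > 1:
        warnings.warn(
            "Superfluous whitespace found in object header %d %d"
            % (int(idnum), int(generation))
        )
    return int(idnum), int(generation)


parse_object_header(b"12 0 obj")    # well-formed header, no warning
parse_object_header(b"12 0   obj")  # accepted, but emits a warning
```

Both calls return the same pair; only the second one warns, which mirrors the "accept but warn" approach this PR extends to the gap before obj.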

MartinThoma · 2022-04-16T06:17:36Z

@malthejorgensen Sorry that it took so long. The PR looks good to me, but it has merge conflicts that prevent me from merging it at the moment. Would you mind resolving them (or opening a new PR, which might be simpler)?

I would understand if you don't want to do this. In that case I'd add the change myself, giving you credit via GitHub's co-authored-by feature.

The header being read has the format: <idnum> <generation> obj where `<idnum>` and `<generation>` are integers. Previously an arbitrary number of spaces was being allowed between `<idnum>` and `<generation>`, but not between `<generation>` and `obj`. We now allow arbitrary spaces between `<generation>` and `obj`.

malthejorgensen · 2022-04-16T06:52:41Z

@MartinThoma No problem :) – hereby rebased.

codecov-commenter · 2022-04-16T07:00:39Z

Codecov Report

Merging #567 (74c573c) into main (a5875c5) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #567   +/-   ##
=======================================
  Coverage   69.63%   69.64%           
=======================================
  Files           9        9           
  Lines        3316     3317    +1     
  Branches      783      783           
=======================================
+ Hits         2309     2310    +1     
  Misses        763      763           
  Partials      244      244

| Impacted Files | Coverage Δ | |
|---|---|---|
| PyPDF2/pdf.py | `72.38% <100.00%> (+0.01%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

MartinThoma · 2022-04-16T07:06:46Z

Thank you very much! It was merged and will be part of the next release (some time this month)

Deprecations (DEP):
- Remove support for Python 2.6 and older (#776)

New Features (ENH):
- Extract document permissions (#320)

Bug Fixes (BUG):
- Clip by trimBox when merging pages, which would otherwise be ignored (#240)
- Add overwriteWarnings parameter PdfFileMerger (#243)
- IndexError for getPage() of decrypted file (#359)
- Handle cases where decodeParms is an ArrayObject (#405)
- Updated PDF fields don't show up when page is written (#412)
- Set Linked Form Value (#414)
- Fix zlib -5 error for corrupt files (#603)
- Fix reading more than last1K for EOF (#642)
- Accidental import

Robustness (ROB):
- Allow extra whitespace before "obj" in readObjectHeader (#567)

Documentation (DOC):
- Link to pdftoc in Sample_Code (#628)
- Working with annotations (#764)
- Structure history

Developer Experience (DEV):
- Add issue templates (#765)
- Add tool to generate changelog

Maintenance (MAINT):
- Use grouped constants instead of string literals (#745)
- Add error module (#768)
- Use decorators for @staticmethod (#775)
- Split long functions (#777)

Testing (TST):
- Run tests in CI once with -OO Flags (#770)
- Filling out forms (#771)
- Add tests for Writer (#772)
- Error cases (#773)
- Check Error messages (#769)
- Regression test for issue #88
- Regression test for issue #327

Code Style (STY):
- Make variable naming more consistent in tests

All changes: 1.27.5...1.27.6


MartinThoma added the "Tiny" label (Pull requests that make a tiny change - and thus should be easy to merge) on Apr 6, 2022

MartinThoma added the "soon" label (PRs that are almost ready to be merged, issues that get solved pretty soon) on Apr 16, 2022

malthejorgensen force-pushed the readObjectHeader-allow-extra-spaces-before-obj branch from 5b3d04f to 0184fda on April 16, 2022 06:51

malthejorgensen force-pushed the readObjectHeader-allow-extra-spaces-before-obj branch from 0184fda to 74c573c on April 16, 2022 06:51

MartinThoma added the "is-robustness-issue" label (From a user's perspective, this is about robustness) on Apr 16, 2022

MartinThoma merged commit cf20f92 into py-pdf:main Apr 16, 2022

MartinThoma merged 1 commit into py-pdf:main from eduflow:readObjectHeader-allow-extra-spaces-before-obj