Closed
Labels
is-robustness-issue: From a users perspective, this is about robustness
workflow-images: From a users perspective, image handling is the affected feature/workflow
Description
In pypdf/filters.py, lines 182 to 183 in b7ae2e5:
if len(data) % rowlength != 0:
    raise PdfReadError("Image data is not rectangular")
I therefore would like to relax this to only issue a warning and do the necessary padding on our side:
missing_bytes = b"\x00" * (rowlength - len(data) % rowlength)
data += missing_bytes
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
Linux-6.4.0-150600.23.42-default-x86_64-with-glibc2.38
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==5.4.0, crypt_provider=('cryptography', '44.0.0'), PIL=11.1.0
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf import PdfReader

reader = PdfReader('file.pdf')
for page in reader.pages:
    for name, image in page.images.items():
        print(name)

I currently do not have an example I could share publicly.
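Since no sample file can be shared, the proposed relaxation can at least be illustrated on synthetic bytes. The following is a minimal standalone sketch, not pypdf's actual code: the helper name pad_to_full_rows is hypothetical, and warnings.warn stands in for whatever warning mechanism the library would use. It assumes rowlength is the number of bytes per row, as in the snippet from the description.

```python
import warnings


def pad_to_full_rows(data: bytes, rowlength: int) -> bytes:
    """Hypothetical helper: zero-pad data to a multiple of rowlength.

    Instead of raising an error for non-rectangular image data,
    a warning is issued and the missing bytes are filled with zeros.
    """
    remainder = len(data) % rowlength
    if remainder == 0:
        return data  # already rectangular, nothing to do
    warnings.warn("Image data is not rectangular. Adding padding.", stacklevel=2)
    missing_bytes = b"\x00" * (rowlength - remainder)
    return data + missing_bytes


# Synthetic input: rows of 4 bytes, with the last row 2 bytes short.
truncated = bytes(range(10))
padded = pad_to_full_rows(truncated, rowlength=4)
print(len(padded))  # 12
```

Zero padding keeps the existing bytes untouched, so any rows that were complete decode exactly as before; only the truncated final row is affected.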
Traceback
This is the complete traceback I see:
Traceback (most recent call last):
  File "/home/stefan/tmp/pypdf/run.py", line 5, in <module>
    for name, image in page.images.items():
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/_page.py", line 444, in items
    return [(x, self[x]) for x in self.ids_function()]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/_page.py", line 444, in <listcomp>
    return [(x, self[x]) for x in self.ids_function()]
                ~~~~^^^
  File "/home/stefan/tmp/pypdf/pypdf/_page.py", line 464, in __getitem__
    return self.get_function(index)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/_page.py", line 657, in _get_image
    imgd = _xobj_to_image(cast(DictionaryObject, xobjs[id]))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/filters.py", line 741, in _xobj_to_image
    data = x_object_obj.get_data() # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/generic/_data_structures.py", line 1109, in get_data
    decoded.set_data(decode_stream_data(self))
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/filters.py", line 657, in decode_stream_data
    data = FlateDecode.decode(data, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/filters.py", line 172, in decode
    str_data = FlateDecode._decode_png_prediction(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/tmp/pypdf/pypdf/filters.py", line 183, in _decode_png_prediction
    raise PdfReadError("Image data is not rectangular")
pypdf.errors.PdfReadError: Image data is not rectangular