Hi!
I'm a maintainer of pdfly, which has an extract-images subcommand that uses PageObject.images internally.
Recently, a user reported an image compression issue: py-pdf/pdfly#200
I investigated and I think I found what happens:
In pypdf, PageObject.images invokes PageObject._get_image(), which calls _xobj_to_image().
_xobj_to_image() calls the Image.save() method of Pillow, which uses a JPEG quality of 75 by default: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#jpeg-saving
Suggested bug fix
In _xobj_to_image(), pypdf could simply pass an extra quality="keep" to Image.save().
IMHO this seems like the best default value, even if that means a non-fully-backward-compatible change to pypdf.
Feature request
Would it also be possible to introduce a way to provide a custom value for the quality parameter passed to Image.save(), please?
Ideally a new optional argument would be great, but I don't quite see how to make this work with the PageObject.images property that returns a VirtualListImages...
Maybe through a global variable?
Or through a new parameter of PdfReader?
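
To illustrate the re-encoding behaviour described in the investigation, here is a minimal Pillow-only sketch. It does not touch pypdf at all: the helper names reencode and qtables are mine, and the gradient image is only a stand-in for a JPEG embedded in a PDF. It compares the quantization tables of a JPEG before and after Image.save(), once with the default quality and once with quality="keep". Note that, as far as I know, Pillow only accepts quality="keep" when the image was itself opened from a JPEG, and raises a ValueError otherwise, so a real fix would probably need to guard for that.

```python
import io

from PIL import Image


def reencode(jpeg_bytes: bytes, **save_kwargs) -> bytes:
    """Decode JPEG bytes with Pillow and save them again as JPEG."""
    with Image.open(io.BytesIO(jpeg_bytes)) as img:
        out = io.BytesIO()
        img.save(out, format="JPEG", **save_kwargs)
    return out.getvalue()


def qtables(jpeg_bytes: bytes) -> dict:
    """Return the quantization tables stored in a JPEG as plain lists."""
    with Image.open(io.BytesIO(jpeg_bytes)) as img:
        return {index: list(table) for index, table in img.quantization.items()}


# Synthetic stand-in for an image embedded in a PDF: a colour gradient
# encoded at a high quality setting.
source = Image.new("RGB", (128, 128))
source.putdata([(x * 2, y * 2, x + y) for y in range(128) for x in range(128)])
buffer = io.BytesIO()
source.save(buffer, format="JPEG", quality=95)
original = buffer.getvalue()

# Re-save without a quality argument (Pillow falls back to quality=75),
# then again with quality="keep" (reuses the source quantization tables).
default_bytes = reencode(original)
kept_bytes = reencode(original, quality="keep")

print("sizes:", len(original), len(default_bytes), len(kept_bytes))
print("tables preserved with keep:", qtables(kept_bytes) == qtables(original))
print("tables preserved by default:", qtables(default_bytes) == qtables(original))
```

With quality="keep" the quantization tables of the output match those of the input, while the default save replaces them with the quality=75 tables. The image is still decoded and re-encoded in both cases, so this is not a byte-for-byte copy of the original stream.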