Skip to content

Upgrade pillow#1209

Closed
okor wants to merge 15 commits intothumbor:masterfrom
okor:upgrade-pillow
Closed

Upgrade pillow#1209
okor wants to merge 15 commits intothumbor:masterfrom
okor:upgrade-pillow

Conversation

@okor
Copy link
Copy Markdown
Member

@okor okor commented Jun 26, 2019

This PR upgrades Pillow to version 6.0.0. It's the first of at least 2 PRs with the end goal of removing the and removes the piexif library dependency. This is desirable because piexif has at least one bug which causes explosive memory consumption eventually OOMing whatever host Thumbor is running on. See #1204 for details on the bug.

All tests pass so it seems fine. Tests are currently failing due to a bug in Pillow. Has to be fixed first.

@coveralls
Copy link
Copy Markdown

coveralls commented Jun 26, 2019

Coverage Status

Coverage remained the same at ?% when pulling 9276a6c on okor:upgrade-pillow into 8312a1e on thumbor:master.

.travis.yml Outdated
- ffmpeg
install:
- pip install --upgrade pip
- pip install git+git://github.com/okor/remotecv.git@439324abb8014969467ad18c65188b8e3db48b78
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a temporary hack until a new version of remotecv with Pillow version 6 is released. After the new version is released we can remove this instruction.

sources:
- trusty-media
packages:
- libgnutls28-dev
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason pycurl couldn't be built without installing this package. I'm not sure what changed but it seems something has.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #1207

@okor okor force-pushed the upgrade-pillow branch from 5b9a6bc to 4af4a3d Compare June 26, 2019 21:14
@okor okor force-pushed the upgrade-pillow branch from 4af4a3d to 2a16651 Compare June 26, 2019 21:23
@okor okor marked this pull request as ready for review June 26, 2019 21:33
@okor okor changed the title [WIP] upgrade pillow Upgrade pillow Jun 26, 2019
@okor okor force-pushed the upgrade-pillow branch from f29b519 to 2a16651 Compare June 26, 2019 21:54
@okor okor changed the title Upgrade pillow [WIP] Upgrade pillow Jul 2, 2019
- ffmpeg
install:
- pip install --upgrade pip
- pip install git+git://github.com/thumbor/remotecv.git@a8cbb6997fec30d584e7e47874c70d91c932d511
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hack to install from git as the release we need is not yet published.

"flake8",
"yanc",
"remotecv",
# "remotecv==2.2.3",
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be uncommented when release is published. See https://github.com/thumbor/thumbor/pull/1209/files#r300174816

@okor okor changed the title [WIP] Upgrade pillow Upgrade pillow Jul 3, 2019
im = Image.new('RGB', (1, 1))
save_buffer = BytesIO()
im.save(save_buffer, format="JPEG", exif=self.exif)
return Image.open(save_buffer).getexif()
Copy link
Copy Markdown
Member Author

@okor okor Jul 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code above is bit expensive but it does work to get us an editable PIL.Image.Exif object. It's also portable so other engines can use it, like the opencv-engine (pillow was already a dependency). LMK if you can think of a better way to do this.

Also note that there were two reasons I didn't actually use https://github.com/python-pillow/Pillow/blob/a986fed5b445bc9586476167b08d46e19cba1bbc/src/PIL/ImageOps.py#L523

  1. Since it's a Pillow only image transform it might break the use of other engines.
  2. We had been setting the orientation exif value to 1 after reorienting where exif_transpose was deleting the orientation key all together. A subtle difference but it might matter to somebody.

While imperfect it is pretty flexible.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a fake image with exif data seems like overkill. You could just use Exif class directly.
Also not sure why you want to access to private _data field. That looks fragile. I think we can use Exif object directly.

Index: thumbor/engines/__init__.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- thumbor/engines/__init__.py	(date 1562192633000)
+++ thumbor/engines/__init__.py	(date 1562196503403)
@@ -9,8 +9,7 @@
 # Copyright (c) 2011 globo.com thumbor@googlegroups.com
 
 from xml.etree.ElementTree import ParseError
-from io import BytesIO
-from PIL import Image
+from PIL.Image import Exif
 
 try:
     import cairosvg
@@ -232,21 +231,12 @@
         width, height = self.size
         return round(float(new_width) * height / width, 0)
 
-    def _get_exif_dict(self):
-        exif_object = self._get_exif_object()
-        if exif_object is None:
-            return
-
-        return exif_object._data
-
     def _get_exif_object(self):
-        if (not hasattr(self, 'exif')) or self.exif is None:
-            return None
+        exif = Exif()
+        if (hasattr(self, 'exif')) and self.exif is not None:
+            exif.load(self.exif)
 
-        im = Image.new('RGB', (1, 1))
-        save_buffer = BytesIO()
-        im.save(save_buffer, format="JPEG", exif=self.exif)
-        return Image.open(save_buffer).getexif()
+        return exif
 
     def get_orientation(self):
         """
@@ -255,10 +245,8 @@
         :return: Orientation value (1 - 8)
         :rtype: int or None
         """
-        exif_dict = self._get_exif_dict()
-        if exif_dict and exif_dict.get(274):
-            return exif_dict[274]
-        return None
+        exif = self._get_exif_object()
+        return exif.get(0x0112) or None
 
     def reorientate(self, override_exif=True):
         """
@@ -295,7 +283,7 @@
         if orientation != 1 and override_exif:
             exif = self._get_exif_object()
             exif[0x0112] = 1
-            self.exif = exif
+            self.exif = exif.tobytes()
 
     def gen_image(self, size, color):
         raise NotImplementedError()
Index: tests/engines/test_base_engine.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- tests/engines/test_base_engine.py	(date 1562192633000)
+++ tests/engines/test_base_engine.py	(date 1562196346827)
@@ -197,7 +197,7 @@
         expect(self.engine.get_orientation()).to_be_null()
 
     def test_get_orientation_without_orientation_in_exif(self):
-        self.engine.exif = 'Exif\x00\x00II*\x00\x08\x00\x00\x00\x04\x00\x1a\x01\x05\x00\x01\x00\x00\x006\x00\x00\x00\x1b\x01\x05\x00\x01\x00\x00\x00>\x00\x00\x00(\x01\x03\x00\x01\x00\x00\x00\x02\x00\x00\x00\x13\x02\x03\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00H\x00\x00\x00\x01\x00\x00\x00'
+        self.engine.exif = b'Exif\x00\x00II*\x00\x08\x00\x00\x00\x04\x00\x1a\x01\x05\x00\x01\x00\x00\x006\x00\x00\x00\x1b\x01\x05\x00\x01\x00\x00\x00>\x00\x00\x00(\x01\x03\x00\x01\x00\x00\x00\x02\x00\x00\x00\x13\x02\x03\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00H\x00\x00\x00\x01\x00\x00\x00'
         expect(self.engine.get_orientation()).to_be_null()
 
     def test_get_orientation(self):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great idea. Though I'm not sure why these tests are failing https://travis-ci.org/thumbor/thumbor/jobs/557429663#L1506

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me take a look.
It was fine running tests on my local before I submitted suggestion.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkopachev Any luck?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really a luck, but I figured it's Pillow's issue. Even with your original code it would fail.
Only reason it does not fail, because you put Exif object (not exif bytes) back into self.exif property. And it is luck that it's not used when saving image down the line, because PRESERVE_EXIF_INFO is off for these tests.
I'll dig around a bit more and probably going to open a ticket in Pillow
😢

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simple repro script

from PIL import Image, ImageOps

im = Image.open('tests/fixtures/images/10_years_of_Wikipedia_by_Guillaume_Paumier.jpg')
im = ImageOps.exif_transpose(im)

@okor
Copy link
Copy Markdown
Member Author

okor commented Jul 3, 2019

@kkopachev @heynemann 👋 I think this is good enough for a review. I obvs ended up bundling the pillow upgrade along side removing the piexif dependency.

@kkopachev
Copy link
Copy Markdown
Member

So I've found the bug in Pillow, which causes one particular test to fail, and fix is quite simple, but requires some monkey patching (:disappointed:). But then I tried to add test for it in Pillow and figured that there are tons of bugs in exif saving in Pillow and that won't be easy to fix them and wait for 3 months for new release, etc.
So, sadness all around 😢

I'm not a big fun of pyexiv2 dependency either.
I am thinking that we could adopt simple exif orientation reader like this https://stackoverflow.com/a/32490603 which meant to find orientation tag. Adjust it so it returns offset of orientation value. Then simply replace value with 1 upon reorientation. Seems easy and (given simplicity) would be more robust and bullet proof solution. One downside is that instead of dropping the value, we'd have to set it to 1.

Thoughts?

@okor
Copy link
Copy Markdown
Member Author

okor commented Jul 16, 2019

@kkopachev Thank you for looking into this.

  • I think we should submit a bug report to Pillow either way. Because ultimately I'd prefer to use a library and there is also benefit to their project as they may not be aware of the bug.
  • It seems like writing an EXIF orientation only reader may be our best bet to get something released in a reasonable amount of time.

I'm a little exhausted working on this task so I'm going to take a break for a bit. It's been rather frustrating.

@okor
Copy link
Copy Markdown
Member Author

okor commented Jul 16, 2019

@kkopachev I submitted a bug report to Pillow python-pillow/Pillow#3973 lmk if you think there is something missing. Thanks again.

@kkopachev
Copy link
Copy Markdown
Member

Awesome, thank you for all your work, @okor. I'm little busy with everything, but I tried to squeeze in coding for orientation-only exif writer. I'm half way there, pretty much. Will keep to post my progress here.

@kkopachev
Copy link
Copy Markdown
Member

see #1210

@okor
Copy link
Copy Markdown
Member Author

okor commented Jul 24, 2019

Closing in favor of #1210

@okor okor closed this Jul 24, 2019
@okor okor deleted the upgrade-pillow branch July 24, 2019 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants