TST: Compare extracted images against ground truth by MartinThoma · Pull Request #2072 · py-pdf/pypdf

MartinThoma · 2023-08-08T16:00:20Z

This PR introduces a function image_similarity that quantifies the visual similarity of two images via the Mean Squared Error (MSE) of their pixel values. This way we can compare the extracted images with what we expect.
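A minimal sketch of an MSE-based similarity score, to make the idea concrete. It is not the code from this PR: the name `mse_similarity`, the choice to normalize both images to RGB, and the convention of returning 0.0 on a size mismatch are all assumptions made for illustration.

```python
from PIL import Image, ImageChops


def mse_similarity(image1, image2):
    """Return a score in [0, 1]; 1.0 means the decoded pixels are identical.

    Hypothetical helper for illustration only.
    """
    # Images of different dimensions are treated as completely dissimilar.
    if image1.size != image2.size:
        return 0.0
    # Assumption: compare in RGB so both images share one pixel layout.
    # Note that this drops any alpha channel.
    a = image1.convert("RGB")
    b = image2.convert("RGB")
    # Per-channel absolute difference of the decoded pixels.
    diff = ImageChops.difference(a, b)
    values = diff.tobytes()  # one byte (0..255) per channel value
    if not values:
        return 1.0  # two empty images are trivially identical
    # Mean squared error over channel values scaled to [0, 1].
    mse = sum((v / 255.0) ** 2 for v in values) / len(values)
    return 1.0 - mse


white = Image.new("RGB", (4, 4), (255, 255, 255))
black = Image.new("RGB", (4, 4), (0, 0, 0))
print(mse_similarity(white, white))  # 1.0
print(mse_similarity(white, black))  # 0.0
```

A test would then assert that the score between an extracted image and its ground-truth file stays above some threshold, which tolerates small decoding differences that an exact comparison would reject.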

We cannot compare the files byte by byte, because updates to PIL can change the encoded representation.
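The following sketch illustrates why a byte-wise comparison is fragile. It does not reproduce a Pillow version change; as a stand-in, it encodes the same image with two different PNG compression levels, which is an assumption chosen to show that identical pixels can have different encoded bytes. The helper name `png_bytes` is hypothetical.

```python
import io

from PIL import Image


def png_bytes(image, compress_level):
    """Encode an image as PNG in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    image.save(buffer, format="PNG", compress_level=compress_level)
    return buffer.getvalue()


image = Image.new("RGB", (32, 32), (200, 30, 30))

# Same pixels, two encoder settings: level 0 stores the data uncompressed,
# level 9 compresses as hard as possible.
stored = png_bytes(image, compress_level=0)
packed = png_bytes(image, compress_level=9)

# Decode both files again and compare the raw pixel data.
pixels_stored = Image.open(io.BytesIO(stored)).tobytes()
pixels_packed = Image.open(io.BytesIO(packed)).tobytes()

print(stored == packed)                # False: the encoded files differ
print(pixels_stored == pixels_packed)  # True: the decoded pixels match
```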

The new function helps us to ensure that updates to the pypdf code don't break image extraction.

pubpub-zz · 2023-08-08T16:47:16Z

You should have a look at test_filters.py: I did some image comparison there using Pillow.

pubpub-zz · 2023-08-08T17:04:31Z

You should use ImageOps: this avoids having to look at the "encoded" image.

codecov · 2023-08-08T18:39:35Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (aad26dd) 94.23% compared to head (8a15677) 94.23%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2072   +/-   ##
=======================================
  Coverage   94.23%   94.23%           
=======================================
  Files          41       41           
  Lines        7340     7340           
  Branches     1445     1445           
=======================================
  Hits         6917     6917           
  Misses        263      263           
  Partials      160      160


MartinThoma · 2023-08-08T20:28:13Z

@pubpub-zz Now it's ready for review :-)

I undid the moving of test code. I don't know yet where exactly this code should be (test_images.py vs test_filters.py vs test_page.py). Only when we have a rather clear and helpful rule where to put which test, we can move stuff around.

pubpub-zz · 2023-08-09T06:33:17Z

> @pubpub-zz Now it's ready for review :-)
>
> I undid the moving of test code. I don't know yet where exactly this code should be (test_images.py vs test_filters.py vs test_page.py). Only when we have a rather clear and helpful rule where to put which test, we can move stuff around.

I personally don't care and have no recommendation. At that time there were already some tests in test_filters.py, which is why I continued in the same way.

MartinThoma added 2 commits August 8, 2023 17:59

TST: Compare extracted images against ground truth

f9296c3

Add ids for easier readability

2b53bb5

MartinThoma added 2 commits August 8, 2023 18:56

MSE similarity

9300aba

Move tests from filters.py

8c3fc63

MartinThoma added 2 commits August 8, 2023 20:03

Use ImageChops

508e450

Fix it

189c47f

MartinThoma added 3 commits August 8, 2023 22:12

Undo moving of tests

1941601

Undo moving of tests

72ff3d3

Test the test code

4e725ea

MartinThoma requested a review from pubpub-zz August 8, 2023 20:26

MartinThoma mentioned this pull request Aug 8, 2023

PI: optimize _decode_png_prediction #2068

Merged

MartinThoma added 2 commits August 8, 2023 22:48

Get rid of resource warning

acf2060

BytesIO

8a15677

MartinThoma merged commit 82e8681 into main Aug 9, 2023

MartinThoma deleted the more-image-tests branch August 9, 2023 11:20

MartinThoma merged 11 commits into main from more-image-tests
