ENH: Sort computed /ProcSet in a merged page, for reproducibility #1542
Merged
MartinThoma merged 2 commits into py-pdf:main on Jan 22, 2023
Codecov Report
Base: 91.86% // Head: 91.86% // No change to project coverage 👍

Coverage diff (main vs. #1542):

Metric     main     #1542
Coverage   91.86%   91.86%
Files      33       33
Lines      6207     6207
Branches   1229     1229
Hits       5702     5702
Misses     326      326
Partials   179      179
Member
I really hope that there is no hidden meaning of the ProcSet order. I couldn't find anything in the specs, so it should be fine. Thank you for the PR! It will be in the release today :-)
MartinThoma added a commit that referenced this pull request on Jan 22, 2023
New Features (ENH):
- Add page label support to PdfWriter (#1558)
- Accept inline images with space before EI (#1552)
- Add circle annotation support (#1556)
- Add polygon annotation support (#1557)
- Make merging pages produce a deterministic PDF (#1542, #1543)

Bug Fixes (BUG):
- Fix error in cmap extraction (#1544)
- Remove erroneous assertion check (#1564)
- Fix dictionary access of optional page label keys (#1562)

Robustness (ROB):
- Set ignore_eof=True for read_until_regex (#1521)

Documentation (DOC):
- Paper size (#1550)

Developer Experience (DEV):
- Fix broken combination of dependencies of docs.txt
- Annotate tests appropriately (#1551)

[Full Changelog](3.2.1...3.3.0)
This fixes #1531 by sorting the `/ProcSet` array when merging two pages.

Inferring from the name and the existing code's use of `frozenset`, I believe `/ProcSet` is a set of `NameObject`s, where the order doesn't matter. This means it should be fine for `merge_page` to always choose a single consistent order for reproducibility, and the easiest order is sorting.

Another option would be to reproduce the order in the original pages as closely as possible (e.g. if page 1 had `/B, /A` and page 2 had `/C, /B, /D`, then it could compute `/B, /A, /C, /D`, preserving the order within page 1 and page 2, with the duplicate `/B` dropped), but this seems like unnecessary complexity.
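The two orderings discussed above can be compared in a minimal sketch. It uses plain Python strings as stand-ins for `NameObject`s, and the function names `merge_procset_sorted` and `merge_procset_ordered` are hypothetical, chosen only for illustration; this is not the code from the PR or from `merge_page` itself.

```python
# Sketch only: plain strings stand in for pypdf NameObjects, and both
# function names are hypothetical, not part of the pypdf API.

def merge_procset_sorted(page1, page2):
    """Union of both /ProcSet arrays in sorted order (the approach chosen here)."""
    # Iterating over a set of strings can vary between interpreter runs
    # because of string hash randomization; sorting removes that dependence.
    return sorted(set(page1) | set(page2))


def merge_procset_ordered(page1, page2):
    """Alternative: keep first-seen order across both pages, dropping duplicates."""
    # dict preserves insertion order, so fromkeys() de-duplicates in place.
    return list(dict.fromkeys([*page1, *page2]))


page1_procset = ["/B", "/A"]
page2_procset = ["/C", "/B", "/D"]

print(merge_procset_sorted(page1_procset, page2_procset))   # ['/A', '/B', '/C', '/D']
print(merge_procset_ordered(page1_procset, page2_procset))  # ['/B', '/A', '/C', '/D']
```

Both variants are deterministic for a given pair of inputs; the sorted version additionally gives the same result regardless of which page is merged into which.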