[FW] 17.0 pypdf trixie compat moc #184787

Closed
fw-bot wants to merge 8 commits into odoo:saas-17.2 from odoo-dev:saas-17.2-17.0-pypdf-trixie-compat-moc-8-Cd-fw

fw-bot commented Oct 22, 2024

Debian wants to remove pypdf2 and keep only pypdf (4.3) in trixie, so
we need to be compatible; otherwise Odoo could not be released in the next Debian release.

Forward-Port-Of: #183165

xmo-odoo and others added 8 commits October 22, 2024 17:10
Motivation: Debian wants to kill pypdf2 and keep only pypdf (4.3) in
trixie, so we need to be compatible or risk being dropped.

- move all imports from PyPDF2 to odoo.tools.pdf
- add two-way compatibility shims and aliases

X-original-commit: fddf53c
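
The "shims and aliases" idea can be sketched roughly as follows. This is only an illustration, not the code from `odoo.tools.pdf`: the helper names `first_importable` and `add_aliases` and the `LegacyWriter` class are hypothetical, and the sketch assumes neither pypdf nor PyPDF2 needs to be installed for it to run.

```python
import importlib


def first_importable(names):
    """Return the first module in `names` that can be imported, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def add_aliases(cls, aliases):
    """Expose existing attributes of `cls` under a second name.

    `aliases` maps the missing name to the name that already exists, so the
    same helper can alias new-style names onto an old class or the reverse.
    """
    for alias, target in aliases.items():
        if not hasattr(cls, alias) and hasattr(cls, target):
            setattr(cls, alias, getattr(cls, target))
    return cls


# Prefer pypdf and fall back to PyPDF2; in this sketch both may be absent.
backend = first_importable(["pypdf", "PyPDF2"])


class LegacyWriter:
    """Hypothetical stand-in for a class that only has the camelCase API."""

    def addPage(self, page):
        return ("added", page)


# Callers can now use the snake_case name whichever API the class shipped with.
add_aliases(LegacyWriter, {"add_page": "addPage"})
```

Routing every import through one module means the rest of the code base never needs to know which library, or which naming style, is actually installed.
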
In 3.x the PDF header is the header *only*, as it's decoded to a
string. However, the mess is mostly unnecessary because pypdf2 has
always added a binary comment line when writing out documents since
1.27. So only add (or copy) our own garbage when using 1.x.

Also this means our heuristic to detect pdf/a documents makes no
sense, as any pdf document can have a binary sig, but hey...

Also fix _ID handling: in pypdf 3.x it's a property which reads from
the trailer *and* it's automatically copied over by
`clone_reader_document_root`.

X-original-commit: 3dd4a34
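
For readers unfamiliar with the "binary comments line": the PDF specification recommends that the `%PDF-x.y` header be followed by a comment line containing at least four bytes with values of 128 or more, so that tools treat the file as binary. A minimal check could look like the sketch below; the function name `has_binary_marker` is hypothetical and this is not the heuristic used in Odoo.

```python
def has_binary_marker(pdf_bytes: bytes) -> bool:
    """Tell whether the line after the %PDF header is a binary comment.

    The check follows the PDF specification's recommendation: a comment line
    ("%" prefix) holding at least four bytes whose value is >= 128.
    """
    lines = pdf_bytes.splitlines()
    if len(lines) < 2 or not lines[0].startswith(b"%PDF-"):
        return False
    second = lines[1]
    if not second.startswith(b"%"):
        return False
    high_bytes = sum(1 for byte in second[1:] if byte >= 128)
    return high_bytes >= 4
```

As the commit message points out, any PDF can carry such a line, so its presence says nothing about PDF/A conformance.
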
Try to override both old and new API, although for the new version it
might actually be useless as pypdf seems to have added multiple
attachments support circa 3.5? (py-pdf/pypdf#1611)

X-original-commit: c78870c
The test relied on reader and writer sharing data because
`clone_reader_document_root` would just copy the reference to the root
object, so the writer would immediately impact the reader.

pypdf apparently now copies the document deeply (possibly since 3.2.0
/ py-pdf/pypdf#1520), so the test is broken: we need to serialize and
reload the document every time.

X-original-commit: d5807f4
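
"Serialize and reload" can be sketched as a small helper that writes the document to an in-memory buffer and builds a fresh reader from it. The helper name `reload_from_writer` is hypothetical. It only assumes the writer has a `write(stream)` method and the reader can be constructed from a stream; `FakeWriter` and `FakeReader` are stand-ins so the sketch runs without any PDF library.

```python
import io


def reload_from_writer(writer, reader_factory):
    """Serialize `writer` to memory and return a new reader over the result."""
    buffer = io.BytesIO()
    writer.write(buffer)
    buffer.seek(0)  # rewind so the reader starts at the beginning
    return reader_factory(buffer)


class FakeWriter:
    """Stand-in writer: dumps a fixed payload to the stream."""

    def __init__(self, payload: bytes):
        self.payload = payload

    def write(self, stream):
        stream.write(self.payload)


class FakeReader:
    """Stand-in reader: keeps whatever bytes it was given."""

    def __init__(self, stream):
        self.data = stream.read()


reader = reload_from_writer(FakeWriter(b"%PDF-1.7\n"), FakeReader)
```

Because the reader is rebuilt from the serialized bytes, it sees the writer's current state whether or not the two objects share data internally.
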
Next up in the series "`clone_document_from_reader` does stuff now",
`clone_document_from_reader` now properly clones over document
metadata. This means any metadata set *before* the cloning is nuked.

Update the branded writer to write its metadata at the last possible
moment, to ensure it always applies (this might actually be
undesirable as it means callers can't trivially override it anymore).

X-original-commit: e391863
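
The "last possible moment" approach can be illustrated with a mixin that applies its metadata inside `write()`. Everything here is an assumption made for illustration: the class names, the `branding` values, and the `FakeWriter` that imitates a writer whose cloning step replaces existing metadata. It is not the actual branded writer from Odoo.

```python
class LateMetadataMixin:
    """Apply branding metadata right before the document is serialized."""

    # Illustrative values only.
    branding = {"/Creator": "Odoo", "/Producer": "Odoo"}

    def write(self, stream):
        self.add_metadata(self.branding)
        return super().write(stream)


class FakeWriter:
    """Stand-in writer whose cloning step overwrites existing metadata."""

    def __init__(self):
        self.metadata = {}

    def add_metadata(self, infos):
        self.metadata.update(infos)

    def clone_document_from_reader(self, reader_metadata):
        # Imitates the newer behaviour: the reader's metadata replaces ours.
        self.metadata = dict(reader_metadata)

    def write(self, stream):
        stream.append(dict(self.metadata))


class BrandedWriter(LateMetadataMixin, FakeWriter):
    pass


written = []
writer = BrandedWriter()
writer.add_metadata({"/Creator": "Odoo"})  # set early...
writer.clone_document_from_reader({"/Creator": "Other tool"})  # ...then lost
writer.write(written)  # the branding is re-applied here
```

The trade-off mentioned in the commit message is visible in the sketch: whatever a caller sets under `/Creator` or `/Producer` before `write()` is overwritten by the branding.
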
`test_download_with_encrypted_pdf` tried to corrupt the PDF by
redirecting the /Encrypt indirect ref to an invalid entry, but it did
so by replacing a hard-coded index. Except the encryption object is
not stored at a fixed offset: where it's stored depends on what pypdf
wants to do with it or what other objects it stores in the file.

Update the replacement to not be completely hard-coded (and also not
assume /Encrypt is at the end of the trailer), and point it back to a
very early object (specifically the first one) which is *extremely*
unlikely to be a valid encryption object (it's very likely to be
either the catalog or the page tree root, though it could also be the
metadata map). This leads pypdf to not find a `/Filter` entry, which
triggers a parse error.

Note: the indirect ref can't be pointed at a nonsensical entry (e.g. 0
or 5479) because in that case pypdf just ignores the ref entirely.

X-original-commit: f37d753
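
A replacement that is "not completely hard-coded" could be written with a regular expression that matches whatever object number `/Encrypt` currently points to. This is a sketch under assumptions, not the code from the test: the name `redirect_encrypt_ref` is hypothetical, and it assumes the reference is stored as plain `/Encrypt N G R` in the file.

```python
import re

# Matches "/Encrypt <object number> <generation> R" anywhere in the file.
ENCRYPT_REF = re.compile(rb"/Encrypt\s+\d+\s+\d+\s+R")


def redirect_encrypt_ref(pdf_bytes: bytes, target: int = 1) -> bytes:
    """Point every /Encrypt indirect reference at object `target`."""
    return ENCRYPT_REF.sub(b"/Encrypt %d 0 R" % target, pdf_bytes)


trailer = b"trailer\n<< /Size 12 /Encrypt 11 0 R /Root 3 0 R >>\n"
corrupted = redirect_encrypt_ref(trailer)
```

Matching on the key rather than on a byte offset keeps the corruption working regardless of where the library places the encryption object or in what order the trailer keys appear.
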
Partial cherry-pick of 59709ca to be able to build a debian package in
Trixie.

X-original-commit: 72b419e

robodoo commented Oct 22, 2024

Pull request status dashboard

fw-bot commented Oct 22, 2024

@d-fence while this was properly forward-ported, at least one co-dependent PR (odoo/enterprise#72538) did not succeed. You will need to fix it before this can be merged.

Both this PR and the others will need to be approved via @robodoo r+ as they are all considered “in conflict”.

More info at https://github.com/odoo/odoo/wiki/Mergebot#forward-port

robodoo added the forwardport (This PR was created by @fw-bot) and conflict (There was an error while creating this forward-port PR) labels Oct 22, 2024
C3POdoo added the RD (research & development, internal work) label Oct 22, 2024

d-fence commented Oct 24, 2024

robodoo override=ci/style

This is the PR for the tools that this ci is asking for 🐍

d-fence commented Oct 24, 2024

robodoo override=ci/security

was already overridden by xmo in the initial PR

d-fence commented Oct 25, 2024

robodoo r+

robodoo commented Oct 25, 2024

@d-fence linked pull request(s) odoo/enterprise#72538 not ready. Linked PRs are not staged until all of them are ready.

robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: fddf53c
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: 3dd4a34
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: c78870c
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: d5807f4
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: e391863
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: 0180654
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
X-original-commit: f37d753
Part-of: #184787
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo pushed a commit that referenced this pull request Oct 25, 2024
closes #184787

X-original-commit: 72b419e
Related: odoo/enterprise#72538
Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
robodoo closed this Oct 25, 2024
d-fence deleted the saas-17.2-17.0-pypdf-trixie-compat-moc-8-Cd-fw branch November 7, 2024 08:21
5 participants