ENH : complete attachments functions (get/add) #1611
MartinThoma merged 17 commits into py-pdf:main
Codecov Report
Base: 92.28% // Head: 92.30% // Increases project coverage by +0.02%

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1611      +/-   ##
==========================================
+ Coverage   92.28%   92.30%   +0.02%
==========================================
  Files          33       33
  Lines        6376     6410      +34
  Branches     1271     1280       +9
==========================================
+ Hits         5884     5917      +33
  Misses        312      312
- Partials      180      181       +1
|
ready for 3.5.0 😊 |
Co-authored-by: Martin Thoma <info@martin-thoma.de>
|
That's a pretty awesome new feature! I'm curious to try this out from a user's perspective tomorrow! Regarding the interface, I want to try a couple of things:
|
That's feasible, but I'm not sure it will be very useful...
The PDF will have been loaded in memory anyway, but it will still be "compressed". I added an optional parameter to get only the file data you are looking for.
From my tests, they are independent (although Acrobat Reader shows them in the attachment list). I agree embedded_files may be clearer. |
|
This is the file generated with two attachments:
The following readers seem unable to display the attachment(s) at all:
- Atril Document Viewer 1.24.0
- Document Viewer 3.36.10
- Firefox
|
Co-authored-by: Martin Thoma <info@martin-thoma.de>
|
I'm happy with the implementation, but I'm uncertain about the public interface in the PdfReader. For the reader, could it be an attribute?

    from typing import Callable, List

    class _DelayedDict:
        """Dict-like view that fetches attachment data only on access."""

        def __init__(
            self,
            list_attachments: Callable[[], List[str]],
            get_attachment: Callable[[str], bytes],
        ) -> None:
            self.list_attachments = list_attachments
            self.get_attachment = get_attachment

        def keys(self) -> List[str]:
            return self.list_attachments()

        def __len__(self) -> int:
            return len(self.keys())

        def __getitem__(self, key: str) -> bytes:
            return self.get_attachment(key)

The idea is similar to the
One way to get this merged fast would be to postpone the decision on the public interface by re-naming:
|
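For illustration, the lazy dict-like attribute proposed above could also be built on `collections.abc.Mapping`, which supplies `keys()`, `items()` and `get()` for free. This is only a sketch under stated assumptions, not the interface that was merged: `LazyAttachments`, `fake_list` and `fake_get` are hypothetical names, and the two callables stand in for whatever the reader uses internally to list and fetch attachments.

```python
from collections.abc import Mapping
from typing import Callable, Dict, Iterator, List


class LazyAttachments(Mapping):
    """Read-only mapping that defers all work to two callables (hypothetical name)."""

    def __init__(
        self,
        list_names: Callable[[], List[str]],
        get_data: Callable[[str], bytes],
    ) -> None:
        self._list_names = list_names
        self._get_data = get_data

    def __getitem__(self, key: str) -> bytes:
        # Data is fetched only here, never while listing names.
        if key not in self._list_names():
            raise KeyError(key)
        return self._get_data(key)

    def __iter__(self) -> Iterator[str]:
        return iter(self._list_names())

    def __len__(self) -> int:
        return len(self._list_names())


# Hypothetical stand-ins for the reader's internal functions.
_store: Dict[str, bytes] = {"a.txt": b"alpha", "b.txt": b"beta"}
calls: List[str] = []  # records which attachments were actually fetched


def fake_list() -> List[str]:
    return list(_store)


def fake_get(name: str) -> bytes:
    calls.append(name)
    return _store[name]


attachments = LazyAttachments(fake_list, fake_get)
```

Iterating over `attachments` or taking its length only calls the listing function; the data callable runs when a specific key is indexed. This sketch assumes unique names, so it does not cover the duplicate-name case.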
|
I've just added a commit with a test case I missed: multiple files with the same name (Acrobat Reader accepts this case). If multiple inputs have the same name, the name appears once per file in list_attachments. In the dict returned by get_attachments, their contents are collected in a list, in the same order as in the PDF attachment array. |
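The duplicate-name behaviour described above can be sketched in plain Python. This is a minimal illustration, assuming the embedded files are available as an ordered sequence of `(name, data)` pairs; `list_names` and `group_by_name` are hypothetical helpers, not the functions added in this PR.

```python
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, bytes]


def list_names(entries: Iterable[Entry]) -> List[str]:
    """Return every name in file order; duplicates are kept."""
    return [name for name, _ in entries]


def group_by_name(entries: Iterable[Entry]) -> Dict[str, List[bytes]]:
    """Collect the data of same-named files into one list, preserving order."""
    grouped: Dict[str, List[bytes]] = {}
    for name, data in entries:
        grouped.setdefault(name, []).append(data)
    return grouped


# Example input: two files share the name "a.txt".
entries: List[Entry] = [
    ("a.txt", b"first"),
    ("b.txt", b"other"),
    ("a.txt", b"second"),
]
names = list_names(entries)
grouped = group_by_name(entries)
```

With this input, `names` contains `"a.txt"` twice, while `grouped["a.txt"]` holds both payloads in their original order.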
|
@MartinThoma |
Add `PdfReader.attachments -> Mapping[str, List[bytes]]` as a public interface. The heavy lifting was done by @pubpub-zz in #1611. This PR only adds the interface for the existing functions.
New Features (ENH)
- Add reader.attachments public interface (#1611, #1661)
- Add PdfWriter.remove_objects_from_page(page: PageObject, to_delete: ObjectDeletionFlag) (#1648)
- Allow free-text annotation to have transparent border/background (#1664)

Bug Fixes (BUG)
- Allow decryption with empty password for AlgV5 (#1663)
- Let PdfWriter.pages return PageObject after calling `clone_document_from_reader()` (#1613)
- Invalid font pointed during merge_resources (#1641)

Robustness (ROB)
- Cope with invalid objects in IndirectObject.clone (#1637)
- Improve tolerance to invalid Names/Dests (#1658)
- Decode encoded values in get_fields (#1636)
- Let PdfWriter.merge cope with missing "/Fields" (#1628)

[Full Changelog](3.4.1...3.5.0)
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611)
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611) Part-of: #183165 Related: odoo/enterprise#71676 Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611) X-original-commit: c78870c
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611) X-original-commit: c78870c Part-of: #184787 Related: odoo/enterprise#72538 Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611) X-original-commit: c78870c Part-of: #185282 Related: odoo/enterprise#72781 Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611) X-original-commit: c78870c Part-of: #185736 Related: odoo/enterprise#73014 Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>
Try to override both old and new API, although for the new version it might actually be useless as pypdf seems to have added multiple attachments support circa 3.5? (py-pdf/pypdf#1611) X-original-commit: c78870c Part-of: #186301 Related: odoo/enterprise#73330 Signed-off-by: Christophe Monniez (moc) <moc@odoo.com>



add_attachments now allows producing a list of multiple files
get_attachments/list_attachments added
fixes #1047 #527 #169