ENH: Add Cloning #1371

Merged
MartinThoma merged 116 commits into py-pdf:main from pubpub-zz:cloning on Dec 11, 2022
Collaborator

@pubpub-zz pubpub-zz commented Sep 27, 2022

The method .clone(pdf_dest, [force_duplicate]) clones an object and all the objects it references.

If an object has already been cloned, the existing clone is returned (unless force_duplicate is set).
The method is mainly for internal use, but it can also be called on a page.
For PageObject / DictionaryObject / [Encoded/Decoded/Content]Stream, an extra parameter ignore_fields provides the list of fields that should not be cloned.

When available, the pointer to an object is exposed through the indirect_obj attribute.
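
To illustrate the behaviour described above, here is a minimal toy model in plain Python. It is not PyPDF2's implementation: ToyObject, its fields dictionary, and the use of a plain dict as the destination are all hypothetical stand-ins, and applying ignore_fields only to the top-level object is a simplification of this sketch.

```python
from typing import Any, Dict, Iterable


class ToyObject:
    """Hypothetical stand-in for a PDF dictionary object (not a PyPDF2 class)."""

    def __init__(self, **fields: Any) -> None:
        self.fields: Dict[str, Any] = dict(fields)

    def clone(
        self,
        dest: Dict[int, "ToyObject"],
        force_duplicate: bool = False,
        ignore_fields: Iterable[str] = (),
    ) -> "ToyObject":
        # `dest` plays the role of pdf_dest: it remembers which source
        # objects already have a clone, keyed by the source's identity.
        if not force_duplicate and id(self) in dest:
            return dest[id(self)]
        duplicate = ToyObject()
        # Register the clone before recursing so reference cycles terminate.
        dest[id(self)] = duplicate
        skipped = set(ignore_fields)
        for key, value in self.fields.items():
            if key in skipped:
                continue
            if isinstance(value, ToyObject):
                # Referenced objects are cloned as well (assumption of this
                # sketch: neither duplication nor ignore_fields is propagated).
                value = value.clone(dest)
            duplicate.fields[key] = value
        return duplicate


font = ToyObject(name="Helvetica")
page = ToyObject(font=font, annots=["link"])
dest: Dict[int, ToyObject] = {}

first = page.clone(dest, ignore_fields=["annots"])  # new clone without "annots"
second = page.clone(dest)                           # cached clone, returned as is
third = page.clone(dest, force_duplicate=True)      # fresh clone of the page
```

In this sketch, the second call returns the very object produced by the first call, while force_duplicate produces a new page clone that still shares the already cloned font.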

New API for add_page/insert_page:

  • it returns the cloned page object
  • ignore_fields can be provided as a parameter
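
Why returning the clone matters can be sketched with a toy writer. Everything here is hypothetical (ToyWriter, pages stored as plain dicts, copy.deepcopy standing in for cloning); it only models the idea that the writer keeps its own copy and hands that copy back to the caller.

```python
import copy
from typing import Any, Dict, Iterable, List

Page = Dict[str, Any]


class ToyWriter:
    """Hypothetical writer that stores copies of pages, never the originals."""

    def __init__(self) -> None:
        self.pages: List[Page] = []

    def add_page(self, page: Page, ignore_fields: Iterable[str] = ()) -> Page:
        skipped = set(ignore_fields)
        # Copy every field except the ignored ones.
        cloned = {k: copy.deepcopy(v) for k, v in page.items() if k not in skipped}
        self.pages.append(cloned)
        # Returning the copy lets the caller keep editing the writer's page.
        return cloned


source_page: Page = {"/Rotate": 0, "/Annots": ["link"]}
writer = ToyWriter()
added = writer.add_page(source_page, ignore_fields=["/Annots"])
added["/Rotate"] = 90  # changes the writer's copy, not the source page
```

With the returned object in hand, later modifications apply to the page held by the writer and leave the source page untouched.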

Others

  • The file is closed at the end of PdfWriter.write when a filename is provided
  • Breaking Change: add_outline_item now has a parameter before, which is not the last parameter
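
The file-closing rule in the first item can be sketched as "close only what you opened". The helper below is hypothetical (write_document is not a PyPDF2 function, and its return value is an assumption of this sketch); it only shows the ownership pattern.

```python
import io
import os
import tempfile
from typing import IO, Union


def write_document(target: Union[str, IO[bytes]], payload: bytes) -> IO[bytes]:
    """Write payload to a path or to an already open binary stream."""
    opened_here = isinstance(target, str)
    stream = open(target, "wb") if opened_here else target
    try:
        stream.write(payload)
    finally:
        # A stream passed in by the caller stays open; a file opened
        # from a filename is closed before returning.
        if opened_here:
            stream.close()
    return stream


buffer = io.BytesIO()
write_document(buffer, b"%PDF-1.7")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.pdf")
    handle = write_document(path, b"%PDF-1.7")
    path_closed = handle.closed
    with open(path, "rb") as check:
        written = check.read()
```

The caller's BytesIO remains usable after the call, while the file opened from the path is already closed.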

Update

  • The public API of PdfMerger has been added to PdfWriter (ready to make PdfMerger an alias of it)
  • Outline merging is processed properly
  • Named destinations are processed properly

Deals with #1194, #1322, #471, #1337

add cloning capability
includes:
* add clone function
* new  API for add_page/insert_page that returns the cloned page object
* close file when a file name is provided to PdfWriter.write
@pubpub-zz pubpub-zz marked this pull request as draft September 27, 2022 18:34
w.merge and w.append
to be in accordance with the PDF spec

add page cleanup for destinations in NameObject that do not match a TextStringObject in Names/Dests
@codecov

codecov bot commented Oct 15, 2022

Codecov Report

Base: 94.14% // Head: 92.70% // Decreases project coverage by -1.43% ⚠️

Coverage data is based on head (4ccfbff) compared to base (7633477).
Patch coverage: 84.45% of modified lines in pull request are covered.

❗ Current head 4ccfbff differs from pull request most recent head afebcab. Consider uploading reports for the commit afebcab to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1371      +/-   ##
==========================================
- Coverage   94.14%   92.70%   -1.44%     
==========================================
  Files          31       29       -2     
  Lines        5480     5691     +211     
  Branches     1037     1112      +75     
==========================================
+ Hits         5159     5276     +117     
- Misses        193      267      +74     
- Partials      128      148      +20     
Impacted Files Coverage Δ
PyPDF2/_merger.py 97.60% <ø> (+4.42%) ⬆️
PyPDF2/generic/_data_structures.py 89.75% <79.08%> (-5.57%) ⬇️
PyPDF2/_protocols.py 81.25% <81.25%> (ø)
PyPDF2/_writer.py 86.12% <84.11%> (-3.43%) ⬇️
PyPDF2/generic/_base.py 99.64% <98.36%> (-0.36%) ⬇️
PyPDF2/_page.py 92.23% <100.00%> (+0.28%) ⬆️
PyPDF2/_reader.py 90.33% <100.00%> (+0.04%) ⬆️
PyPDF2/types.py 100.00% <100.00%> (ø)
... and 11 more


Member

@MartinThoma MartinThoma left a comment


mypy didn't complain when I checked. As you asked me to look at mypy, I checked all 'type: ignore' comments. Several were not necessary at all. In some cases mypy needed an 'assert variable is not None' as a hint. And in some cases I could at least narrow the ignore down to be a bit more specific.
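
As an illustration of the assert hint mentioned in this review, here is a small standalone sketch. The function names are made up for the example and do not come from the PyPDF2 code base.

```python
from typing import Dict, Optional


def lookup(table: Dict[str, int], key: str) -> Optional[int]:
    """Return the stored value, or None when the key is missing."""
    return table.get(key)


def double_known(table: Dict[str, int], key: str) -> int:
    value = lookup(table, key)
    # Without this assert, mypy flags the multiplication below because
    # `value` may be None; after it, the type is narrowed to int.
    assert value is not None
    return value * 2


result = double_known({"a": 21}, "a")
```

Where an ignore really is needed, the bracketed form (for example `# type: ignore[arg-type]`) suppresses only the named error code instead of every error on the line.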

Comment thread PyPDF2/_page.py Outdated
Comment thread PyPDF2/generic/_base.py Outdated
Comment thread PyPDF2/generic/_base.py Outdated
Comment thread PyPDF2/generic/_base.py Outdated
Comment thread PyPDF2/generic/_base.py Outdated
Comment thread PyPDF2/generic/_data_structures.py Outdated
Comment thread PyPDF2/generic/_data_structures.py Outdated
Comment thread PyPDF2/generic/_data_structures.py Outdated
Comment thread PyPDF2/generic/_data_structures.py Outdated
Comment thread PyPDF2/generic/_data_structures.py Outdated
pubpub-zz and others added 4 commits October 16, 2022 10:34
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
Comment thread PyPDF2/_writer.py Outdated
@MartinThoma
Member

Finally! I'll have another quick look at the code and then merge today :-)

@MartinThoma MartinThoma merged commit 74b8a63 into py-pdf:main Dec 11, 2022
@MartinThoma
Member

@pubpub-zz Thank you so much for this moonshot extension 🙏 ❤️

@xilopaint
Contributor

@pubpub-zz thanks for all the effort you've put into this PR!

@MartinThoma MartinThoma removed the soon PRs that are almost ready to be merged, issues that get solved pretty soon label Dec 12, 2022
MartinThoma added a commit that referenced this pull request Dec 22, 2022
BREAKING CHANGES:
-  Deprecate features with PyPDF2==3.0.0 (#1489)
-  Refactor Fit / Zoom parameters (#1437)

New Features (ENH):
-  Add Cloning  (#1371)
-  Allow int for indirect_reference in PdfWriter.get_object (#1490)

Documentation (DOC):
-  How to read PDFs from S3 (#1509)
-  Make MyST parse all links as simple hyperlinks (#1506)
-  Changed 'latest' for 'stable' generated docs (#1495)
-  Adjust deprecation procedure (#1487)

Maintenance (MAINT):
-  Use typing.IO for file streams (#1498)

[Full Changelog](2.12.1...3.0.0)
@pubpub-zz pubpub-zz deleted the cloning branch June 24, 2023 08:41