Closed
Labels
needs-pdf: The issue needs a PDF file to show the problem
Description
Explanation
I want to extract all images from a page but skip inline images, since they are not useful in my case and only add overhead. For one page containing a dotted table with 24643 inline images and no "real" images, extraction takes 2 ms without inline images and 29 s with them.
Code Example
For now, I am basically exploiting the following check (line 478 in c2a741e):
    if self.inline_images is None:
from pypdf import PdfReader

reader = PdfReader(path)  # path: location of the PDF file
for page in reader.pages:
    page.inline_images = dict()  # Avoid loading inline images.
    for image in page.images:
        print(image)
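To illustrate why pre-setting the attribute helps, here is a toy sketch of the lazy-initialization pattern that the `is None` check quoted above suggests. `FakePage`, `_parse_inline_images`, and `parse_calls` are hypothetical stand-ins that I made up for this illustration; this is not pypdf's actual code, and the real `images` property also returns regular images, which the toy leaves out.

```python
class FakePage:
    """Hypothetical stand-in for a page object; not pypdf's implementation."""

    def __init__(self):
        self.inline_images = None  # None means "not parsed yet"
        self.parse_calls = 0       # counts how often the expensive step ran

    def _parse_inline_images(self):
        # Placeholder for the expensive content-stream scan.
        self.parse_calls += 1
        return {"inline-0": b"inline image data"}

    @property
    def images(self):
        # Same shape as the quoted check: parse only while still None.
        if self.inline_images is None:
            self.inline_images = self._parse_inline_images()
        return list(self.inline_images.items())


default_page = FakePage()
default_images = default_page.images   # triggers the expensive parse

patched_page = FakePage()
patched_page.inline_images = dict()    # pre-set, so the None check is false
patched_images = patched_page.images   # the parse is skipped
```

An empty dict is not `None`, so the guarded branch never runs. The workaround therefore depends on that check staying in place, which is why a supported option would be preferable to relying on it.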