Closed
Description
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
macOS-13.4.1-arm64-arm-64bit
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf 3.14.0
# though what I actually got was this:
$ python -c "import pypdf;print(pypdf._debug_versions)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'pypdf' has no attribute '_debug_versions'
Code + PDF
I was extracting text from the images embedded in a PDF with clown_sort (see my other filed issues).
whitepaper WBT token blockchain whitepaper.pdf
Traceback
➤ ValueError: conversion from 1 to LA not supported while parsing embedded image 1 on page 14...
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/uzoreus/Library/Caches/pypoetry/virtualenvs/clown-sort-zLqmJuxs-py3.11/lib/python3.11/sit │
│ e-packages/PIL/Image.py:1835 in putalpha │
│ │
│ 1832 │ │ │ try: │
│ 1833 │ │ │ │ mode = getmodebase(self.mode) + "A" │
│ 1834 │ │ │ │ try: │
│ ❱ 1835 │ │ │ │ │ self.im.setmode(mode) │
│ 1836 │ │ │ │ except (AttributeError, ValueError) as e: │
│ 1837 │ │ │ │ │ # do things the hard way │
│ 1838 │ │ │ │ │ im = self.im.convert(mode) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: image has wrong mode
During handling of the above exception, another exception occurred:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/uzoreus/workspace/clown_sort/clown_sort/files/pdf_file.py:61 in extracted_text │
│ │
│ 58 │ │ │ │ │
│ 59 │ │ │ │ # Extracting images is a bit fraught (lots of PIL and pypdf exceptions h │
│ 60 │ │ │ │ try: │
│ ❱ 61 │ │ │ │ │ for image_number, image in enumerate(page.images, start=1): │
│ 62 │ │ │ │ │ │ image_name = f"Page {page_number}, Image {image_number}" │
│ 63 │ │ │ │ │ │ self._log_to_stderr(f" Processing {image_name}...") │
│ 64 │ │ │ │ │ │ page_buffer.print(Panel(image_name, expand=False)) │
│ │
│ /Users/uzoreus/Library/Caches/pypoetry/virtualenvs/clown-sort-zLqmJuxs-py3.11/lib/python3.11/sit │
│ e-packages/pypdf/_page.py:2633 in __iter__ │
│ │
│ 2630 │ │
│ 2631 │ def __iter__(self) -> Iterator[ImageFile]: │
│ 2632 │ │ for i in range(len(self)): │
│ ❱ 2633 │ │ │ yield self[i] │
│ 2634 │ │
│ 2635 │ def __str__(self) -> str: │
│ 2636 │ │ p = [f"Image_{i}={n}" for i, n in enumerate(self.ids_function())] │
│ │
│ /Users/uzoreus/Library/Caches/pypoetry/virtualenvs/clown-sort-zLqmJuxs-py3.11/lib/python3.11/sit │
│ e-packages/pypdf/_page.py:2629 in __getitem__ │
│ │
│ 2626 │ │ │ index = len_self + index │
│ 2627 │ │ if index < 0 or index >= len_self: │
│ 2628 │ │ │ raise IndexError("sequence index out of range") │
│ ❱ 2629 │ │ return self.get_function(lst[index]) │
│ 2630 │ │
│ 2631 │ def __iter__(self) -> Iterator[ImageFile]: │
│ 2632 │ │ for i in range(len(self)): │
│ │
│ /Users/uzoreus/Library/Caches/pypoetry/virtualenvs/clown-sort-zLqmJuxs-py3.11/lib/python3.11/sit │
│ e-packages/pypdf/_page.py:534 in _get_image │
│ │
│ 531 │ │ │ │ │ raise KeyError("no inline image can be found") │
│ 532 │ │ │ │ return self.inline_images[id] │
│ 533 │ │ │ │
│ ❱ 534 │ │ │ imgd = _xobj_to_image(cast(DictionaryObject, xobjs[id])) │
│ 535 │ │ │ extension, byte_stream = imgd[:2] │
│ 536 │ │ │ f = ImageFile( │
│ 537 │ │ │ │ name=f"{id[1:]}{extension}", │
│ │
│ /Users/uzoreus/Library/Caches/pypoetry/virtualenvs/clown-sort-zLqmJuxs-py3.11/lib/python3.11/sit │
│ e-packages/pypdf/filters.py:1040 in _xobj_to_image │
│ │
│ 1037 │ │ │ │ alpha = alpha.convert("L") │
│ 1038 │ │ │ if img.mode == "P": │
│ 1039 │ │ │ │ img = img.convert("RGB") │
│ ❱ 1040 │ │ │ img.putalpha(alpha) │
│ 1041 │ │ if "JPEG" in image_format: │
│ 1042 │ │ │ extension = ".jp2" │
│ 1043 │ │ │ image_format = "JPEG2000" │
│ │
│ /Users/uzoreus/Library/Caches/pypoetry/virtualenvs/clown-sort-zLqmJuxs-py3.11/lib/python3.11/sit │
│ e-packages/PIL/Image.py:1838 in putalpha │
│ │
│ 1835 │ │ │ │ │ self.im.setmode(mode) │
│ 1836 │ │ │ │ except (AttributeError, ValueError) as e: │
│ 1837 │ │ │ │ │ # do things the hard way │
│ ❱ 1838 │ │ │ │ │ im = self.im.convert(mode) │
│ 1839 │ │ │ │ │ if im.mode not in ("LA", "PA", "RGBA"): │
│ 1840 │ │ │ │ │ │ raise ValueError from e # sanity check │
│ 1841 │ │ │ │ │ self.im = im │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: conversion from 1 to LA not supported
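A note on the calling side: in the traceback the `try` wraps the whole `for` loop over `page.images`, so one undecodable image ends extraction for every remaining image on the page. Below is a minimal sketch of a caller-side workaround, not a fix for the underlying conversion error. It assumes only what the traceback shows, namely that the image collection supports `len()` and integer indexing. `iter_tolerant` and `FlakySequence` are hypothetical names made up for illustration; they are not part of pypdf or clown_sort.

```python
from typing import Any, Iterator, Tuple


def iter_tolerant(items: Any) -> Iterator[Tuple[int, Any]]:
    """Yield (number, item) pairs, numbered from 1.

    If fetching one item raises, yield (number, exception) instead and
    keep going, so a single bad item does not abort the whole loop.
    """
    for index in range(len(items)):
        try:
            item = items[index]
        except Exception as error:  # e.g. the ValueError raised by putalpha
            yield index + 1, error
        else:
            yield index + 1, item


class FlakySequence:
    """Hypothetical stand-in for page.images whose second item fails."""

    def __len__(self) -> int:
        return 3

    def __getitem__(self, index: int) -> str:
        if index == 1:
            raise ValueError("conversion from 1 to LA not supported")
        return f"image-{index}"


results = list(iter_tolerant(FlakySequence()))
for number, outcome in results:
    if isinstance(outcome, Exception):
        print(f"Image {number}: skipped ({outcome})")
    else:
        print(f"Image {number}: {outcome}")
```

In the real loop this would presumably look like `for image_number, image in iter_tolerant(page.images):`, with an `isinstance(image, Exception)` check before processing. The other images on page 14 would then still be extracted while the failing one is logged.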