Closed
Labels
is-robustness-issue: From a user's perspective, this is about robustness
Description
When decoding a PDF, the second of these two lines in handle_tj raises an exception:
if orientation in orientations:
    if isinstance(operands[0], str):
At that point len(operands) == 0, so operands[0] raises an IndexError.
The first line should be changed to:
if orientation in orientations and len(operands) > 0:
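A minimal sketch of the proposed guard, isolated from pypdf so it can be run on its own. The function name `first_operand_is_str` and the `ORIENTATIONS` default are hypothetical stand-ins for illustration; this is not the actual body of `handle_tj`, only the conditional described above.

```python
from typing import Any, Sequence, Tuple

# Assumed set of orientations; the real values come from the caller.
ORIENTATIONS: Tuple[int, ...] = (0, 90, 180, 270)


def first_operand_is_str(
    operands: Sequence[Any],
    orientation: int,
    orientations: Tuple[int, ...] = ORIENTATIONS,
) -> bool:
    """Hypothetical stand-in for the guarded check proposed for handle_tj.

    Returns False instead of raising IndexError when a Tj operator
    arrives with an empty operand list.
    """
    # The extra len() test is the proposed change: it short-circuits
    # before operands[0] is evaluated on an empty list.
    if orientation in orientations and len(operands) > 0:
        return isinstance(operands[0], str)
    return False
```

With the guard in place, an empty operand list is skipped rather than indexed, which is the behaviour the suggested one-line change is meant to achieve.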
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
macOS-14.1.1-x86_64-i386-64bit
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==3.17.1, crypt_provider=('cryptography', '3.3.2'), PIL=10.1.0
Code + PDF
This is a minimal, complete example that shows the issue:
# sorry; PDF is confidential
Traceback
This is the complete traceback I see:
<our software>
    page_text = page_obj.extract_text()
  File "/Users/rgwood/mambaforge/envs/e_16730_w3/lib/python3.9/site-packages/pypdf/_page.py", line 2279, in extract_text
    return self._extract_text(
  File "/Users/rgwood/mambaforge/envs/e_16730_w3/lib/python3.9/site-packages/pypdf/_page.py", line 2115, in _extract_text
    process_operation(b"Tj", operands)
  File "/Users/rgwood/mambaforge/envs/e_16730_w3/lib/python3.9/site-packages/pypdf/_page.py", line 2075, in process_operation
    text, rtl_dir = handle_tj(
  File "/Users/rgwood/mambaforge/envs/e_16730_w3/lib/python3.9/site-packages/pypdf/_text_extraction/__init__.py", line 220, in handle_tj
    if isinstance(operands[0], str):
IndexError: list index out of range
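Until a fix is available, callers could shield themselves from this traceback with a wrapper along the following lines. This is only a sketch: `safe_extract_text` is a hypothetical helper, not part of pypdf, and it assumes nothing about the page object beyond an `extract_text()` method like the one in the traceback. Note that it discards the text of the affected page entirely.

```python
from typing import Any


def safe_extract_text(page_obj: Any, fallback: str = "") -> str:
    """Hypothetical caller-side workaround, not part of pypdf.

    Calls page_obj.extract_text() and returns `fallback` if the
    IndexError from an empty Tj operand list propagates out.
    """
    try:
        return page_obj.extract_text()
    except IndexError:
        # Text for this page is lost; only the crash is avoided.
        return fallback
```

In the calling code from the traceback, `page_text = safe_extract_text(page_obj)` would replace the direct `page_obj.extract_text()` call.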