process_operation raises "TypeError: a bytes-like object is required, not 'dict'" #953

@MartinThoma

Description

When I try to extract the text from the PDF below, I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/_page.py", line 1263, in extract_text
    return self._extract_text(self, self.pdf, space_width, PG.CONTENTS)
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/_page.py", line 1245, in _extract_text
    process_operation(operator, operands)
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/_page.py", line 1197, in process_operation
    text += operands[0].translate(cmap)
TypeError: a bytes-like object is required, not 'dict'

Fixing this issue would likely also fix #523
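
The error message in the traceback is the one Python raises when `bytes.translate` is given a dict, whereas `str.translate` accepts a dict. That suggests, without confirming it, that `operands[0]` was a bytes-like object rather than a string for this PDF. A minimal sketch of that difference in isolation, using a made-up `cmap` and made-up operands rather than PyPDF2's real data:

```python
# Hypothetical character map: code point -> replacement string.
# This only mimics the shape of a dict passed to translate();
# it is not PyPDF2's actual cmap.
cmap = {ord("a"): "A"}

# str.translate accepts a dict keyed by code point.
text_operand = "abc"
translated = text_operand.translate(cmap)
print(translated)

# bytes.translate expects a 256-byte translation table,
# so passing a dict raises TypeError.
bytes_operand = b"abc"
error_message = None
try:
    bytes_operand.translate(cmap)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If that reading is right, the failing branch in `process_operation` would need to decode or otherwise handle a bytes operand before calling `translate`.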

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.4.0-113-generic-x86_64-with-debian-bullseye-sid

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.0.0 (current main - 2.1.0)

Code

PDF: https://github.com/mstamy2/PyPDF2/files/3796761/17343_2008_Order_09-Jan-2019.pdf

from PyPDF2 import PdfReader

reader = PdfReader('17343_2008_Order_09-Jan-2019.pdf')
page = reader.pages[0]
page.extract_text()
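
Until the bug is fixed, a caller could keep one failing page from aborting a whole document by catching the `TypeError` per page. The helper below is a hypothetical sketch, not part of PyPDF2; it only assumes each page object exposes `extract_text()`, as in the snippet above.

```python
from typing import Iterable, List, Optional


def extract_text_safely(pages: Iterable) -> List[Optional[str]]:
    """Return the text of each page, or None where extraction raised TypeError.

    Hypothetical helper: `pages` is any iterable of objects that have
    an extract_text() method.
    """
    results: List[Optional[str]] = []
    for index, page in enumerate(pages):
        try:
            results.append(page.extract_text())
        except TypeError as exc:
            # Record the failure and continue with the remaining pages.
            print(f"page {index}: text extraction failed: {exc}")
            results.append(None)
    return results
```

With the `reader` from the reproduction above, this would be called as `extract_text_safely(reader.pages)`. It hides the symptom only; the affected page still yields no text.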

Metadata

Labels

Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
workflow-text-extraction: From a user's perspective, text extraction is the affected feature/workflow