extracting incorrect text from .extractText() #880

@pazazzwang

Description

The following call extracts incorrect text:
PyPDF2.PdfFileReader(file_handle).getPage(2).extractText()

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
# TODO: Your output goes here

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
# TODO: Your output goes here

Code

This is a minimal, complete example that shows the issue:

# TODO: Your code goes here
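
The reporter did not fill in the example, so the following is only a sketch of what a reproduction might look like, built from the call quoted in the description. The helper name extract_page_text and the file name in the usage note are assumptions, and it presumes a PyPDF2 release that still ships the legacy PdfFileReader / getPage / extractText names.

```python
def extract_page_text(pdf_path, page_index=2):
    """Return the text PyPDF2 extracts from one page (0-based index).

    Hypothetical helper: it wraps the call quoted in the description and is
    not code supplied by the reporter.
    """
    # Imported inside the function so the helper can be defined even where
    # PyPDF2 is not installed.
    import PyPDF2

    # PdfFileReader reads from a binary file handle; keep the handle open
    # until the page text has been extracted.
    with open(pdf_path, "rb") as file_handle:
        reader = PyPDF2.PdfFileReader(file_handle)
        page = reader.getPage(page_index)  # index 2 is the third page
        return page.extractText()
```

Calling print(extract_page_text("Coraline.pdf")) would, under these assumptions, print the text the library extracts from the third page of the attached file, so it can be compared against what a PDF viewer shows.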

PDF

Coraline.pdf

Share here the PDF file(s) that cause the issue. The smaller they are, the
better. Let us know if we may add them to our tests!

Labels

is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
workflow-text-extraction: From a user's perspective, text extraction is the affected feature/workflow
