
PdfReadWarning: Superfluous whitespace found in object header b'1' b'0' [pdf.py:1666] #576

@kalkovid19

Hi all,
I am converting a PDF file to text for processing. The code was working fine, but recently it started emitting the warning below and no longer extracts any text:
PdfReadWarning: Superfluous whitespace found in object header b'1' b'0' [pdf.py:1666]

MCVE

from PyPDF2 import PdfReader

reader = PdfReader("TN_24.08.2020.pdf")
text = reader.pages[0].extract_text()
assert "Directorate" in text, text
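
To check whether the warning is the actual cause or only a side effect, one option is to record the warnings raised during extraction and inspect them next to the extracted text. The sketch below uses only the standard `warnings` module for the capture. The helper names `call_and_collect_warnings` and `extract_first_page` are hypothetical and not part of PyPDF2. It also assumes the message is emitted through `warnings.warn`; some PyPDF2 versions may route it through `logging` instead, in which case nothing would be captured here.

```python
import warnings
from typing import Callable, List, Tuple


def call_and_collect_warnings(fn: Callable[[], str]) -> Tuple[str, List[str]]:
    """Run fn and return its result plus the text of every warning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including repeats that are normally suppressed.
        warnings.simplefilter("always")
        result = fn()
    return result, [str(w.message) for w in caught]


def extract_first_page(path: str) -> str:
    """Hypothetical wrapper around the MCVE; returns '' if nothing is extracted."""
    # Imported lazily so the helper above can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    return reader.pages[0].extract_text() or ""
```

Calling `call_and_collect_warnings(lambda: extract_first_page("TN_24.08.2020.pdf"))` would return the extracted text together with the list of warning messages, so an empty string alongside the "Superfluous whitespace" message can be reported as one result.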

My PDF file and processing code are attached:
pdf2txt.py.txt
TN_24.08.2020.pdf

Thanks in advance

    Labels

    Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
    is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
    is-robustness-issue: From a user's perspective, this is about robustness
