Line returns missing in text_extraction() #2138

@pubpub-zz

PDF file:
https://github.com/py-pdf/pypdf/files/12483807/AEO.1172.pdf

Can you also test the page.extract_text() function? It always seems to join text that spans multiple lines without inserting a space.
See the first page of my attached file.
Originally posted by @yonglee7015 in #2135 (reply in thread)
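
To make the report easier to reproduce, here is a minimal sketch, assuming the attached PDF has been saved locally as AEO.1172.pdf (the local path is an assumption, as are the helper names first_page_text and summarize_line_breaks, which are made up for illustration). It uses only PdfReader and page.extract_text() from pypdf and prints the result with repr() so that missing "\n" characters are easy to spot.

```python
from pathlib import Path


def first_page_text(pdf_path: str) -> str:
    """Return the text pypdf extracts from the first page of the PDF."""
    # Imported lazily so the helper below can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return reader.pages[0].extract_text()


def summarize_line_breaks(text: str) -> dict:
    """Report how many lines the text has and whether it contains any newline."""
    lines = text.splitlines()
    return {"line_count": len(lines), "has_newline": "\n" in text}


if __name__ == "__main__":
    # Assumed local copy of the file attached to this issue.
    pdf = Path("AEO.1172.pdf")
    if pdf.exists():
        text = first_page_text(str(pdf))
        print(summarize_line_breaks(text))
        # repr() makes "\n" and spaces visible in the output.
        print(repr(text[:300]))
```

If the first page visibly has several lines but line_count comes back as 1, that would match the behavior described above.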

Metadata

Labels

workflow-text-extraction: From a user's perspective, text extraction is the affected feature/workflow
