Line returns missing in text_extraction() #2138
Closed
Labels
workflow-text-extraction: From a user's perspective, text extraction is the affected feature/workflow
Description
PDF file:
https://github.com/py-pdf/pypdf/files/12483807/AEO.1172.pdf
Can you also test the page.extract_text() function? It always seems to join the sentences of multi-line text together without a space or line break.
See the first page of my attached file.
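
A minimal reproduction sketch, assuming the attached PDF has been saved locally as `AEO.1172.pdf` (the local path is an assumption) and that pypdf is installed. `PdfReader`, `reader.pages` and `page.extract_text()` are the pypdf calls; `extract_first_page` and `summarize_extraction` are hypothetical helper names used only here to make missing line breaks easy to spot.

```python
from pathlib import Path


def summarize_extraction(text: str) -> dict:
    """Count characters and lines so missing line breaks stand out."""
    lines = text.splitlines()
    return {
        "chars": len(text),
        "lines": len(lines),
        "longest_line": max((len(line) for line in lines), default=0),
    }


def extract_first_page(pdf_path: str) -> str:
    """Return the text of the first page using pypdf."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    return reader.pages[0].extract_text()


if __name__ == "__main__":
    pdf_path = "AEO.1172.pdf"  # assumed local copy of the attached file
    if Path(pdf_path).exists():
        text = extract_first_page(pdf_path)
        print(summarize_extraction(text))
        print(repr(text[:500]))  # repr() makes "\n" characters visible
```

If the line breaks are lost, the summary should show very few lines and one very long line compared with what the page looks like in a PDF viewer.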

Originally posted by @yonglee7015 in #2135 (reply in thread)