unstructured
bug/partition_pdf removes spaces from the text
Describe the bug
Some spaces are removed from the text when partitioning a PDF document.
To Reproduce
PDF: rok_20230930_1-1.pdf
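from unstructured.partition.pdf import partition_pdf

elements = partition_pdf(
    filename="rok_20230930_1-1.pdf",
    strategy="hi_res",
    infer_table_structure=True,
)
# Element 20 is the one that shows the missing spaces.
print(str(elements[20]))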
Current behavior
Nameofeachexchangeonwhichregistered NewYorkStockExchange
Expected behavior
Name of each exchange on which registered New York Stock Exchange
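To find other affected elements without reading every one by hand, a rough heuristic might help. The sketch below is not part of unstructured: `looks_space_collapsed` is a hypothetical helper, and the 25-character threshold is an assumption that may need tuning for documents with long URLs or identifiers. It simply flags text that contains an unusually long whitespace-delimited token.

```python
def looks_space_collapsed(text: str, max_token_len: int = 25) -> bool:
    """Return True if any whitespace-delimited token exceeds max_token_len.

    Hypothetical helper: the threshold is an assumed value, not a library default.
    """
    return any(len(token) > max_token_len for token in text.split())


# The two strings reported in this issue.
current = "Nameofeachexchangeonwhichregistered NewYorkStockExchange"
expected = "Name of each exchange on which registered New York Stock Exchange"

print(looks_space_collapsed(current))   # True: the first token is 35 characters
print(looks_space_collapsed(expected))  # False: the longest token is 10 characters
```

Applied to the output above, something like `[i for i, el in enumerate(elements) if looks_space_collapsed(str(el))]` should list the indices worth inspecting.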