
Trouble with apostrophes in names in text "O'Doul" #384

@chrisjcameron

I ended up adding a single quote to the "odd escape sequence list" in readStringFromStream() in generic.py to resolve an issue I was having with some PDFs I am processing. I am not sure what undesirable consequences this might have, but it seems to resolve my specific issue. (I cannot share the source PDFs, as they are confidential.)

The added token is b_("'")
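
To make the change easier to picture, here is a minimal, self-contained sketch of the idea. It is not the actual code from generic.py: the function name read_literal_string, the TOLERATED_ESCAPES constant, and the contents of that list are assumptions made for illustration, and octal escapes and line continuations are left out for brevity. It shows a reader for PDF literal strings that rejects unknown backslash escapes unless the escaped character is on a tolerated list, and how adding the single quote to that list lets (O\'Doul) parse.

```python
import io

# Standard escapes for PDF literal strings (octal escapes omitted in this sketch).
STANDARD_ESCAPES = {
    b"n": b"\n", b"r": b"\r", b"t": b"\t", b"b": b"\b", b"f": b"\f",
    b"(": b"(", b")": b")", b"\\": b"\\",
}

# Hypothetical stand-in for the "odd escape sequence list"; the real list in
# generic.py may differ. The trailing single quote is the token added here.
TOLERATED_ESCAPES = b" /%<>[]#_&$'"


def read_literal_string(stream):
    """Read a PDF literal string such as (O\\'Doul) from a binary stream."""
    if stream.read(1) != b"(":
        raise ValueError("literal string must start with '('")
    depth = 1  # parentheses may nest inside a literal string
    out = bytearray()
    while True:
        tok = stream.read(1)
        if not tok:
            raise ValueError("stream ended inside a literal string")
        if tok == b"(":
            depth += 1
        elif tok == b")":
            depth -= 1
            if depth == 0:
                break
        elif tok == b"\\":
            tok = stream.read(1)
            if not tok:
                raise ValueError("stream ended after a backslash")
            if tok in STANDARD_ESCAPES:
                tok = STANDARD_ESCAPES[tok]
            elif tok in TOLERATED_ESCAPES:
                pass  # unnecessary escape: keep the character, drop the backslash
            else:
                raise ValueError("unexpected escaped character: %r" % tok)
        out += tok
    return bytes(out)


if __name__ == "__main__":
    print(read_literal_string(io.BytesIO(b"(O\\'Doul)")))
```

With the single quote on the tolerated list, the backslash is dropped and the apostrophe is kept. Without it, this sketch raises an error on the same input. As far as I understand the PDF specification, a backslash followed by a character that is not a defined escape should simply be ignored, so accepting \' this way should be consistent with how compliant readers treat it.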

    Labels

    is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
    workflow-text-extraction: From a user's perspective, text extraction is the affected feature/workflow
