Extract Arabic text from PDF #591

@dihia-lanasri

Hello,

I have a PDF containing Arabic text and need to extract it, but the output I get looks like this:
f˘£˘∏˘≤â GCh∫ GCeù¢, b˘Éa˘∏˘á J†°˘Ée˘æ«á eø h’já Gdû°∏∞ fëƒ
fl«˘ª˘Éä Gd˘ÓL˘ÄÚ Gdü°˘ëôGhjÚ eƒL¡á d∏û°©

How can I decode it? Decoding as UTF-8 doesn't work.
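
To illustrate the symptom, here is a minimal Python sketch that measures how much of an extracted string is actually Arabic. `arabic_ratio` is a hypothetical helper written for this illustration, not part of any PDF library. The sample above is already a valid Unicode string with no Arabic code points in it, so re-decoding it as UTF-8 cannot recover the original text. That suggests the wrong characters were produced during extraction itself (possibly from the font's character mapping, though that is an assumption).

```python
import unicodedata


def arabic_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are Arabic.

    Hypothetical diagnostic helper: a character counts as Arabic when
    its Unicode name starts with "ARABIC".
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(
        1 for ch in letters if unicodedata.name(ch, "").startswith("ARABIC")
    )
    return arabic / len(letters)


# Part of the garbled output quoted above: letters are present, none Arabic.
garbled = "f˘£˘∏˘≤â GCh∫ GCeù¢"
print(arabic_ratio(garbled))   # 0.0

# A genuine Arabic word for comparison.
print(arabic_ratio("مرحبا"))    # 1.0
```

A ratio near zero on a page known to be Arabic is a quick way to flag pages where extraction went wrong before trying any fix.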

Labels:
is-bug: From a user's perspective, this is a bug, a violation of the expected behavior with a compliant PDF.
workflow-text-extraction: From a user's perspective, text extraction is the affected feature/workflow.
