Ocrmypdf fails due to Tesseract failed to report available languages #2504
Closed
Labels
docker: All things regarding docker setup
Description
I'm on version 0.41.0 and I just noticed that I can't select text in my imported PDF (a scanned document).
Looking at the job log, I found this:
Tue, February 20th, 2024, 21:03: Running external command: ocrmypdf -l deu --skip-text --deskew -j 1 /tmp/docspell-convert/docspell-ocrmypdf17124542125895878539/infile /tmp/docspell-convert/docspell-ocrmypdf17124542125895878539/out.pdf
Tue, February 20th, 2024, 21:03: Command `ocrmypdf -l deu --skip-text --deskew -j 1 /tmp/docspell-convert/docspell-ocrmypdf17124542125895878539/infile /tmp/docspell-convert/docspell-ocrmypdf17124542125895878539/out.pdf` finished: 3
Tue, February 20th, 2024, 21:03: ocrmypdf stdout:
Tue, February 20th, 2024, 21:03: ocrmypdf stderr: Tesseract failed to report available languages.
Output from Tesseract:
-----------
[DS] Profile file not available (tesseract_opencl_profile_devices.dat); performing profiling.
[DS] Device: "(null)" (Native) evaluation...
Error in pixCloseBrick: pixs not 1 bpp
Error in pixOpenBrick: pixs not defined
Error in pixSubtract: pixs1 not defined
Error in pixOpenBrick: pixs not defined
Error in pixOpenBrick: pixs not defined
[DS] Device: "(null)" (Native) evaluated
[DS] composeRGBPixel: 0.017794 (w=1.2)
[DS] HistogramRect: 0.015793 (w=2.4)
[DS] ThresholdRectToPix: 0.025850 (w=4.5)
[DS] getLineMasksMorph: 0.000040 (w=5.0)
[DS] Score: 0.175782
[DS] Scores written to file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 0:(null) score is 0.175782
[DS] Selected Device[1]: "(null)" (Native)
List of available languages in "/usr/share/tessdata/" (23):
ces dan deu est fin fra heb ita jpn jpn_vert khm lav lit nld nor pol por ron rus slk spa swe ukr
Tue, February 20th, 2024, 21:03: PDF conversion failed: Command result=3. No output file found.. Go without PDF file
Tue, February 20th, 2024, 21:03: Closing process: `ocrmypdf -l deu --skip-text --deskew -j 1 /tmp/docspell-convert/docspell-ocrmypdf17124542125895878539/infile /tmp/docspell-convert/docspell-ocrmypdf17124542125895878539/out.pdf`
I don't think I ever saw this error when importing my ~1000 documents.
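One possible reading of the log, which is an assumption and not something confirmed here, is that the OpenCL profiling lines (`[DS] ...`) printed before the language list confuse whatever parses the output of `tesseract --list-langs`. The sketch below is not ocrmypdf's code. It is a hypothetical helper (`parse_languages`, with an abridged `SAMPLE` modelled on the stderr above) that shows how the language list could be read while skipping any preamble, assuming Tesseract prints one language code per line after the header.

```python
from __future__ import annotations

# Abridged sample modelled on the stderr in the job log; the real
# output lists 23 languages. One code per line is an assumption.
SAMPLE = """\
[DS] Profile file not available (tesseract_opencl_profile_devices.dat); performing profiling.
[DS] Selected Device[1]: "(null)" (Native)
List of available languages in "/usr/share/tessdata/" (23):
ces
dan
deu
"""

HEADER_PREFIX = "List of available languages"


def parse_languages(output: str) -> set[str]:
    """Return the language codes that follow the header line.

    Lines before the header (for example OpenCL profiling noise)
    are ignored instead of being treated as part of the list.
    """
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.startswith(HEADER_PREFIX):
            # Everything after the header is taken as a language code.
            return {code.strip() for code in lines[index + 1:] if code.strip()}
    raise ValueError("no language list header found in Tesseract output")


if __name__ == "__main__":
    langs = parse_languages(SAMPLE)
    print("deu" in langs)
```

Running `tesseract --list-langs` by hand inside the container and comparing its output with the log should show whether the profiling lines appear there as well.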