extra spaces in result when ocr chinese #991

@BackT0TheFuture

Environment: Windows 8.1 64-bit, tesseract 4.0.0alpha, leptonica-1.74.2 (Jun 6 2017, 21:45:59) [MSC v.1910 LIB Release x64]

The result always contains extra spaces between characters when using OEM LstmOnly or TesseractAndLstm. OEM TesseractOnly does not insert the extra spaces, but its recognition accuracy is poor.

image

oem: Default psm: SingleLine time: 114 ms. result: 伦 敦 楼 房 发 生 火 灾 中 使 馆 关 注 : 暂 无 中 国 公 民 受 伤

oem: TesseractOnly psm: AutoOsd time: 518 ms. result: 伦敦夺委房发生火火中使馆大汪 二 暂无中 ` 又伤
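
Until the engine itself stops emitting these spaces, the LSTM output can be cleaned up after the fact. Below is a minimal post-processing sketch, not a fix in Tesseract: the function name `strip_cjk_spaces` and the chosen Unicode ranges are assumptions for illustration. It drops spaces or tabs that sit between two CJK characters, or between a CJK character and ASCII punctuation, and leaves spacing between Latin words untouched.

```python
import re

# Assumed character ranges (illustrative, not exhaustive):
# CJK symbols/punctuation, CJK unified ideographs, full-width forms.
CJK = r"\u3000-\u303f\u4e00-\u9fff\uff00-\uffef"
# ASCII punctuation, written as ranges usable inside a character class.
PUNCT = r"!-/:-@\[-`{-~"

# A gap is removed when a CJK character is on one side and a CJK
# character or ASCII punctuation mark is on the other side.
GAP = re.compile(
    r"(?<=[" + CJK + r"])[ \t]+(?=[" + CJK + PUNCT + r"])"
    r"|(?<=[" + PUNCT + r"])[ \t]+(?=[" + CJK + r"])"
)

def strip_cjk_spaces(text: str) -> str:
    """Remove spaces inserted between CJK characters by the OCR engine."""
    return GAP.sub("", text)

if __name__ == "__main__":
    lstm_result = "伦 敦 楼 房 发 生 火 灾 中 使 馆 关 注 : 暂 无 中 国 公 民 受 伤"
    print(strip_cjk_spaces(lstm_result))
```

Spaces next to Latin letters or digits are kept on purpose, so mixed Chinese/English lines keep their word boundaries; the trade-off is that a genuine space between two Chinese characters would also be removed.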

result_lines_chi.txt
