OCR on cyrillic text #1364

@Blightbuster

Summary of your issue

Running OCR on Cyrillic text yields an empty string, even with the correct model and whitelist.

Environment

OpenCVSharp 4.5.3.20211228

What did you do when you faced the problem?

  • Verified that OCR works with the English model on Latin characters
  • Tested the image from the example below in the console with the same model and got this result:
    "Дульный тормоз-компенсатор Зенит "ДТК-1" 762х39 и 5.45х59 для АК"

Example code:

image

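// Whitelist of upper- and lower-case Cyrillic letters
var whiteList = "АаБбВвГгДдЕеЁёЖжЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЪъЫыЬьЭэЮюЯя";
// Arguments: data path, language, whitelist, OCR engine mode (3 = default), page segmentation mode (7 = single text line)
var model = OCRTesseract.Create("models", "rus", whiteList, 3, 7);
string text = "";
// img is the Mat loaded from the image above
model.Run(img, out text, out _, out _, out _, ComponentLevels.TextLine);
if (text == "") Console.WriteLine("Empty");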
