{"id":4320,"date":"2019-11-15T12:56:10","date_gmt":"2019-11-15T12:56:10","guid":{"rendered":"http:\/\/holypython.com\/?p=4320"},"modified":"2021-03-28T00:39:33","modified_gmt":"2021-03-28T00:39:33","slug":"optical-character-recognition-ocr","status":"publish","type":"post","link":"https:\/\/holypython.com\/optical-character-recognition-ocr\/","title":{"rendered":"Optical Character Recognition (OCR)"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"4320\" class=\"elementor elementor-4320\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-53b32a8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"53b32a8\" data-element_type=\"section\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-cfc2f91\" data-id=\"cfc2f91\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f775917 elementor-widget elementor-widget-heading\" data-id=\"f775917\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">ABSTRACT<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-2eddac0 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2eddac0\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div 
class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fe88743\" data-id=\"fe88743\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-23250c6 elementor-widget elementor-widget-text-editor\" data-id=\"23250c6\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>In this tutorial we will take a closer look at the <code>pytesseract<\/code> module and some of its features. By the end, you should understand the basics of optical character recognition (OCR) in Python.<br \/><br \/><\/li><li>We will also use the <code>PIL<\/code> library for basic image handling in Python: opening an image, displaying it, and converting its type.<\/li><\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-973adc3 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"973adc3\" data-element_type=\"section\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a36fcae\" data-id=\"a36fcae\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-e1074ba elementor-widget elementor-widget-heading\" data-id=\"e1074ba\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title 
elementor-size-default\">TUTORIAL<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-bbfd19d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"bbfd19d\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7c4b8f8\" data-id=\"7c4b8f8\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ce22c01 elementor-widget elementor-widget-text-editor\" data-id=\"ce22c01\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Let&#8217;s start by importing the libraries we&#8217;re going to need.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f893228 elementor-widget elementor-widget-html\" data-id=\"f893228\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<pre rel= \"PYTHON\"><code>import PIL\nfrom PIL import Image\nimport pytesseract\n<\/code><\/pre>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-0bfb40f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0bfb40f\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column 
elementor-element elementor-element-7a467ac\" data-id=\"7a467ac\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-33c3448 elementor-widget elementor-widget-heading\" data-id=\"33c3448\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Here is some info about PIL<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-acec39c elementor-drop-cap-yes elementor-drop-cap-view-stacked elementor-widget elementor-widget-text-editor\" data-id=\"acec39c\" data-element_type=\"widget\" data-settings=\"{&quot;drop_cap&quot;:&quot;yes&quot;}\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre><b>NAME<\/b>\n    PIL - Pillow (Fork of the Python Imaging Library)\n\n<b>DESCRIPTION<\/b>\n    Pillow is the friendly PIL fork by Alex Clark and Contributors.\n        <a href=\"https:\/\/github.com\/python-pillow\/Pillow\/\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/python-pillow\/Pillow\/<\/a><\/pre>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-648d895 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"648d895\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-91c6603\" data-id=\"91c6603\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div 
class=\"elementor-element elementor-element-425a7af elementor-widget elementor-widget-heading\" data-id=\"425a7af\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Here is some info about pytesseract<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5fa77d5 elementor-widget elementor-widget-text-editor\" data-id=\"5fa77d5\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>pytesseract is a popular Python wrapper around the Tesseract OCR engine. Depending on your setup, pytesseract may not find the Tesseract executable on its own, in which case you need one extra line. Find your Tesseract installation directory and point <code>tesseract_cmd<\/code> to the executable with the code below. The path depends on your local setup, and you may or may not need to include the last part, for example:<\/p><p><span style=\"background-color: rgba(0, 0, 0, 0.05); white-space: pre-wrap;\">r&quot;C:\\Users\\USA\\Anaconda3\\Tesseract-OCR\\tesseract&quot; or <\/span><span style=\"white-space: pre-wrap; background-color: rgba(0, 0, 0, 0.05);\">r&quot;C:\\Users\\USA\\Anaconda3\\Tesseract-OCR\\tesseract\\tesseract.exe&quot;<\/span><\/p><p>Here is the code:<span style=\"white-space: pre-wrap; background-color: rgba(0, 0, 0, 0.05);\"><br \/><\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-24e86c3 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"24e86c3\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column 
elementor-col-100 elementor-top-column elementor-element elementor-element-b157862\" data-id=\"b157862\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-8cfae72 elementor-widget elementor-widget-html\" data-id=\"8cfae72\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<pre rel= \"PYTHON\"><code>pytesseract.pytesseract.tesseract_cmd = r\"C:\\Users\\USA\\Anaconda3\\Tesseract-OCR\\tesseract\\tesseract.exe\"\n<\/code><\/pre>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-21adc0c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"21adc0c\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8c4bf1a\" data-id=\"8c4bf1a\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c31106d elementor-widget elementor-widget-html\" data-id=\"c31106d\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<pre rel= \"PYTHON\"><code>print(dir(pytesseract.pytesseract))\n<\/code><\/pre>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-8338f61 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"8338f61\" 
data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1460c3b\" data-id=\"1460c3b\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-cafee4f elementor-widget elementor-widget-text-editor\" data-id=\"cafee4f\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIf we look at the package contents of pytesseract, we can see many different objects to explore. In this tutorial we will focus on <code><b>image_to_string<\/b><\/code>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-2f9b31a elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2f9b31a\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8ce8794\" data-id=\"8ce8794\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-fe8644d elementor-widget elementor-widget-text-editor\" data-id=\"fe8644d\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>BytesIO<br \/>Image<br \/>LooseVersion<br \/>OSD_KEYS<br \/>Output<br \/>PandasNotSupported<br \/>QUOTE_NONE<br \/>RGB_MODE<br \/>TSVNotSupported<br \/>TesseractError<br \/>TesseractNotFoundError<br 
\/>__builtins__<br \/>__cached__<br \/>__doc__<br \/>__file__<br \/>__loader__<br \/>__name__<br \/>__package__<br \/>__spec__<br \/>cleanup<br \/>file_to_dict<br \/>find_loader<br \/>get_errors<br \/>get_pandas_output<br \/>get_tesseract_version<br \/>iglob<br \/>image_to_boxes<br \/>image_to_data<br \/>image_to_osd<br \/>image_to_pdf_or_hocr<br \/><b>image_to_string<br \/><\/b>is_valid<br \/>main<br \/>ndarray<br \/>normcase<br \/>normpath<br \/>numpy_installed<br \/>os<br \/>osd_to_dict<br \/>pandas_installed<br \/>pd<br \/>prepare<br \/>realpath<br \/>run_and_get_output<br \/>run_once<br \/>run_tesseract<br \/>save_image<br \/>shlex<br \/>string<br \/>subprocess<br \/>subprocess_args<br \/>sys<br \/>tempfile<br \/>tesseract_cmd<br \/>wraps<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-62be5f6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"62be5f6\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2460860\" data-id=\"2460860\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ba4219f elementor-widget elementor-widget-text-editor\" data-id=\"ba4219f\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Help on image_to_string object seems quite simple and straightforward.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element 
elementor-element-b356f2e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b356f2e\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8c6272c\" data-id=\"8c6272c\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-cc1e5fe elementor-widget elementor-widget-html\" data-id=\"cc1e5fe\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<pre rel= \"PYTHON\"><code>help(pytesseract.pytesseract.image_to_string)\n<\/code><\/pre>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-5b63304 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"5b63304\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bfa4b35\" data-id=\"bfa4b35\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5c3cbb7 elementor-widget elementor-widget-text-editor\" data-id=\"5c3cbb7\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Help on function image_to_string in module pytesseract.pytesseract:<\/p><p>image_to_string(image, lang=None, config=&#039;&#039;, nice=0, output_type=&#039;string&#039;)<br \/>Returns the result of 
a Tesseract OCR run on the provided image to string<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-d81d0df elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"d81d0df\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-482646f\" data-id=\"482646f\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-0914993 elementor-widget elementor-widget-html\" data-id=\"0914993\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<pre rel= \"PYTHON\"><code>f = r'c:\/Users\/t\/Desktop\/default.png'\nimg = Image.open(f)\nimg.show()\n<\/code><\/pre>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-07b17ae elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"07b17ae\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fc2a8eb\" data-id=\"fc2a8eb\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-be31a48 elementor-widget elementor-widget-image\" data-id=\"be31a48\" data-element_type=\"widget\" 
data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"300\" height=\"300\" src=\"https:\/\/holypython.com\/wp-content\/uploads\/2019\/11\/default-300x300.png\" class=\"attachment-medium size-medium wp-image-4324\" alt=\"\" srcset=\"https:\/\/holypython.com\/wp-content\/uploads\/2019\/11\/default-300x300.png 300w, https:\/\/holypython.com\/wp-content\/uploads\/2019\/11\/default-150x150.png 150w, https:\/\/holypython.com\/wp-content\/uploads\/2019\/11\/default-768x768.png 768w, https:\/\/holypython.com\/wp-content\/uploads\/2019\/11\/default.png 1000w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-b826675 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b826675\" data-element_type=\"section\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-3bee727\" data-id=\"3bee727\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-10f1c12 elementor-widget elementor-widget-heading\" data-id=\"10f1c12\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">ACTUAL OCR PART<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section 
elementor-top-section elementor-element elementor-element-4a915c0 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4a915c0\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-38c0cbd\" data-id=\"38c0cbd\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-8903645 elementor-widget elementor-widget-text-editor\" data-id=\"8903645\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We&#8217;ve opened an image with text. Let&#8217;s start doing some OCR!<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-4915330 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4915330\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-b888ecc\" data-id=\"b888ecc\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-03714de elementor-widget elementor-widget-html\" data-id=\"03714de\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<pre rel= \"PYTHON\"><code>text = 
pytesseract.image_to_string(img)\nprint(text)\n<\/code><\/pre>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-065e33e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"065e33e\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4e062fa\" data-id=\"4e062fa\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c594497 elementor-widget elementor-widget-heading\" data-id=\"c594497\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Output:<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c663b51 elementor-widget elementor-widget-text-editor\" data-id=\"c663b51\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><strong>Holy Python<\/strong><\/p><p><strong>PYTHON HOLLINESS<\/strong><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-1f4b540 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1f4b540\" data-element_type=\"section\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div 
class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-080d4bd\" data-id=\"080d4bd\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-7ad1f2d elementor-widget elementor-widget-heading\" data-id=\"7ad1f2d\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">CONCLUSION<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-4972b2c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4972b2c\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-14f15ae\" data-id=\"14f15ae\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-d666a31 elementor-widget elementor-widget-text-editor\" data-id=\"d666a31\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Yes, OCR is that simple, thanks to Python and pytesseract.<\/p><p>OCR goes much deeper than this quick tutorial covers, but it should be enough to get you started!<\/p><ul><li>One simple technique to try when OCR results are poor is to convert the image to black and white using the PIL library. 
This often improves pytesseract&#8217;s accuracy.<\/li><li>Image types such as &#8220;RGB&#8221;, &#8220;RGBA&#8221;, &#8220;RGBa&#8221;, &#8220;1&#8221; and &#8220;L&#8221; determine which methods you can and cannot use. Sometimes you will need to convert between image types using .convert(type).<\/li><li>Text can also blend into the rest of the image, which makes it harder to extract. Preprocessing steps such as binarization or conversion to a black and white type can help prepare the image for pytesseract.<\/li><\/ul><p>We hope this quick tutorial motivates you to start exploring the OCR possibilities available with Python.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>ABSTRACT In this tutorial we will take a closer look at pytesseract module and discover some of its powerful features. You will be able to understand basic optical character recognition in a very simple form. 
We will also use PIL library for some image manipulation methods with Python, including: image opening, image displaying, image type [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4324,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[35,64,36,52],"tags":[],"class_list":["post-4320","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-advanced","category-computer-vision","category-machine-learning","category-tutorials"],"acf":[],"_links":{"self":[{"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/posts\/4320","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/comments?post=4320"}],"version-history":[{"count":0,"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/posts\/4320\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/media\/4324"}],"wp:attachment":[{"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/media?parent=4320"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/categories?post=4320"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/holypython.com\/wp-json\/wp\/v2\/tags?post=4320"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}