SimpleIndex is the best low-cost PDF data extraction software for businesses. It uses the existing text whenever possible instead of OCR, providing 100% accuracy and incredibly fast processing. Complex pattern matching using database lookups and regular expressions locates data anywhere it appears in the file. Extracted data can be saved to CSV, XML or any SQL database.
I have a duplex scanner. How do I set up SimpleIndex to scan two-sided documents automatically?
Friday, 11 August 2023
Please refer to the Wiki Documentation for the complete Scanning reference. Simplex versus Duplex scanning is a function of your scanner driver. SimpleIndex uses both TWAIN and ISIS drivers. ISIS drivers are faster for high-speed scanners and are preferred. To configure duplex on an ISIS scanner: 1. Select “Use ISIS Driver” from the Scan menu if it is
Take control of Sales Tax exemption forms
Monday, 14 November 2022
Automatically fill and file sales tax forms. Ben Franklin once noted, “…nothing is certain except death and taxes.” In the case of state sales taxes, they may be unavoidable, but managing your customers’ sales tax exemption forms and making sure you’ve sent current exemption certificates to your vendors doesn’t have to feel like a terminal
I’m using full-page OCR. The information all appears in the txt file, but it loses its formatting about halfway through. Data on the right ends up at the end of the txt doc. Can this be fixed?
Wednesday, 28 February 2018
Please refer to the Wiki Documentation for the complete Full-Page OCR reference. SimpleIndex version 7 solves this problem with the incorporation of the FineReader OCR engine. Full text in PDFs will now flow with the formatting of the PDF. Legacy Versions: SimpleIndex can also be used with other OCR applications and servers to improve accuracy, formatting and
- Published in OCR
How do you configure full text searching in Retrieval mode?
Wednesday, 28 February 2018
Please refer to the Wiki Documentation for the complete Database Settings reference. On the Database tab there is a dropdown in the lower portion of the panel for Full Text OCR Field. Put the name of the field that will store the full-text data there. This must be configured for both Insert and Retrieval mode configurations. The database field
- Published in Database & Retrieval, OCR
Can OCR text be saved to Office, Text, HTML or other formats?
Wednesday, 28 February 2018
Yes. On the OCR step of the Job Settings Wizard you can select the text output format you need in the “Full-page OCR file type” drop-down. By default it is set to PDF, but it can be changed to Text (txt), Word (docx), Rich Text (rtf), Open Office (odt), Excel (xlsx), PowerPoint (pptx), ePub Zip (epub),
- Published in Licensing & Installation, OCR
Can SimpleIndex create searchable PDF Image+Text files with hidden text?
Wednesday, 28 February 2018
Yes, it can. You can configure this setting in the Job Settings Wizard by going to the OCR step and checking “Enable full-page OCR”. There are many settings in the OCR step that you can use to customize the output and recognition of images. SimpleIndex has two different OCR engines (Standard and Professional) that can
- Published in Export, OCR, Office PDF Text Processing
PDF Text Processing Demo
Friday, 12 January 2018
This sample job demonstrates the PDF text processing capabilities of SimpleIndex by extracting the Document Number, Date, Document Type, Customer and Total from a number of documents without OCR, by processing the text layer of PDF files. Computer-generated PDF files, such as those created using PDF printer drivers, already contain digitized text. SimpleIndex reads the
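To illustrate the general idea of pulling index fields out of a PDF text layer with regular expressions, here is a minimal Python sketch. It is not SimpleIndex's implementation or configuration format. The sample text, the field patterns, and the helper names `extract_fields` and `to_csv` are all hypothetical, and the text is assumed to have already been read from the PDF's text layer.

```python
import csv
import io
import re

# Hypothetical text, standing in for the text layer of a computer-generated PDF.
SAMPLE_TEXT = """\
INVOICE
Document Number: INV-10482
Date: 01/12/2018
Customer: Acme Supply Co.
Total: $1,284.50
"""

# Hypothetical patterns; real documents need patterns tuned to their layout.
FIELD_PATTERNS = {
    "Document Number": re.compile(r"Document Number:\s*([A-Z]+-\d+)"),
    "Date": re.compile(r"Date:\s*(\d{2}/\d{2}/\d{4})"),
    "Document Type": re.compile(
        r"^(INVOICE|PURCHASE ORDER|CREDIT MEMO)\s*$", re.MULTILINE
    ),
    "Customer": re.compile(r"Customer:\s*(.+)"),
    "Total": re.compile(r"Total:\s*\$?([\d,]+\.\d{2})"),
}


def extract_fields(text):
    """Return the first match for each field, or an empty string if none."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group(1).strip() if match else ""
    return record


def to_csv(records):
    """Serialize a list of field dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


record = extract_fields(SAMPLE_TEXT)
print(to_csv([record]))
```

Because the values come straight from digitized text rather than from recognized images, there are no OCR misreads to correct; the accuracy of a sketch like this depends only on how well the patterns fit the document layout.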

