An OCR tool developed by Andrew Weymouth, Digital Initiatives Librarian at the University of Idaho, over the summer and fall of 2025. The tool uses the TrOCR text recognition model and the Kraken BLLA page segmentation model to improve OCR accuracy on handwritten and cursive archival documents and to add digital preservation metadata to processed materials. Opti-Column, a future iteration, will focus on full-page newspaper spreads. The tool was developed to overhaul the Center for Digital Inquiry and Learning's digital collection PDF files, making the collection more discoverable and accessible. Its development is described in greater detail in "Transparent Practices: OCR and AI in the Archives," by Rebecca Hastings and Andrew Weymouth, submitted to Collections: A Journal for Archives and Museum Professions in October 2025.
Scholarly-Projects/opticolumn