🔍 Extract text from images entirely in your browser — no upload, no API keys, no privacy concerns.
🎙️ Try it now → https://naklitechie.github.io/ScanLocal/
- Five OCR engines in one tool:
  - 📜 Default (Tesseract): Most reliable browser path, broad language support
  - 🚀 Fast (PaddleOCR): Printed text path via Paddle models
  - ✍️ Handwriting (TrOCR): SOTA for handwritten text, ~500 MB
  - 🧠 Smart OCR: Transformer fallback for printed text, no boxes
  - 🔭 Qwen2-VL (Experimental): Best-effort visual document reader, very large download
- 100% client-side: Your images never leave your device
- No dependencies: Single HTML file, no npm, no bundler
- Offline-ready: Models cached after first load
- Privacy-first: No tracking, no analytics, no data collection
- Receipts and invoices
- Scanned documents
- Business cards
- Screenshots with text
- Handwritten notes
- Historical documents
- Open the app in your browser
- Choose an engine:
  - Default for the safest first try in-browser
  - Fast for receipts, documents, screenshots
  - Handwriting for handwritten notes
  - Smart for an alternate transformer-based read on printed text
  - Qwen for experimental visual reading of complex documents
- Pick a Tesseract language if you're using the default engine
- Drop an image or click to upload
- Wait for text extraction
- Copy the extracted text
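
The engine choice in the steps above can be sketched as a small lookup. This is a hypothetical helper, not code taken from ScanLocal: the name `suggestEngine`, the content-kind labels, and the engine ids are all assumptions made for illustration. Only the mapping itself follows the guidance above.

```javascript
// Hypothetical sketch: none of these identifiers come from ScanLocal's index.html.
// Each content kind maps to the engine the steps above recommend for it.
const ENGINE_BY_KIND = new Map([
  ['receipt', 'fast'],
  ['document', 'fast'],
  ['screenshot', 'fast'],
  ['handwriting', 'handwriting'],
  ['printed-alternate', 'smart'],
  ['complex-document', 'qwen'],
]);

function suggestEngine(kind) {
  // Anything unrecognised falls back to the default (Tesseract) engine,
  // described above as the safest first try in-browser.
  return ENGINE_BY_KIND.get(kind) ?? 'default';
}

console.log(suggestEngine('receipt'));
console.log(suggestEngine('something-else'));
```

Falling back to the default engine mirrors the advice that Tesseract is the safest first try when the content type is unclear.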
# Just open index.html in your browser
open ScanLocal/index.html

cd ScanLocal
python3 -m http.server 8000
# Open http://localhost:8000

The app is automatically deployed to:
https://naklitechie.github.io/ScanLocal/
| Component | Technology |
|---|---|
| Framework | Vanilla JS (zero dependencies) |
| OCR Engine 1 | PaddleOCR (PP-OCRv5) via Paddle.js |
| OCR Engine 2 | TrOCR via Transformers.js |
| OCR Engine 3 | Tesseract.js v5 |
| OCR Engine 4 | Printed-text transformer OCR via Transformers.js |
| OCR Engine 5 | Qwen2-VL via Transformers.js / ONNX |
| UI | Custom CSS (NakliTechie design system) |
| Hosting | GitHub Pages |
| Engine | Size | First Load | Cached | Best For |
|---|---|---|---|---|
| Tesseract | ~25 MB/lang | 5-10 sec | Instant | Default browser OCR, broad language support |
| PaddleOCR | ~20 MB | 2-5 sec | Instant | Printed text |
| TrOCR | ~500 MB | 30-60 sec | Instant | Handwriting |
| Smart OCR | ~300-500 MB | 20-60 sec | Instant | Alternate printed-text extraction |
| Qwen2-VL | ~1.8 GB+ | Long / may time out | Cached after load | Complex visual reasoning and document reading |
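
As a rough worked example of the sizes in the table above, the sketch below estimates a first-load download. Tesseract is the only engine whose size is listed per language, so it is the only one that scales with the number of languages. The function name, the engine keys, and the choice to use the upper bound of each range are assumptions for illustration.

```javascript
// Approximate first-load sizes in MB, copied from the table above.
// Ranges use their upper bound; Qwen2-VL is listed as "~1.8 GB+", so 1800 is a floor.
const APPROX_SIZE_MB = new Map([
  ['tesseract', 25], // per language
  ['paddleocr', 20],
  ['trocr', 500],
  ['smart', 500],
  ['qwen2vl', 1800],
]);

// Hypothetical helper: estimate the download for one engine.
function estimateDownloadMB(engine, languageCount = 1) {
  const size = APPROX_SIZE_MB.get(engine);
  if (size === undefined) {
    throw new Error(`Unknown engine: ${engine}`);
  }
  // Only Tesseract downloads one data file per selected language.
  return engine === 'tesseract' ? size * languageCount : size;
}

console.log(estimateDownloadMB('tesseract', 3)); // 75
```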
- Images: PNG, JPG/JPEG, WebP
- Max size: 10 MB per image
- Languages: 100+ (varies by engine)
- Tesseract languages in UI: English, Spanish, French, German, Italian, Portuguese, Russian, Hindi, Arabic, Japanese, Korean, Chinese (Simplified)
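
The limits above could be checked before OCR starts with something like the following. This is a minimal sketch, not ScanLocal's actual validation: the helper names are hypothetical, 10 MB is assumed to mean 10 × 1024 × 1024 bytes, and the language codes are the standard Tesseract traineddata names for the languages listed, which the app may or may not use in this form.

```javascript
// Limits taken from the list above.
const MAX_BYTES = 10 * 1024 * 1024; // assumption: "10 MB" means binary megabytes
const ACCEPTED_TYPES = new Set(['image/png', 'image/jpeg', 'image/webp']);

// Standard Tesseract traineddata codes for the languages shown in the UI.
const TESSERACT_CODES = new Map([
  ['English', 'eng'], ['Spanish', 'spa'], ['French', 'fra'],
  ['German', 'deu'], ['Italian', 'ita'], ['Portuguese', 'por'],
  ['Russian', 'rus'], ['Hindi', 'hin'], ['Arabic', 'ara'],
  ['Japanese', 'jpn'], ['Korean', 'kor'], ['Chinese (Simplified)', 'chi_sim'],
]);

// Hypothetical helper: accepts any object with `type` and `size`,
// such as a browser File from a drop or file-input event.
function validateImage(file) {
  if (!ACCEPTED_TYPES.has(file.type)) {
    return { ok: false, reason: `Unsupported type: ${file.type || 'unknown'}` };
  }
  if (file.size > MAX_BYTES) {
    return { ok: false, reason: 'Image exceeds 10 MB' };
  }
  return { ok: true, reason: null };
}

// Hypothetical helper: look up the Tesseract code for a UI language name.
function tesseractCode(languageName) {
  const code = TESSERACT_CODES.get(languageName);
  if (!code) {
    throw new Error(`No Tesseract code for: ${languageName}`);
  }
  return code;
}

console.log(validateImage({ type: 'image/png', size: 2048 }).ok);
console.log(tesseractCode('Chinese (Simplified)'));
```

Returning a reason string rather than throwing lets the UI show a message next to the dropzone without a try/catch around every upload.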
Coloured with italy-02 · PERGAMENA — Renaissance manuscript parchment, Vatican library. Yellowed-cream body, ink-brown text, Vatican-cobalt action accent — the place where text is preserved and read, fitting for an OCR tool.
Palette pulled from Rangrez, the global colour-palette library that backs all NakliTechie projects.
ScanLocal is part of the NakliTechie series — browser-native tools that keep your data on your device.
- ❌ No server uploads
- ❌ No API keys
- ❌ No tracking
- ❌ No analytics
- ✅ All processing happens in your browser
- ✅ Models cached locally after first load
MIT License — see LICENSE file.
ScanLocal follows the same blueprint as:
- BabelLocal — Browser translator (200 languages)
- StripLocal — EXIF metadata stripper
- GambitLocal — Chess vs Stockfish
- VoiceVault — Audio transcription
- KingMe — Checkers vs AI
- SnipLocal — Background remover
All single-file, zero-dependency, browser-native apps.
ScanLocal/
├── index.html # The entire app (shipped)
├── README.md # This file (shipped)
├── LICENSE # MIT license (shipped)
├── .gitignore # Git ignore rules
└── PLAN.md # Planning notes (not shipped)
No build step or dependencies. Just edit index.html and refresh your browser.
# Enable GitHub Pages on the repository
# Settings → Pages → Source: main branch → / (root)
# Push to main branch
git push origin main

Issues and PRs welcome! Please read IDEAS.md for the series guidelines before contributing.
Chirag Patnaik
NakliTechie
See PLAN.md for detailed implementation plan.
Phase 1 (Current): Core infrastructure ✅
- Base HTML skeleton
- Engine selector UI (4 engines)
- Dropzone + preview
- PaddleOCR integration
- TrOCR integration
- Tesseract.js integration
- Smart OCR integration
Phase 2: Engine integration
Phase 3: UI polish
Phase 4: Testing
Phase 5: Launch