Inspiration
We've all held something old - a letter from a grandparent, a newspaper clipping tucked in a box, an heirloom passed down without explanation. The object is there, but the story behind it is silent.
We built Echoes because history shouldn't require a history degree to access. Millions of real primary sources sit digitized in archives like the Library of Congress, untouched by most people simply because they're hard to navigate and even harder to feel connected to. We wanted to change that - not by summarizing history, but by giving it a voice.
The question that started everything: what if you could upload a 200-year-old letter and just... hear it?
What We Learned
- How to use ElevenLabs to generate contextually appropriate AI narration - and how much the right voice changes the emotional weight of a document
- How to work with the Library of Congress Chronicling America API to query and clean OCR text from real historical newspapers dating back to the 1770s
- How to run on-device ML with TensorFlow.js (COCO-SSD + MobileNet) for real-time object detection directly in the browser - no server required
- How to chain Anthropic Claude + Wikipedia together to build a research pipeline that gives historical context automatically
- That building something that feels right matters as much as building something that works
How We Built It
Echoes is a Next.js app with four core features, each powered by a different combination of APIs and ML models:
Archive Mode takes any uploaded document or image, runs it through Tesseract.js for OCR, enriches it with Wikipedia context via the MediaWiki API, and sends it to ElevenLabs for narration. Users can then ask live questions about the document and hear AI-generated audio answers back.
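The Wikipedia-enrichment step of that pipeline can be sketched like this. The function names are illustrative, not the actual Echoes source, but the MediaWiki Action API parameters shown (`action=query`, `prop=extracts`, `exintro`, `explaintext`) are the real ones for fetching a plain-text article intro:

```typescript
// Sketch of the Wikipedia-context step in Archive Mode. Helper names are
// hypothetical; the MediaWiki API endpoint and parameters are real.

const WIKI_API = "https://en.wikipedia.org/w/api.php";

// Build the query URL for a plain-text intro extract of a given title.
function buildExtractUrl(title: string): string {
  const params = new URLSearchParams({
    action: "query",
    prop: "extracts",
    exintro: "1",      // only the lead section
    explaintext: "1",  // plain text, not HTML
    titles: title,
    format: "json",
    origin: "*",       // required for CORS when calling from the browser
  });
  return `${WIKI_API}?${params.toString()}`;
}

// Fetch the extract; returns null if the page is missing.
async function fetchContext(title: string): Promise<string | null> {
  const res = await fetch(buildExtractUrl(title));
  const data = await res.json();
  const pages = data?.query?.pages ?? {};
  const first = Object.values(pages)[0] as { extract?: string } | undefined;
  return first?.extract ?? null;
}
```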
Discover Archives queries the Library of Congress Chronicling America collection - 10M+ real digitized newspaper pages - and pipes results through a cleaning pipeline before narrating them. A "broadcast mode" stitches multiple documents into a single chronological audio narrative.
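A query against Chronicling America might look like the sketch below. The endpoint and the `andtext`/`format`/`page`/`rows` parameters are the public LoC search API; the helper itself is hypothetical, as is the assumption that every hit exposes OCR text under `ocr_eng`:

```typescript
// Illustrative Chronicling America search helper - not the Echoes source.

const CA_SEARCH = "https://chroniclingamerica.loc.gov/search/pages/results/";

interface SearchOptions {
  query: string;
  page?: number; // 1-based results page
  rows?: number; // results per page
}

function buildSearchUrl({ query, page = 1, rows = 20 }: SearchOptions): string {
  const params = new URLSearchParams({
    andtext: query,        // full-text search across digitized pages
    format: "json",        // ask for JSON instead of the HTML results page
    page: String(page),
    rows: String(rows),
  });
  return `${CA_SEARCH}?${params.toString()}`;
}

// Each hit carries raw OCR text, which still needs cleaning before narration.
async function searchNewspapers(opts: SearchOptions): Promise<string[]> {
  const res = await fetch(buildSearchUrl(opts));
  const data = await res.json();
  return (data.items ?? []).map((item: { ocr_eng?: string }) => item.ocr_eng ?? "");
}
```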
The Heirloom uses TensorFlow.js with COCO-SSD and MobileNet to detect objects via webcam or image upload, then builds a historical timeline of 5-7 events using Wikipedia's API. Each event is individually narrated on demand.
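Before a timeline is built, raw detections need filtering. This is a hypothetical post-processing step, but the `DetectedObject` shape (`class`/`score`/`bbox`) matches what `@tensorflow-models/coco-ssd`'s `detect()` actually returns:

```typescript
// Keep confident detections and collapse duplicates so each object class
// seeds at most one timeline. Illustrative, not the actual Echoes code.

interface DetectedObject {
  bbox: [number, number, number, number]; // [x, y, width, height]
  class: string;
  score: number;
}

function pickTimelineSubjects(
  detections: DetectedObject[],
  minScore = 0.6
): string[] {
  const seen = new Set<string>();
  const subjects: string[] = [];
  for (const d of detections) {
    if (d.score < minScore) continue; // drop low-confidence guesses
    if (seen.has(d.class)) continue;  // one timeline per object class
    seen.add(d.class);
    subjects.push(d.class);
  }
  return subjects;
}
```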
Community is a shared archive backed by localStorage where users submit and browse heirloom stories from others.
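A localStorage-backed store like this one is a reasonable sketch of that feature (the key name and record shape are illustrative). Accepting any `Storage`-like object keeps it testable outside the browser:

```typescript
// Hypothetical story archive persisted to localStorage - not the Echoes source.

interface HeirloomStory {
  id: string;
  title: string;
  story: string;
  submittedAt: number; // Unix ms timestamp
}

interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORE_KEY = "echoes-community-stories"; // illustrative key name

function loadStories(storage: StorageLike): HeirloomStory[] {
  const raw = storage.getItem(STORE_KEY);
  if (!raw) return [];
  try {
    return JSON.parse(raw) as HeirloomStory[];
  } catch {
    return []; // corrupt entry: start fresh rather than crash
  }
}

function saveStory(storage: StorageLike, story: HeirloomStory): HeirloomStory[] {
  const stories = [story, ...loadStories(storage)]; // newest first
  storage.setItem(STORE_KEY, JSON.stringify(stories));
  return stories;
}
```

In the browser the real `window.localStorage` satisfies `StorageLike` directly.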
The entire UI is built with Tailwind CSS, Framer Motion for animations, and Lucide React for icons - designed around a dark vintage aesthetic with gold accents to match the historical feel.
Challenges
OCR noise from historical documents was a significant technical hurdle. 19th-century newspaper scans often produce garbled, broken text. We built a cleanup pipeline using Claude to reconstruct readable prose before narration - otherwise the audio was incomprehensible.
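Claude handles the semantic reconstruction, but a cheap deterministic pre-pass can fix the purely mechanical artifacts first, so the model spends its tokens on genuinely garbled words. A minimal sketch of such a pre-pass (illustrative, not the actual Echoes pipeline):

```typescript
// Rejoin words hyphenated across OCR line breaks and normalize whitespace
// while preserving paragraph boundaries. Uses U+00B6 as a temporary
// placeholder, which assumes the input contains no pilcrow characters.

function preCleanOcr(raw: string): string {
  return raw
    .replace(/(\w)-\s*\n\s*(\w)/g, "$1$2") // "news-\npaper" -> "newspaper"
    .replace(/\n{2,}/g, "\u00B6")          // remember paragraph breaks...
    .replace(/\s+/g, " ")                  // ...then collapse remaining whitespace
    .replace(/\u00B6/g, "\n\n")            // restore paragraph breaks
    .trim();
}
```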
Chaining async APIs without breaking the user experience took careful orchestration. A single Archive Mode narration involves OCR, Wikipedia lookup, Claude enrichment, and ElevenLabs audio generation - all sequentially. We had to handle failures at each step gracefully so the app never just... died silently.
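One way to keep such a chain from dying silently is to wrap each stage so a failure surfaces as a typed fallback instead of an unhandled rejection. The helper below is illustrative, not the actual Echoes code:

```typescript
// Run one pipeline stage; on failure, report it and continue with a fallback
// so later stages (e.g. narration) can still run on degraded input.

async function runStep<T>(
  label: string,
  step: () => Promise<T>,
  fallback: T,
  onError: (label: string, err: unknown) => void
): Promise<T> {
  try {
    return await step();
  } catch (err) {
    onError(label, err); // surface the failure to the UI (toast, status line)
    return fallback;
  }
}

// Usage pattern: each stage names itself and declares what "degraded" means,
// e.g. an empty context string if the Wikipedia lookup fails.
// const context = await runStep("wikipedia", lookupContext, "", showToast);
```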
On-device ML performance with TensorFlow.js was slower than expected on lower-end hardware. We optimized by lazy-loading models and running inference only on user action rather than continuously.
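Lazy loading reduces to a small memoized loader. The helper is a generic sketch; the commented usage assumes `@tensorflow-models/coco-ssd`, whose real entry point is `cocoSsd.load()`:

```typescript
// Start the model download on first use and reuse the same in-flight promise
// afterwards, so a double-click never triggers two downloads.

function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (!pending) pending = load();
    return pending;
  };
}

// Usage sketch with COCO-SSD, running inference only on user action:
// const getDetector = lazy(() => cocoSsd.load());
// button.onclick = async () => {
//   const model = await getDetector();   // loads once, on first click
//   const predictions = await model.detect(videoEl);
// };
```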
Scope. We had four feature ideas and one hackathon. Shipping all of them required ruthless prioritization and accepting that some things would be rough around the edges - which is exactly what a hackathon is for.
Built With
- Anthropic Claude
- ElevenLabs
- Framer Motion
- Library of Congress API
- Next.js
- React
- Tailwind CSS
- TensorFlow.js
- Tesseract.js
- Transformers.js
- translate
- TypeScript
- Wikipedia