Inspiration -

I have been working with more and more over the last year and wanted to see if it was possible to create a Document Manager feature on Reddit that could perform OCR.

What it does -

A fully functional document management app for Reddit that:

  • ✅ Uploads images and PDFs (despite Reddit docs saying only images are supported)
  • ✅ AI-powered analysis using Gemini 2.5 Flash
  • ✅ Auto-generates descriptions from document content
  • ✅ Extracts key information (amounts, dates, companies)
  • ✅ Stores documents in Redis (up to 20 per post, 500KB each)
  • ✅ 7-day caching for AI results (cost optimization)
  • ✅ Rate limiting (100 requests/day per user)
  • ✅ Mobile-friendly UI with Tailwind CSS
  • ✅ Works entirely within Reddit posts
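The storage limits, daily rate limit, and 7-day cache listed above could be enforced roughly as follows. This is a minimal sketch, not the app's actual code: `MemoryStore` is a hypothetical in-memory stand-in for Redis, the key formats and function names are invented for illustration, and "500KB" is assumed to mean 500 * 1024 bytes.

```typescript
// Hypothetical in-memory stand-in for a key-value store such as Redis.
class MemoryStore {
  private data = new Map<string, { value: string; expiresAt: number | null }>();

  get(key: string, now: number): string | undefined {
    const entry = this.data.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt !== null && entry.expiresAt <= now) {
      this.data.delete(key); // lazily drop expired entries
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: string, now: number, ttlMs?: number): void {
    const expiresAt = ttlMs === undefined ? null : now + ttlMs;
    this.data.set(key, { value, expiresAt });
  }
}

const MAX_DOCS_PER_POST = 20;
const MAX_DOC_BYTES = 500 * 1024; // assumption: 500KB = 500 * 1024 bytes
const DAILY_REQUEST_LIMIT = 100;
const DAY_MS = 24 * 60 * 60 * 1000;
const CACHE_TTL_MS = 7 * DAY_MS;

// Reject uploads that would exceed the per-post count or per-document size.
function canUpload(currentDocCount: number, sizeBytes: number): boolean {
  return currentDocCount < MAX_DOCS_PER_POST && sizeBytes <= MAX_DOC_BYTES;
}

// Count requests per user per UTC day; the counter key changes each day.
function allowRequest(store: MemoryStore, userId: string, now: number): boolean {
  const day = new Date(now).toISOString().slice(0, 10); // e.g. "2025-01-01"
  const key = `rate:${userId}:${day}`;
  const used = Number(store.get(key, now) ?? "0");
  if (used >= DAILY_REQUEST_LIMIT) return false;
  store.set(key, String(used + 1), now, DAY_MS);
  return true;
}

// Return a cached AI result if present; otherwise compute and cache it for 7 days.
function cachedAnalysis(
  store: MemoryStore,
  docHash: string,
  now: number,
  analyze: () => string
): string {
  const key = `analysis:${docHash}`;
  const hit = store.get(key, now);
  if (hit !== undefined) return hit;
  const result = analyze();
  store.set(key, result, now, CACHE_TTL_MS);
  return result;
}
```

Keying the cache on a hash of the document content, rather than on the file name, means a re-upload of the same file would reuse the earlier AI result instead of paying for a second model call.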

How we built it -

Amazon Kiro was used to create the design specifications and then to implement all of them.

Challenges we ran into -

Pivot to Google Gemini AI

Why Gemini?

  • ✅ Pure JavaScript SDK (no native dependencies)
  • ✅ Supports both image and PDF analysis
  • ✅ Vision capabilities for document understanding
  • ✅ Works with Devvit's HTTP fetch (generativelanguage.googleapis.com is globally allow-listed)
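To illustrate the last point, here is a rough sketch of what a document-analysis call over plain HTTP fetch might look like. The project itself used the JavaScript SDK; this sketch swaps in the Gemini REST `generateContent` endpoint instead. The helper name `buildGeminiRequest`, the prompt, and the choice to return the request rather than send it are assumptions made for illustration.

```typescript
// Shape of the request that would be handed to fetch(url, init).
interface GeminiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build (but do not send) a generateContent request for one image or PDF.
// base64Data is the file content, base64-encoded; mimeType is e.g.
// "image/png" or "application/pdf".
function buildGeminiRequest(
  apiKey: string,
  mimeType: string,
  base64Data: string,
  prompt: string,
  model: string = "gemini-2.5-flash"
): GeminiRequest {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey, // key sent as a header, not in the URL
      },
      body: JSON.stringify({
        contents: [
          {
            parts: [
              { text: prompt },
              { inline_data: { mime_type: mimeType, data: base64Data } },
            ],
          },
        ],
      }),
    },
  };
}
```

A caller would pass the result to `fetch(req.url, req.init)` and read the generated text from `candidates[0].content.parts[0].text` in the JSON response. Building the request separately from sending it keeps the payload easy to inspect and test without spending API quota.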

What we learned -

Devvit does not support environment variables in the traditional sense

  • No .env file support in production
  • No way to inject secrets at build time
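One way to work within these constraints is to look up the API key at runtime from a settings store rather than from `process.env`. The sketch below only illustrates that pattern: `SettingsReader` is a hypothetical interface, not Devvit's actual API, and the setting name `geminiApiKey` is an assumption.

```typescript
// Hypothetical stand-in for whatever runtime settings store the platform offers.
interface SettingsReader {
  get(name: string): Promise<string | undefined>;
}

// Fetch a secret at request time and fail loudly if it has not been configured.
async function getApiKey(
  settings: SettingsReader,
  name: string = "geminiApiKey" // assumed setting name
): Promise<string> {
  const value = await settings.get(name);
  if (!value || value.trim() === "") {
    throw new Error(`Missing secret "${name}": configure it in the app settings`);
  }
  return value.trim();
}
```

Failing with an explicit error at the point of use makes a missing key obvious on the first request, instead of surfacing later as an opaque authentication failure from the model API.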

What's next for Document Manager -

I would like to add options for more models and storage backends that can be chosen from the main UI. Gemini, or another model, should be good enough to read handwriting, so the app could also be used for transcription services and other advanced AI analysis.
