DocuLens

Inspiration

Tax documents are notoriously complex, packed with legal jargon, financial terms, and dense tables that the average person struggles to understand. Whether it's a W-2, 1040, or an IRS notice, most people either avoid reading them or rely on professionals to interpret the contents.

This creates a gap in financial literacy and transparency—people don’t really know what’s being filed on their behalf, what they owe, or what they’re owed. That confusion can lead to anxiety, missed deductions, or even costly mistakes.

We built DocuLens to bridge that gap—a tool that empowers everyday users to understand their tax documents without needing an accounting degree. By combining real-time AI summarization with intelligent ranking of key information, we aim to make tax paperwork human-friendly, informative, and easy to digest.

What it does

DocuLens allows users to upload a tax document (PDF or image) and then automatically:
Extracts and ranks key financial data by relevance
Summarizes content in clear, plain English
Displays a clean, readable breakdown of the document
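
As an illustration of the "extract and rank" step, here is a minimal sketch of what a ranked result could look like. The `ExtractedField` shape, the `relevance` score, and the sample values are all assumptions made for this example, not the project's actual data model.

```typescript
// Hypothetical shape for one extracted item; the field names are illustrative.
interface ExtractedField {
  label: string; // e.g. "Federal income tax withheld"
  value: string; // kept as text so currency formatting is preserved
  relevance: number; // 0 to 1, higher means more important to the reader
}

// Return a new array sorted from most to least relevant.
// slice() copies first so the caller's array is not mutated.
function rankFields(fields: ExtractedField[]): ExtractedField[] {
  return fields.slice().sort((a, b) => b.relevance - a.relevance);
}

// Made-up sample values loosely modeled on a W-2.
const sample: ExtractedField[] = [
  { label: "Employer address", value: "123 Main St", relevance: 0.2 },
  { label: "Wages, tips, other compensation", value: "$52,000.00", relevance: 0.9 },
  { label: "Federal income tax withheld", value: "$6,100.00", relevance: 0.8 },
];

// ranked[0] is the wages entry, because it has the highest relevance (0.9).
const ranked = rankFields(sample);
```

Keeping the score on each field means the UI can decide how many items to show up front and how many to tuck behind a "show more" control.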

How we built it

Frontend: Built with Vite, TypeScript, and Tailwind CSS for a fast, responsive UI.
Backend: Built a secure upload flow and real-time processing pipeline using REST API endpoints.
AI Integration: Used the Gemini API to extract, categorize, and simplify financial data through custom prompt engineering.
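
The AI integration step can be sketched roughly as follows. This is not the project's actual prompt or parsing code: `buildExtractionPrompt`, `parseModelReply`, and the JSON shape are hypothetical, and the call to the Gemini API itself is deliberately left out. The sketch only shows asking for JSON and then validating whatever text comes back.

```typescript
// Hypothetical helper: wraps the document text in instructions that ask for JSON only.
function buildExtractionPrompt(documentText: string): string {
  return [
    "You are helping someone understand a tax document.",
    'Respond with JSON only, shaped as {"summary": string, "fields": [{"label": string, "value": string}]}.',
    "Use plain English and do not add commentary outside the JSON.",
    "Document text:",
    documentText,
  ].join("\n");
}

interface ParsedDocument {
  summary: string;
  fields: { label: string; value: string }[];
}

// Validate the model's reply instead of trusting it. Returns null on any mismatch
// so the caller can retry or show an error.
function parseModelReply(raw: string): ParsedDocument | null {
  // Models sometimes wrap JSON in a Markdown code fence; strip it before parsing.
  const cleaned = raw
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}$/, "");
  let data: unknown;
  try {
    data = JSON.parse(cleaned);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const obj = data as { summary?: unknown; fields?: unknown };
  if (typeof obj.summary !== "string" || !Array.isArray(obj.fields)) return null;
  const fields: { label: string; value: string }[] = [];
  for (const item of obj.fields as unknown[]) {
    if (typeof item !== "object" || item === null) return null;
    const f = item as { label?: unknown; value?: unknown };
    if (typeof f.label !== "string" || typeof f.value !== "string") return null;
    fields.push({ label: f.label, value: f.value });
  }
  return { summary: obj.summary, fields };
}
```

Returning `null` rather than throwing keeps the decision about retrying or surfacing an error in one place, at the call site.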

Challenges we ran into

Setting up secure file uploads in a Vite-based architecture
Prompting Gemini to return structured data rather than vague summaries
Managing inconsistent formats between PDFs, scanned images, and mixed file types
Designing an intuitive UI while maintaining complex backend logic
Fixing a bug that left the web application stuck in the "processing" UI state, which cost us more than an hour of editing and debugging
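
The root cause of the stuck "processing" state is not covered here. As a general guard against that class of bug, one option is to model the upload status as an explicit state machine in which "processing" always has a failure and a timeout exit. The state and event names below are assumptions made for this sketch, not the project's actual code.

```typescript
type UploadStatus = "idle" | "uploading" | "processing" | "done" | "error";

type UploadEvent =
  | "file_selected"
  | "upload_finished"
  | "result_received"
  | "request_failed"
  | "timed_out"
  | "reset";

// Pure transition function: given the current status and an event, return the next status.
// Events that make no sense in the current status are ignored.
function nextStatus(current: UploadStatus, event: UploadEvent): UploadStatus {
  if (event === "reset") return "idle";
  switch (current) {
    case "idle":
      return event === "file_selected" ? "uploading" : current;
    case "uploading":
      if (event === "upload_finished") return "processing";
      if (event === "request_failed" || event === "timed_out") return "error";
      return current;
    case "processing":
      // Every way out of "processing" is listed here, including the unhappy paths.
      if (event === "result_received") return "done";
      if (event === "request_failed" || event === "timed_out") return "error";
      return current;
    default:
      return current; // "done" and "error" are only left via "reset"
  }
}
```

With this shape, the UI only needs a timer that dispatches `"timed_out"` to guarantee the spinner eventually gives way to an error message.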

Accomplishments that we're proud of

Achieved a clean and polished UI despite starting from scratch
Built and integrated a working AI-powered backend within hours
Successfully generated human-readable summaries from raw tax content
Created a functional and helpful tool usable by non-technical individuals

What we learned

Integrating LLMs in real-time apps: We learned how to process user-uploaded documents with Gemini on the fly and return accurate, digestible summaries.
Prompt engineering matters: Small changes in the prompts significantly affected output structure and quality.
Modern frontend architecture with Vite: This was our first large-scale app using Vite, and we explored efficient component design, routing, and performance tuning.
Handling file uploads and API connections: We gained valuable experience in FormData handling, file streaming, and backend pipeline design.
User-first design thinking: Building for non-technical users taught us to focus on clarity, readability, and ease of navigation.
Teamwork under pressure: We got better at dividing work, managing dependencies, syncing across branches, and delivering on tight deadlines.
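
On the file upload side, one common way to cope with mixed PDF and image inputs is to check a file's leading bytes instead of trusting its extension. The sketch below is illustrative rather than the project's actual code, and `detectFileKind` is a hypothetical name. The byte signatures themselves are the standard ones for PDF, PNG, and JPEG.

```typescript
type DetectedKind = "pdf" | "png" | "jpeg" | "unknown";

// True when `bytes` begins with exactly the given signature.
function startsWith(bytes: Uint8Array, signature: number[]): boolean {
  if (bytes.length < signature.length) return false;
  return signature.every((b, i) => bytes[i] === b);
}

// Classify an upload by its first few bytes.
function detectFileKind(bytes: Uint8Array): DetectedKind {
  // "%PDF"
  if (startsWith(bytes, [0x25, 0x50, 0x44, 0x46])) return "pdf";
  // PNG signature: 0x89 "PNG" CR LF 0x1A LF
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  // JPEG files start with the SOI marker FF D8 followed by another FF marker byte
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  return "unknown";
}

// Usage: the first bytes of a PDF spell "%PDF-".
const kind = detectFileKind(new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d])); // "pdf"
```

Routing on the detected kind lets text-based PDFs and scanned images take different extraction paths before anything is sent to the model.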

What's next for DocuLens

Support for more government documents: We're expanding beyond tax forms to include social security letters, immigration documents, FAFSA/student loans, healthcare bills, and more. We want to make all bureaucratic paperwork human-readable.
OCR for handwritten/scanned files: Adding dedicated OCR will make text extraction from handwritten and scanned uploads more accurate.
Multilingual support: We plan to introduce summaries and chatbot interactions in multiple languages to help non-English speakers.
Export & share features: Users will soon be able to download, print, or email simplified versions of their documents.
Privacy & security: We’re working on end-to-end encryption, temporary file storage, and optional user accounts to ensure data remains safe and confidential.