The intelligence layer for finance teams who live inside PDFs.
DocuFlow won 2nd place in the AWS Financial AI Hack 2025 — a global hackathon — for demonstrating how document intelligence can reshape finance operations.
This organisation was created to solve a problem every finance team knows: drowning in unstructured PDFs—invoices, contracts, vendor agreements—while auditors, compliance officers, and analysts chase down data across spreadsheets, email threads, and filing cabinets. DocuFlow turns that chaos into a single, AI-powered pipeline.
Finance teams are buried in documents that don't talk to each other:
- Manual data entry: Keying invoice data by hand is slow, costly, and error-prone, which leads to frequent mistakes and delays.
- Contract clause search: Finding a specific clause in a lengthy financial contract consumes time that analysts rarely have.
- Compliance and auditing risk: Compliance and audit checks remain manual, reactive, and often incomplete, which increases risk exposure.
- Lack of actionable insight: Financial professionals cannot query their own documents directly, which slows decision-making.
DocuFlow exists to fix that. It centralises upload, extraction, validation, and AI-assisted review so analysts can move from ingestion to investigation without leaving the browser.
| Problem | DocuFlow Approach |
|---|---|
| Manual PDF data entry | Landing AI ADE automatically extracts invoice and contract metadata in under 45 seconds |
| Invoices vs. contracts mismatch | Compliance engine compares invoices to contract terms; flags overcharges and policy violations |
| "Where did we agree to that?" | RAG-powered semantic search + Gemini answers questions using your actual contracts |
| Audit scramble | Dashboard cards, exception heatmaps, and inline AI summaries surface the "so what?" behind every document |
| Context switching | AI chat anchored to specific documents—ask questions in context, no copy/paste |
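
To make the invoice-versus-contract row above concrete, here is a minimal sketch of the kind of rule a compliance check could apply. It is not the DocuFlow compliance engine: the `ContractTerms` and `InvoiceLine` types, the field names, and the two rules are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ContractTerms:
    # Hypothetical shape of the terms extracted from a contract.
    vendor: str
    unit_price: Decimal        # agreed price per unit
    payment_terms_days: int    # e.g. 30 for "Net 30"


@dataclass
class InvoiceLine:
    # Hypothetical shape of the data extracted from an invoice.
    vendor: str
    quantity: int
    unit_price: Decimal
    payment_terms_days: int


def check_invoice(line: InvoiceLine, terms: ContractTerms) -> list[str]:
    """Return human-readable flags; an empty list means no rule fired."""
    if line.vendor != terms.vendor:
        return ["vendor mismatch"]
    flags = []
    if line.unit_price > terms.unit_price:
        overcharge = (line.unit_price - terms.unit_price) * line.quantity
        flags.append(f"overcharge of {overcharge}")
    if line.payment_terms_days < terms.payment_terms_days:
        flags.append("payment terms shorter than contract")
    return flags


terms = ContractTerms("Vendor X", Decimal("10.00"), 30)
line = InvoiceLine("Vendor X", 5, Decimal("12.50"), 15)
flags = check_invoice(line, terms)
```

`Decimal` is used instead of `float` so that currency arithmetic stays exact. A real engine would likely load its rules from configuration rather than hard-code them.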
- "Show invoices with anomalies"
- "What are payment terms for Vendor X?"
Responses are grounded in passages retrieved from your own documents, which limits hallucination.
| Category | Value |
|---|---|
| Save Time & Money | 80–90% reduction in manual data entry. Find critical answers in seconds instead of hours. |
| Increase Accuracy | Automated data extraction eliminates transcription errors. Anomalies and risks are flagged before they become costly. |
| Proactive Compliance | Real-time risk and policy monitoring. All documents timestamped and fully searchable for audit readiness. |
| Scalable & Modern | Built on AWS, FastAPI, and React. Designed to handle thousands of documents seamlessly. |
- Upload & Extract: Users upload PDFs through a React frontend. FastAPI and Landing AI extract structured data from invoices and contracts in under 45 seconds.
- Store & Embed: Extracted data is saved in Amazon RDS (PostgreSQL). Text is converted into semantic vector embeddings using Google Gemini and stored in the same database through the pgvector extension.
- Query & Retrieve: User questions are embedded and matched against the stored vectors. The retrieval step returns the most relevant document segments.
- Generate & Answer: Google Gemini generates natural-language answers grounded in the retrieved content.
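
The query, retrieve, and generate steps above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the server-side code. The three-dimensional vectors stand in for real Gemini embeddings, the in-memory `chunks` list stands in for the pgvector-backed table, and `retrieve` and `build_prompt` are hypothetical helper names. The model call itself is left out.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, k=2):
    """chunks is a list of (text, embedding); return the k most similar texts."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Assemble a prompt that tells the model to answer only from the passages."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy vectors standing in for real embeddings (assumption for illustration).
chunks = [
    ("Payment is due within 30 days of invoice date.", [0.9, 0.1, 0.0]),
    ("Either party may terminate with 60 days notice.", [0.0, 0.2, 0.9]),
    ("Late payments accrue interest at 1.5% per month.", [0.7, 0.6, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the question
top = retrieve(query_vec, chunks, k=2)
prompt = build_prompt("What are payment terms for Vendor X?", top)
```

In a deployment like the one described, the similarity ranking would normally run inside PostgreSQL through pgvector rather than in application code. The prompt-assembly step is what keeps the answer tied to retrieved text.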
- Log in & role selection: Choose your role (Auditor, Chartered Accountant, Compliance Officer) for a customised dashboard and workflow.
- Upload invoice PDF: Drag and drop a file. The AI extracts key data in under 45 seconds and runs anomaly detection to catch errors and outliers.
- Ask the AI assistant: Ask questions in natural language and receive precise answers about your invoices and contracts.
- Review risks & summaries: Review flagged risks and contract clause summaries, which saves hours of manual review and supports compliance.
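
The README does not say how anomaly detection is implemented. As one possible illustration, the sketch below flags outlying invoice amounts with a modified z-score based on the median absolute deviation. The technique, the `flag_outliers` name, and the 3.5 threshold are assumptions for this example, not a description of DocuFlow's detector.

```python
from statistics import median


def flag_outliers(amounts: list[float], threshold: float = 3.5) -> list[int]:
    """Return indices of amounts whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # at least half the amounts equal the median; nothing to scale by
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical invoice history for a single vendor.
history = [1000.0, 1020.0, 980.0, 1010.0, 990.0, 5000.0]
suspicious = flag_outliers(history)
```

A median-based score is used here because one very large invoice would inflate a mean and standard deviation enough to hide itself.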
| Repository | Purpose |
|---|---|
| client-side | DocuFlow frontend — React/TypeScript dashboard for upload, document tables, AI chat, and role-aware workflows for Auditors, Chartered Accountants, and Compliance Officers |
| server-side | Document processing API — FastAPI pipeline with Landing AI ADE, Gemini embeddings, PostgreSQL (pgvector), and compliance automation |
Together, they form an end-to-end flow: upload PDFs → extract structured data → vectorise for search → store in Postgres → serve AI-assisted insights and compliance checks through the dashboard.
DocuFlow was born from the AWS Financial AI Hack (team: Skywalkers77), where it placed 2nd globally. The goal was to demonstrate how document intelligence—Landing AI's ADE, Google Gemini, and AWS infrastructure—can reshape finance operations: faster audits, proactive compliance, and analysts freed from repetitive manual work.
This organisation houses the full stack so others can extend, pilot, or learn from the approach.
- Client: client-side/README.md — setup, architecture, component playbook
- Server: server-side/README.md — API reference, endpoints, compliance engine
- Demo: YouTube walkthrough
- Pitch deck: Google Slides


