This file is mostly generated by Cursor AI
A FastAPI-based backend service for processing receipts, extracting transaction data with OCR and an LLM, and managing financial transactions in a PostgreSQL database. We first connected a local PostgreSQL database to a Java/Spring Boot backend service, then migrated the API endpoints and database connection to Python for deployment and seamless integration of the backend and frontend.
- OCR Processing: Extract text from images and PDFs (including scanned documents) using DocTR
- Intelligent Receipt Parsing: Use LLM (OpenAI/Anthropic) to extract structured transaction data from receipt OCR results
- Transaction Management: Store and retrieve financial transactions with categories and payment methods
- AI-Powered Comments: Generate contextual comments for transactions using LLM
- Database Integration: PostgreSQL database with automatic schema initialization
- REST API: Initial web framework used to test endpoints and data insertion into the local PostgreSQL database.
- Spring Boot: Java backend framework used to test endpoints for data retrieval and insertion against the locally set up PostgreSQL database before migrating to a cloud service.
- FastAPI: Modern, fast web framework for building APIs
- DocTR: Document Text Recognition for OCR processing
- PyTorch: Deep learning framework (supports CUDA for GPU acceleration)
- PostgreSQL: Relational database for transaction storage
- OpenAI/Anthropic: LLM providers for text extraction and comment generation
- Uvicorn: ASGI server for running the FastAPI application
- Python 3.9+
- PostgreSQL 15+ (or Docker for containerized setup)
- CUDA-capable GPU (optional, for faster OCR processing)
- API keys for OpenAI or Anthropic
- Clone the repository (if not already done):

  ```bash
  cd finance-agent
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables by creating a `.env` file in the root directory:

  ```bash
  # Database Configuration
  DB_HOST=localhost
  DB_PORT=5436
  DB_NAME=spendy-db
  DB_USER=postgres
  DB_PASSWORD=postgres

  # LLM Configuration (choose one or both)
  OPENAI_API_KEY=your_openai_api_key_here
  OPENAI_MODEL=gpt-4o-mini
  OPENAI_MAX_TOKENS=2000

  # Optional: Anthropic Configuration
  ANTHROPIC_API_KEY=your_anthropic_api_key_here
  ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
  ANTHROPIC_MAX_TOKENS=4096
  ```
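These settings would typically be read with `os.getenv` at startup. A minimal sketch (the variable names match the `.env` above; the function name and structure here are illustrative, not the application's actual loader):

```python
import os

def load_db_config() -> dict:
    """Read database settings from the environment, falling back to the
    same defaults shown in the example .env (illustrative sketch)."""
    return {
        "host": os.getenv("DB_HOST", "localhost"),
        "port": int(os.getenv("DB_PORT", "5436")),
        "dbname": os.getenv("DB_NAME", "spendy-db"),
        "user": os.getenv("DB_USER", "postgres"),
        "password": os.getenv("DB_PASSWORD", "postgres"),
    }

cfg = load_db_config()
```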
- Set up the PostgreSQL database:
Option A: Using Docker Compose (Recommended)
docker-compose up -d
This will start a PostgreSQL container and automatically initialize the schema.
Option B: Manual Setup
- Install PostgreSQL locally
- Create a database named `spendy-db`
- Run the schema initialization script from `docker/postgres/init/01-init-schema.sql`
The cloud-deployed server is available at: https://finaunty-python.onrender.com
- Activate the virtual environment (if not already active):

  ```bash
  source venv/bin/activate
  ```

- Start the FastAPI server:

  ```bash
  uvicorn main:app --reload --host 0.0.0.0 --port 8000
  ```

  The application will be available at http://localhost:8000

- Access the API documentation:
  - Swagger UI: http://localhost:8000/docs
  - ReDoc: http://localhost:8000/redoc
- GET `/`: Root endpoint with API information
- GET `/health`: Health check endpoint
- POST `/ocr`: Extract text from images or PDFs using OCR
  - Accepts: image files (JPEG, PNG) or PDF files
  - Returns: OCR results in DocTR format
  - Automatically uses text extraction for PDFs with text layers (faster)
- POST `/ocr_render`: Extract and process receipt data from images/PDFs
  - Accepts: image files or PDF files
  - Returns: structured transaction data extracted using the LLM
  - Processes OCR results and extracts items, prices, quantities, merchant info, etc.
- POST `/manual-input`: Create a transaction manually
  - Request body: `ManualInputRequest` (amount, note, category, method, occurredAt, currency, merchant)
  - Returns: `ManualInputResponse` with transaction details
- GET `/transactions`: Retrieve all transactions
  - Returns: list of all transactions with category and payment method details
- POST `/insert-transactions`: Bulk insert transactions
  - Request body: list of `InsertTransactionRequest` objects
  - Returns: list of inserted transactions with generated IDs and AI comments
  - Automatically creates categories and payment methods if they don't exist
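The "creates categories and payment methods if they don't exist" behaviour is a classic get-or-create pattern. A minimal sketch against an in-memory SQLite table (illustrative only; the service targets the PostgreSQL `categories` table, and the function name here is hypothetical):

```python
import sqlite3

def get_or_create_category(conn: sqlite3.Connection, name: str) -> int:
    """Return the id of `name` in categories, inserting it if absent."""
    row = conn.execute(
        "SELECT id FROM categories WHERE name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute("INSERT INTO categories (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
first = get_or_create_category(conn, "Groceries")
again = get_or_create_category(conn, "Groceries")  # reuses the existing row
```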
- POST `/call_ai_comment`: Generate AI comments for transactions
  - Request body: list of transaction data
  - Returns: AI-generated comment string
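These endpoints can be called from Python with the standard library alone. A sketch that builds a request for `/call_ai_comment` (the payload fields are illustrative, and `build_comment_request` is not part of the codebase):

```python
import json
from urllib import request

def build_comment_request(transactions: list, base_url: str = "http://localhost:8000") -> request.Request:
    """Build a POST request for /call_ai_comment from a list of transactions."""
    body = json.dumps(transactions).encode("utf-8")
    return request.Request(
        f"{base_url}/call_ai_comment",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_comment_request([
    {"amount": 21.44, "merchant": "FAIRPRICE XTRA", "category": "Groceries"}
])
# request.urlopen(req) would return the AI-generated comment string.
```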
```bash
curl -X POST "http://localhost:8000/ocr_render" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@receipt.jpg"
```

Response:

```json
{
  "merchant": "FAIRPRICE XTRA",
  "payment_method": "CASH",
  "total": 21.44,
  "items": [
    {
      "item": "COLLAR SHABU SHABU",
      "price": 7.70,
      "quantity": 1,
      "total": 7.70
    }
  ]
}
```

```bash
curl -X POST "http://localhost:8000/insert-transactions" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "amount": 21.44,
      "currency": "SGD",
      "category": "Groceries",
      "method": "CASH",
      "occurredAt": "2024-01-15T10:30:00Z",
      "merchant": "FAIRPRICE XTRA",
      "note": "Weekly groceries"
    }
  ]'
```

```bash
curl -X GET "http://localhost:8000/transactions"
```

```
finance-agent/
├── main.py                   # FastAPI application and endpoints
├── requirements.txt          # Python dependencies
├── docker-compose.yml        # Docker Compose configuration for PostgreSQL
├── Dockerfile                # Docker configuration (if applicable)
├── tools/                    # Utility modules
│   ├── call_llm.py           # LLM API integration (OpenAI/Anthropic)
│   └── llm_system_prompt.py  # LLM prompts for receipt parsing
└── docker/
    └── postgres/
        └── init/
            └── 01-init-schema.sql  # Database schema initialization
```
The application uses the following main tables:
- categories: Stores transaction categories (Food, Transport, Shopping, etc.)
- payment_methods: Stores payment methods (Cash, Card, etc.)
- transactions: Stores transaction records with references to categories and payment methods
See docker/postgres/init/01-init-schema.sql for the complete schema.
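The relationships between the three tables can be sketched with an illustrative mini-schema (shown in SQLite for portability; the column names are guesses, and the authoritative DDL is `01-init-schema.sql`):

```python
import sqlite3

# Illustrative mini-schema mirroring the three tables described above;
# the real definitions live in docker/postgres/init/01-init-schema.sql.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE payment_methods (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE transactions (
    id          INTEGER PRIMARY KEY,
    amount      REAL NOT NULL,
    currency    TEXT,
    merchant    TEXT,
    note        TEXT,
    occurred_at TEXT,
    category_id INTEGER REFERENCES categories(id),
    method_id   INTEGER REFERENCES payment_methods(id)
);
""")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```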
The OCR model is automatically configured on startup:
- Uses CUDA if available, otherwise falls back to CPU
- Model: DocTR with `fast_base` detection and `parseq` recognition architectures
- Supports orientation detection and page straightening
The application supports both OpenAI and Anthropic:
- Default provider can be configured via environment variables
- Model selection and token limits are configurable
- See `tools/call_llm.py` for implementation details
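A common pattern for choosing between the two providers is to key off which API key is configured. A hedged sketch (the actual selection logic lives in `tools/call_llm.py` and may differ):

```python
import os

def pick_provider() -> str:
    """Choose an LLM provider based on the configured API keys.
    Sketch only; the real selection logic is in tools/call_llm.py."""
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    if os.getenv("ANTHROPIC_API_KEY"):
        return "anthropic"
    raise RuntimeError("Set OPENAI_API_KEY or ANTHROPIC_API_KEY in .env")
```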
The `--reload` flag enables auto-reload on code changes:

```bash
uvicorn main:app --reload
```

The application uses Python's logging module. Logs are output to stdout at INFO level by default.
- Ensure PyTorch is properly installed
- For GPU support, install CUDA-compatible PyTorch version
- Model weights are cached in `~/.cache/huggingface`
- Verify PostgreSQL is running: `docker-compose ps`
- Check database credentials in the `.env` file
- Ensure the database schema is initialized
- Verify API keys are set in the `.env` file
- Check API rate limits and quotas
- Review logs for specific error messages
[Add your license information here]