Inspiration

Financial fraud is a multi-billion-dollar problem that evolves faster than traditional rule-based defences can keep up. We were inspired by the idea of using decoupled, specialized “agents”—each focusing on a slice of the problem—to build a pipeline that’s both scalable and intelligent. By combining lightweight Python agents (ADK) with state-of-the-art ML (TensorFlow) and LLM reasoning (Gemini), we aimed to create a system that could detect, explain, alert, and report on fraud in real time.

What it does

  • Ingests live transaction streams via Pub/Sub
  • Monitors each transaction with fast, rule-based checks (e.g. amount thresholds, feature anomalies)
  • Analyzes flagged transactions with a hybrid approach: a local Keras model for quick triage + Gemini LLM for deeper reasoning and consensus
  • Alerts operations teams through email, console logs, Slack, and Pub/Sub when high-risk transactions occur
  • Reports all activity into BigQuery, auto-provisioning tables and generating daily/weekly summaries for Looker Studio dashboards
  • Exposes a React UI for submitting transactions, tracking status, viewing analysis results, and embedding real-time charts
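The rule-based monitoring step can be sketched roughly as below. The threshold values and feature names here are illustrative placeholders, not the project's actual rules:

```python
from dataclasses import dataclass, field

# Illustrative thresholds -- the real MonitoringAgent's rules may differ.
AMOUNT_THRESHOLD = 5_000.0   # flag unusually large transactions
ZSCORE_THRESHOLD = 3.0       # flag features far from their historical mean

@dataclass
class Transaction:
    tx_id: str
    amount: float
    features: dict = field(default_factory=dict)  # feature name -> value

def check_transaction(tx, feature_stats):
    """Return the list of rule names the transaction trips.

    feature_stats maps feature name -> (mean, std) from historical data.
    An empty list means "pass"; any entry means "escalate to analysis".
    """
    reasons = []
    if tx.amount > AMOUNT_THRESHOLD:
        reasons.append("amount_threshold")
    for name, value in tx.features.items():
        mean, std = feature_stats.get(name, (0.0, 1.0))
        if std > 0 and abs(value - mean) / std > ZSCORE_THRESHOLD:
            reasons.append(f"anomaly:{name}")
    return reasons
```

Flagged transactions are then published downstream for hybrid analysis.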

How we built it

  1. Google Cloud ADK
    • Defined four agents (MonitoringAgent, HybridAnalysisAgent, AlertAgent, ReportingAgent) each inheriting from BaseAgent and communicating over Pub/Sub channels.
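A simplified sketch of this agent pattern (ADK's actual `BaseAgent` API differs; `publish` here is an injected callback standing in for a Pub/Sub publisher):

```python
from abc import ABC, abstractmethod

class BaseAgent(ABC):
    """Minimal agent skeleton: receive a message, process it, publish results."""

    def __init__(self, name, publish):
        self.name = name
        self.publish = publish  # callable(topic, message) -> None

    @abstractmethod
    def handle(self, message: dict) -> None:
        ...

class MonitoringAgent(BaseAgent):
    def handle(self, message: dict) -> None:
        # Cheap rule-based triage; forward suspicious transactions downstream.
        if message.get("amount", 0) > 5_000:
            self.publish("flagged-transactions", {**message, "flagged_by": self.name})
```

In the real pipeline each agent subscribes to its input topic, and the four agents chain together: Monitoring → HybridAnalysis → Alert / Reporting.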
  2. Machine Learning
    • Trained an SVM and a small neural net in TensorFlow/Keras on a balanced subset of the Kaggle credit-card fraud dataset.
    • Integrated Google’s Gemini API for LLM-powered explanations and fallback reasoning.
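A minimal sketch of the classical-ML side, using scikit-learn's `SVC` on a small synthetic imbalanced dataset as a stand-in for the Kaggle data (class weighting is one common way to handle the skew; the actual training setup differed):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the (heavily imbalanced) credit-card dataset:
# 950 legitimate vs 50 fraudulent transactions in a 4-feature space.
X_legit = rng.normal(0.0, 1.0, size=(950, 4))
X_fraud = rng.normal(3.0, 1.0, size=(50, 4))
X = np.vstack([X_legit, X_fraud])
y = np.array([0] * 950 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" compensates for the skew toward non-fraud.
clf = SVC(kernel="rbf", class_weight="balanced", probability=True)
clf.fit(X_tr, y_tr)

accuracy = clf.score(X_te, y_te)
```

The Keras net plays the same role as quick local triage; Gemini is consulted only for the cases the local models are unsure about.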
  3. Data Streaming & Storage
    • Used Cloud Pub/Sub for durable, asynchronous message passing.
    • Streamed results into BigQuery with retry/backoff and recovery logging.
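The retry/backoff wrapper can be sketched generically in pure Python; in the real agent the `insert` callable would wrap `bigquery.Client.insert_rows_json`, and exhausted retries would be written to a recovery log:

```python
import time
import random

def stream_with_retry(insert, rows, max_attempts=5, base_delay=0.5):
    """Call insert(rows); on failure, retry with exponential backoff + jitter.

    Returns True on success, False after max_attempts failures (at which
    point the caller persists the rows to a recovery log for later replay).
    """
    for attempt in range(max_attempts):
        try:
            insert(rows)
            return True
        except Exception:
            if attempt == max_attempts - 1:
                return False
            # 0.5s, 1s, 2s, 4s ... plus jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return False
```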
  4. Visualization & UI
    • Built a React front end (Create React App + Tailwind) to submit transactions, display progress, show alerts, and embed Looker Studio iframes.
    • Designed Looker Studio dashboards to visualize daily fraud rates, risk-level distributions, and top fraud indicators.

Challenges we ran into

  • Data Imbalance & Volume: The full Kaggle dataset is huge and heavily skewed toward non-fraud. We had to sample it down, mix in synthetic fraud cases, and build a progress monitor to keep local demos snappy.
  • Agent Orchestration: Wiring four independent agents via Pub/Sub required careful topic/subscription naming and async patterns to avoid message loss or duplication.
  • Hybrid Inference Logic: Tuning thresholds (when to call Gemini vs. local ML) and merging outputs into a consensus decision took several iterations.
  • Email & Encoding: Sending multi-recipient Gmail alerts with emojis and HTML bodies surfaced non-breaking-space and Unicode issues that we solved with careful MIME handling and an app-password setup.
  • Cloud Run Deployment: Containerizing each agent and configuring IAM permissions on Cloud Run proved more complex than our hackathon timeline allowed. We pivoted to deploying the React front end on Vercel and our Python agents on Render, giving us seamless CI/CD and public endpoints in minutes.
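The hybrid escalation logic we converged on looks roughly like this (the threshold values are illustrative, and `call_gemini` stands in for the actual Gemini API call):

```python
def hybrid_decision(local_score, call_gemini, low=0.2, high=0.8):
    """Combine local ML triage with LLM reasoning only when needed.

    local_score: fraud probability from the local Keras/SVM model (0..1).
    call_gemini: callable() -> fraud probability from the LLM (expensive).
    Returns (verdict, source).
    """
    # Confident local verdicts skip the LLM entirely -- this is what cut
    # Gemini calls while preserving accuracy.
    if local_score <= low:
        return "legitimate", "local"
    if local_score >= high:
        return "fraud", "local"
    # Uncertain band: escalate to Gemini and average into a consensus score.
    llm_score = call_gemini()
    consensus = (local_score + llm_score) / 2
    return ("fraud" if consensus >= 0.5 else "legitimate"), "consensus"
```

Tuning `low` and `high` is the cost/accuracy dial: widening the band calls Gemini more often.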

Accomplishments that we're proud of

  • End-to-End Pipeline: From submitting a transaction in the UI to seeing it reflected in a Looker Studio dashboard, the entire flow runs seamlessly in a few seconds.
  • Hybrid AI Strategy: We reduced Gemini calls by >70% while maintaining accuracy by escalating only the riskiest transactions.
  • ADK Best Practices: Each agent follows proper state management, event-driven loops, and non-blocking async integration with Pub/Sub.
  • Enterprise-Grade Reporting: The Reporting Agent auto-creates BigQuery schemas, streams data with retries, and generates real SQL-backed reports for daily summaries and trends.

What we learned

  • ADK Patterns: How to structure agents with InvocationContext, EventActions, and session state for both demo and production modes.
  • Async Pub/Sub Workflows: Best practices around batching, flow control, streaming pull futures, and cleanup in Python.
  • Hybrid ML & LLM: The power of a “local first, AI next” approach to balance cost, latency, and accuracy.
  • Cloud-Native Tooling: Auto-provisioning BigQuery datasets/tables, handling errors with exponential backoff, and embedding Looker Studio dashboards.
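The batching/flow-control pattern from those Pub/Sub workflows, reduced to a stdlib `asyncio` sketch: a bounded queue stands in for the client library's `FlowControl` settings, and `flush` stands in for a batched BigQuery insert:

```python
import asyncio

async def consume_in_batches(queue, flush, batch_size=10):
    """Drain a bounded asyncio.Queue and flush messages in batches.

    The bounded queue gives backpressure (flow control): producers block on
    put() once too many messages are in flight. A None sentinel shuts the
    consumer down after flushing any partial batch.
    """
    batch = []
    while True:
        msg = await queue.get()
        if msg is None:            # sentinel: flush remainder and stop
            if batch:
                await flush(batch)
            return
        batch.append(msg)
        if len(batch) >= batch_size:
            await flush(batch)     # e.g. one BigQuery insert per batch
            batch = []

async def demo():
    queue = asyncio.Queue(maxsize=20)  # maxsize plays the FlowControl role
    batches = []

    async def flush(batch):
        batches.append(list(batch))

    consumer = asyncio.create_task(consume_in_batches(queue, flush, batch_size=10))
    for i in range(25):
        await queue.put({"tx_id": i})
    await queue.put(None)
    await consumer
    return batches
```

The real subscribers use streaming pull futures from `google-cloud-pubsub` rather than a plain queue, but the backpressure-and-batch shape is the same.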

What's next for ADKFinancialFraudDetection

  • Agent-to-Agent (A2A) Protocol: Introduce gRPC-based A2A for lower-latency, synchronous handoffs.
  • Feedback Loop: Capture analyst verdicts to continually retrain the ML model and refine thresholds.
  • Expanded Notifications: Add SMS, PagerDuty, and Teams integrations for multi-channel alerting.
  • Advanced Analytics: Build out monthly/quarterly insight reports (customer segmentation, fraud ROI) and auto-distribute via email/Slack.
  • Scalability & Security: Harden each container for Cloud Run, add IAM controls for Pub/Sub, and integrate VPC Service Controls.
  • Improved UI: A more fleshed-out UI and UX for visualizing agent orchestration.

Built With

  • asyncio
  • bigquery
  • chart.js
  • cloud-run
  • docker
  • fastapi
  • gcloud
  • github-actions
  • gmail-smtp
  • google-adk
  • google-cloud-bigquery
  • google-cloud-pubsub
  • google-generativeai
  • javascript
  • looker-studio
  • python
  • python-dotenv
  • react
  • react-toastify
  • recharts
  • render
  • scikit-learn
  • slack-webhook
  • tailwind-css
  • tensorflow/keras
  • terraform
  • typescript
  • typing-extensions
  • vercel
  • vertex-ai