Inspiration
AI agents are starting to spend real money. OpenAI and Walmart let users buy products without leaving ChatGPT. Google launched agentic checkout. Perplexity partnered with PayPal for in-chat purchases. McKinsey projects AI agents will handle $3-5 trillion in commerce by 2030.
But right now, giving an AI agent your credit card means trusting it completely. There are no spending limits, no merchant restrictions, no human approval for big purchases. And these agents can be tricked. In August 2025, Guardio Labs created a fake Walmart store in under 10 seconds and watched Perplexity's AI browser autofill real credit card details and complete the purchase on the scam site. No human ever confirmed the buy. Separately, a company lost $250,000 when attackers manipulated their AI banking assistant through prompt injection.
New protocols like Google's AP2 are building the trust infrastructure for this future, but they only work when both the buyer's agent and the merchant's system support the protocol. Today, most AI agents just browse regular websites and fill out checkout forms like a human would, with zero protocol protection.
We asked: what if no money could move until an independent system confirmed the purchase matched what the human actually asked for?
That's Argus, named after the all-seeing guardian of Greek mythology.
What it does
Argus sits between any AI shopping agent and the payment. When an agent tries to buy something, Argus:
- Intercepts the purchase before any money moves
- Verifies intent using a two-stage AI pipeline to confirm the agent is actually buying what the user asked for, not something it hallucinated or was tricked into selecting
- Enforces spending rules like transaction limits, merchant whitelists, category budgets, and approval thresholds
- Controls the payment by issuing a scoped single-use virtual card locked to the specific merchant and amount. If denied, no card exists, and the agent simply cannot pay. If the purchase needs review, the user gets a real-time notification to approve or deny.
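The payment-control step can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `VirtualCard` and `issue_card` helper (neither name comes from the Argus codebase, and no real card network is involved) to show how a card scoped to one merchant and one amount behaves, and why a denied purchase leaves the agent with nothing to charge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualCard:
    """Hypothetical single-use card scoped to one merchant and one amount."""
    merchant: str
    amount_cents: int
    used: bool = False

    def authorize(self, merchant: str, amount_cents: int) -> bool:
        """Approve a charge only if it matches the scope and the card is unused."""
        if self.used:
            return False
        if merchant != self.merchant or amount_cents != self.amount_cents:
            return False
        self.used = True  # single use: any later charge is rejected
        return True


def issue_card(decision: str, merchant: str, amount_cents: int) -> Optional[VirtualCard]:
    """Return a scoped card for an approved purchase, otherwise None."""
    if decision != "approve":
        return None  # denied or awaiting review: no card exists to charge
    return VirtualCard(merchant=merchant, amount_cents=amount_cents)
```

In this sketch a charge from a different merchant, a different amount, or a second attempt all fail at `authorize`, so enforcement does not depend on the agent behaving well.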
Users monitor everything through a live dashboard where they can watch transactions, review the AI's reasoning, and handle approvals.
Agents connect to Argus in two ways: through Google's A2A protocol (Argus publishes an Agent Card so compatible agents can discover it automatically), or through our ADK plugin that drops into any Google ADK agent with a few lines of code.
Argus supports multiple agent profiles out of the box. Different agents (shopping, travel, procurement) each get their own rules and budgets. A company can run a shopping agent, an IT subscription agent, and a customer service agent, each governed by different policies through one Argus instance. Enterprise-ready from day one.
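As a rough illustration of per-profile governance, the sketch below keeps one policy per agent profile and evaluates the same kind of purchase against whichever profile the calling agent belongs to. The `Policy` fields, profile names, merchants, and limits are all invented for the example and are not Argus's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical spending policy attached to one agent profile."""
    max_per_transaction_cents: int
    approval_threshold_cents: int
    allowed_merchants: frozenset = field(default_factory=frozenset)


# Assumed example profiles; real deployments would load these from storage.
PROFILES = {
    "shopping": Policy(20_000, 10_000, frozenset({"example-store.com"})),
    "it-subscriptions": Policy(100_000, 50_000, frozenset({"example-saas.com"})),
}


def check(profile: str, merchant: str, amount_cents: int) -> str:
    """Return 'approve', 'needs_approval', or 'deny' for one purchase."""
    policy = PROFILES.get(profile)
    if policy is None:
        return "deny"  # unknown profiles get no spending authority
    if merchant not in policy.allowed_merchants:
        return "deny"
    if amount_cents > policy.max_per_transaction_cents:
        return "deny"
    if amount_cents > policy.approval_threshold_cents:
        return "needs_approval"  # route to the human for review
    return "approve"
```

The same $300 purchase is routed to human review under the shopping profile and approved under the IT subscription profile, provided each merchant is on the relevant allowlist.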
How we built it
Backend (FastAPI + SQLite): REST API with JWT auth, WebSocket updates, and a two-stage Gemini evaluation pipeline. The first AI call extracts what the user wanted from chat history alone (it never sees the product page, protecting it from prompt injection). A rules engine then checks spending limits and merchant restrictions. The second AI call cross-checks everything and makes the final decision.
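The ordering of the pipeline can be sketched as follows. Both Gemini calls are replaced by hypothetical stand-in functions (`fake_extract_intent`, `fake_cross_check`), the rules engine is reduced to a single threshold, and denying early on a rules failure is a simplification of this example. The point is the data flow: the first stage receives only the chat history, never the purchase or page content.

```python
from typing import Callable, Sequence


def evaluate_purchase(
    chat_history: Sequence[str],
    purchase: dict,
    extract_intent: Callable[[Sequence[str]], str],
    rules_check: Callable[[dict], str],
    cross_check: Callable[[str, dict], bool],
) -> str:
    """Run intent extraction, rules, then cross-check; return the decision."""
    # Stage 1: the model sees the conversation only, not the product page.
    intent = extract_intent(chat_history)
    # Deterministic rules: limits and merchant restrictions.
    rules_result = rules_check(purchase)
    if rules_result == "deny":
        return "deny"
    # Stage 2: compare the extracted intent with the proposed purchase.
    if not cross_check(intent, purchase):
        return "deny"
    return rules_result  # "approve" or "needs_approval"


# Stand-ins for the two model calls and the rules engine (assumptions).
def fake_extract_intent(chat_history: Sequence[str]) -> str:
    return chat_history[-1].lower()


def fake_cross_check(intent: str, purchase: dict) -> bool:
    return purchase["product"].lower() in intent


def fake_rules(purchase: dict) -> str:
    return "approve" if purchase["amount_cents"] <= 10_000 else "needs_approval"
```

Passing the stages in as callables keeps the trust boundary visible in the signature: nothing the shopping agent scraped can reach `extract_intent`.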
Frontend (React + Vite + shadcn/ui): A fintech-grade dashboard with live transaction feeds, spending category management, human approval workflows, and connection key management.
Shopping Agent (Google ADK + Gemini Computer Use): An AI agent that takes natural language requests, opens a real browser, navigates e-commerce sites, compares products, and calls Argus before paying. If denied, it automatically looks for alternatives.
A2A Protocol Integration: Argus publishes a standard Agent Card so other agents can discover and interact with it programmatically.
Prompt Injection Defense: Instead of trying to filter malicious content, we built isolation into the architecture. The shopping agent (which may have browsed pages with hidden instructions) never influences how its purchase gets categorized. The model that extracts the user's intent sees only the original conversation, not the potentially compromised browsing context. Different model, different context, different trust boundary.
Hedera Consensus Service (Audit Trail): Every evaluation decision is hashed (SHA-256) and submitted to a Hedera HCS topic on testnet. This creates an immutable, timestamped audit trail that anyone can independently verify via HashScan. If a dispute arises weeks later, the full decision record exists on-chain with cryptographic proof it hasn't been altered.
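The hashing half of this audit trail can be sketched with the standard library alone. The example assumes a decision record is a JSON-serializable dict and that it is canonicalized with sorted keys before hashing. The record fields are invented, and submitting the digest to a Hedera topic is deliberately left out.

```python
import hashlib
import json


def decision_digest(record: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON form of the record."""
    # Sorted keys and fixed separators make the digest independent of
    # key order and whitespace in the original record.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, published_digest: str) -> bool:
    """Check a stored record against the digest that was published earlier."""
    return decision_digest(record) == published_digest
```

Anyone holding the stored record can recompute the digest and compare it with the published one; changing a single field produces a different digest, which is what makes later tampering detectable.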
Challenges we ran into
- The "AI checking AI" question. Using one AI to verify another sounds circular. We solved this by putting the evaluation model in a completely separate security boundary. The shopping agent has been exposed to potentially malicious web content; Argus only sees the clean original conversation. Same principle as two independent auditors.
- Browser automation reliability. AI-driven browsing is unpredictable. Pages load differently, layouts shift, CAPTCHAs appear. We built retry logic and fallback flows so the agent recovers gracefully.
- Building in parallel. The agent, API, and dashboard are deeply interconnected. We wrote a comprehensive data spec as our single source of truth before writing any code, so two people could build simultaneously without stepping on each other.
- Real-time sync. Keeping the dashboard, API, and agent in sync during human approval required careful WebSocket orchestration and polling fallbacks.
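The retry-and-fallback pattern mentioned under browser automation reliability can be sketched generically. This is a minimal version assuming exponential backoff and an optional fallback callable. The function name and parameters are illustrative, not the agent's actual code.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    step: Callable[[], T],
    fallback: Optional[Callable[[], T]] = None,
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Try a flaky step several times, then fall back or re-raise."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return step()
        except Exception as exc:  # browser steps can fail in many ways
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    if fallback is not None:
        return fallback()  # e.g. an alternative flow or product
    assert last_error is not None
    raise last_error
```

Injecting `sleep` keeps the helper testable without real delays. A browser step such as "click the checkout button" would be passed as `step`, with "search for an alternative" as `fallback`.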
Accomplishments that we're proud of
- A complete end-to-end flow: user types a request, AI browses real websites, Argus intercepts and evaluates, virtual card gets issued, agent completes checkout
- Prompt injection defense built into the architecture, not bolted on as a filter
- Two integration paths: A2A protocol discovery and a direct ADK plugin
- Payment-level enforcement through scoped virtual cards. Argus doesn't just advise, it controls whether money can move
- Multiple agent profiles with independent rules and budgets, enterprise-ready out of the box
- Professional fintech dashboard with real-time updates and human-in-the-loop approvals
- A comprehensive 10+ table data model with full audit trail
- Immutable on-chain audit log via Hedera Consensus Service, independently verifiable through HashScan
What we learned
- The agentic payments space (AP2, ACP, x402, UCP) is moving fast, and the right move is to complement these protocols, not compete with them
- Define every API contract before writing code when building in parallel
- Defend against prompt injection through isolation, not filtration
- The real security gap isn't in structured protocol flows. It's when agents fall back to browsing regular websites with no protocol protection
What's next for Argus
- BankSocial integration. Argus is designed to sit on top of BankSocial's ecosystem, with intent verification on top of Secura's fraud detection and real virtual card issuance through Remint's card rails. Argus catches bad decisions; Secura catches bad actors.
- AP2 Risk Signal Provider. AP2's spec has an open Risk Payload field waiting for ecosystem tools. We want Argus to be the first engine that fills it with prompt injection scores, intent-drift signals, and seller verification data.
- Real virtual card issuance. Moving from mock cards to live issuance through BankSocial's Remint, making the single-use payment enforcement production-grade.
- SDK for any agent framework. Expanding beyond Google ADK so any purchasing agent can plug into Argus regardless of the underlying framework.
Built With
- docker
- fastapi
- gemini-computer-use
- google-a2a-protocol
- google-adk
- google-gemini
- hedera
- jwt
- playwright
- python
- react
- shadcn/ui
- sqlalchemy
- sqlite
- tailwindcss
- typescript
- vite
- websocket