TrustGate

The approval layer AI workflows forgot.

vibeFORWARD Hackathon · Track 2 — Guardrails & Safe Use · New York City

TrustGate sits between an AI agent and the humans who have to sign off, running live guardrail checks on every AI decision and showing why each recommendation should or shouldn't be trusted. We're starting with AI-driven vendor invoice approval: a workflow every Fortune 500 runs, and a prime target of Business Email Compromise, which accounted for $2.9 billion in FBI-reported losses in 2023 alone.

Why this problem, right now

Enterprises are deploying AI into high-stakes workflows — invoice approvals, insurance claims, hiring, policy interpretation — faster than they can govern them. When it goes wrong, they eat the liability. Today's "guardrail" is hope, plus a human rubber-stamp.

Four incidents that make this concrete:

When	What happened	What it means
Mar 2024	NYC's own MyCity chatbot told small business owners they could take worker tips, fire people for reporting harassment, and refuse Section 8 tenants — all illegal under NYC law	The City of New York shipped an unguarded LLM into a compliance-sensitive workflow. It stayed live after The Markup reported it.
Feb 2024	Air Canada was ruled legally liable for a fare its chatbot hallucinated — tribunal rejected the "the AI is a separate entity" defense	AI hallucinations are no longer embarrassing; they are a balance-sheet liability.
2023	JPMorgan banned ChatGPT firm-wide for 250,000+ employees	The biggest US bank chose "turn it off" over "deploy without a trust layer." That trust layer is the opportunity.
Oct 2025	Deloitte Australia issued a partial refund on a government report due to AI-hallucinated citations	Even Big 4 consultancies, with all the process controls money can buy, ship hallucinated compliance content to paying clients.

And the regulatory tailwind: NYC Local Law 144 (enforced since July 2023) already requires bias audits for AI tools used in hiring and promotion decisions. AI-governance-as-a-product isn't speculative; it's a line item.

What we built

For every AI invoice recommendation, TrustGate runs four guardrail checks in parallel before the decision ever reaches the AP manager's queue:

1. Policy Check (LLM-grounded)

Claude compares the invoice against a written policy document (policy.md) and cites the exact clause violated. Catches over-limit amounts, unapproved vendor categories, missing PO references.
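One way the grounding step could be sketched, without showing the actual Claude call: build a prompt that confines the model to the policy text, then reject any verdict whose cited clause does not appear verbatim in `policy.md`. The function names, the prompt wording, and the `clause` field are assumptions for illustration, not the code in `guardrails.py`.

```python
import json


def build_policy_prompt(policy_text: str, invoice: dict) -> str:
    """Assemble a prompt that confines the model to clauses in policy_text."""
    return (
        "Review the invoice against the policy below.\n"
        "If a clause is violated, quote it exactly as written.\n"
        'Reply with JSON only: {"status": "pass" | "flag" | "block", '
        '"reason": "...", "clause": "..."}\n\n'
        f"POLICY:\n{policy_text}\n\n"
        f"INVOICE:\n{json.dumps(invoice, indent=2)}"
    )


def parse_policy_verdict(raw_reply: str, policy_text: str) -> dict:
    """Turn the model's reply into a check result, rejecting ungrounded citations."""
    try:
        verdict = json.loads(raw_reply)
    except json.JSONDecodeError:
        verdict = None
    if not isinstance(verdict, dict) or verdict.get("status") not in ("pass", "flag", "block"):
        return {"status": "flag", "reason": "Unparseable model reply", "evidence": []}

    clause = str(verdict.get("clause") or "")
    if verdict["status"] != "pass" and (not clause or clause not in policy_text):
        # A violation must quote a clause that exists verbatim in the policy.
        return {"status": "flag", "reason": "Cited clause not found in policy", "evidence": [clause]}

    return {
        "status": verdict["status"],
        "reason": str(verdict.get("reason", "")),
        "evidence": [clause] if clause else [],
    }
```

The verbatim-substring test is deliberately strict: a paraphrased citation is downgraded to a flag for human review rather than trusted.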

2. Source Verification (Tavily-powered)

For any external claim (vendor legitimacy, business address, domain authenticity) we query Tavily for live web evidence and surface the source URLs. This is the guardrail that catches vendor impersonation fraud (lookalike domains, shell companies, vendors that don't exist on the open web), the exact pattern behind billions of dollars in FBI-reported Business Email Compromise losses.
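When no Tavily key is configured, the check falls back to a TLD heuristic. A minimal sketch of what such a fallback might look like follows; the function names, the example domains, and the choice to block on a same-name, different-TLD match are assumptions, and the live Tavily lookup is not shown.

```python
def split_domain(domain: str) -> tuple:
    """Split 'acmesupplies.co' into ('acmesupplies', 'co').

    Naive on purpose: multi-part suffixes such as '.co.uk' are not handled.
    """
    name, _, tld = domain.lower().strip().rpartition(".")
    return name, tld


def tld_heuristic(invoice_domain: str, known_domain: str) -> dict:
    """Compare the domain on the invoice with the vendor domain on file."""
    inv_name, inv_tld = split_domain(invoice_domain)
    known_name, known_tld = split_domain(known_domain)
    evidence = [invoice_domain, known_domain]

    if (inv_name, inv_tld) == (known_name, known_tld):
        return {"status": "pass", "reason": "Domain matches vendor on file", "evidence": evidence}
    if inv_name == known_name:
        # Same name under a different TLD is the classic lookalike pattern.
        reason = f"Lookalike domain: .{inv_tld} instead of .{known_tld}"
        return {"status": "block", "reason": reason, "evidence": evidence}
    return {"status": "flag", "reason": "Domain does not match vendor on file", "evidence": evidence}
```

For example, `tld_heuristic("acmesupplies.co", "acmesupplies.com")` returns a `block` verdict, while an unrelated domain is only flagged for review.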

3. Anomaly Check (statistical)

Flags invoices whose amount sits more than 2σ above a vendor's historical mean, or first-time-vendor spikes over policy thresholds. Non-LLM on purpose — not every guardrail should depend on a model.
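The 2σ rule can be sketched with the standard library alone. This version assumes the sample standard deviation, treats a vendor with fewer than two past invoices as first-time, and takes the first-time cap as a parameter; all three choices, and the function name, are illustrative assumptions.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def anomaly_check(
    amount: float,
    history: Sequence[float],
    sigma_limit: float = 2.0,
    first_time_cap: Optional[float] = None,
) -> dict:
    """Flag amounts more than sigma_limit standard deviations above the vendor mean."""
    if len(history) < 2:
        # Too little history for a z-score: fall back to the first-time-vendor cap.
        if first_time_cap is not None and amount > first_time_cap:
            reason = f"First-time vendor amount {amount:,.2f} exceeds cap {first_time_cap:,.2f}"
            return {"status": "flag", "reason": reason, "evidence": []}
        return {"status": "pass", "reason": "Not enough history to score", "evidence": []}

    mu, sd = mean(history), stdev(history)
    if sd > 0:
        z = (amount - mu) / sd
    else:
        # Identical past amounts: any increase is treated as infinitely unusual.
        z = float("inf") if amount > mu else 0.0

    status = "flag" if z > sigma_limit else "pass"
    reason = f"z-score {z:.2f} against historical mean {mu:,.2f}"
    return {"status": status, "reason": reason, "evidence": [f"z={z:.2f}"]}
```

Only amounts above the mean are flagged, matching the "above a vendor's historical mean" wording; unusually small invoices pass.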

4. Confidence Decomposition (explainability)

The AI agent's "94% confident" is meaningless to a human reviewer. TrustGate breaks that number into three weighted factors and flags when the factors don't support the headline confidence. (This is the Trust & Transparency virtue we borrow from Track 1.)
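A sketch of the decomposition idea: compute the confidence that the weighted factors actually support and flag the decision when the headline number exceeds it by more than a tolerance. The factor names, weights, and the 0.15 tolerance below are hypothetical placeholders, since the three factors are not named here.

```python
def decompose_confidence(headline: float, factors: dict, tolerance: float = 0.15) -> dict:
    """Compare a headline confidence with a weighted average of factor scores.

    factors maps a factor name to a (weight, score) pair, with scores in [0, 1].
    """
    total_weight = sum(weight for weight, _ in factors.values())
    if total_weight <= 0:
        raise ValueError("factor weights must sum to a positive number")

    supported = sum(weight * score for weight, score in factors.values()) / total_weight
    gap = headline - supported
    status = "flag" if gap > tolerance else "pass"
    return {
        "status": status,
        "reason": f"Headline {headline:.0%} vs factor-supported {supported:.0%}",
        "evidence": [f"{name}: weight={w}, score={s}" for name, (w, s) in factors.items()],
        "supported": supported,
    }


# Hypothetical factors for the "opaque overconfidence" case.
example = decompose_confidence(
    0.94,
    {"vendor_history": (0.4, 0.9), "policy_fit": (0.3, 0.8), "rationale_depth": (0.3, 0.3)},
)
```

With these placeholder numbers the factors support roughly 69%, so a 94% headline is flagged.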

Each check returns { status: "pass" | "flag" | "block", reason, evidence }. Every verdict and human decision is written to an append-only audit log that you can export as JSON, because in regulated workflows, "we have an audit trail" is non-negotiable.
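The result shape and the audit log could be sketched as follows. The repo uses Pydantic schemas in `models.py`; this version uses standard-library dataclasses so it runs without dependencies. The `check` field, the "worst status wins" aggregation, and the JSON Lines file format are assumptions for illustration.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List, Literal

Status = Literal["pass", "flag", "block"]
SEVERITY = {"pass": 0, "flag": 1, "block": 2}


@dataclass
class CheckResult:
    check: str  # which guardrail produced this result (assumed field)
    status: Status
    reason: str
    evidence: List[str] = field(default_factory=list)


def overall_status(results: List[CheckResult]) -> str:
    """Return the most severe status; no results counts as 'pass'."""
    return max((r.status for r in results), key=SEVERITY.__getitem__, default="pass")


def append_audit(path: Path, invoice_id: str, results: List[CheckResult]) -> dict:
    """Append one entry to the log; existing lines are never rewritten."""
    entry = {
        "ts": time.time(),
        "invoice_id": invoice_id,
        "overall": overall_status(results),
        "checks": [asdict(r) for r in results],
    }
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def export_audit(path: Path) -> List[dict]:
    """Read the whole log back as a list ready for JSON export."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]
```

Opening the file in append mode keeps writes additive; real tamper-resistance would need more than this, such as hash chaining or write-once storage.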

Demo (90 seconds, on stage)

Five seeded invoices, each designed to showcase a different guardrail:

1. Clean invoice: known vendor, normal amount → all green, one-click approve.
2. Vendor impersonation: looks like Acme Supplies, but the domain is .co, not .com → Source Verification blocks, with Tavily evidence.
3. Policy violation: $47,000 invoice against a $10k single-approver cap → Policy Check flags with cited clause.
4. Historical anomaly: known vendor, 4× their usual amount → Anomaly Check flags with z-score.
5. Opaque overconfidence: AI says 94% approve, but the rationale is thin → Confidence Decomposition flags, showing that the factors don't support the number.

The demo arc is the pitch arc.

Market

The AI-guardrails category is ~18 months old, so we triangulate from three measured adjacencies plus the cost-of-inaction.

Emerging category, AI TRiSM (Trust, Risk, and Security Management): named by Gartner as a top strategic tech trend. Analyst estimates for AI governance tooling cluster around $1–3B today, growing to $10–15B by 2028–2030, with wide variance because the category is still being defined.

Established adjacencies TrustGate plugs into:

Market	Size today	Why it matters
Accounts Payable automation	~$3–5B, ~$7–12B by 2030	Our beachhead
Governance, Risk & Compliance (GRC) software	~$50B+	Adjacent buying center
AI model risk / ML observability (Arize, Fiddler, Credo AI)	~$1–3B, fast-growing	Direct competitive set

Cost of doing nothing (often more persuasive than TAM with enterprise judges):

$2.9B in FBI-reported Business Email Compromise losses in 2023 — invoice/vendor fraud is a major share
JPMorgan's 250,000-employee ChatGPT ban = a rough proxy for the productivity tax every enterprise pays when they have no trust layer
Air Canada = every chatbot hallucination now carries a legal price tag that didn't exist 24 months ago

Initial wedge (SOM): mid-market companies (500–5,000 employees) in regulated industries (financial services, healthcare, insurance, government contractors) running AI-assisted AP or procurement. That is tens of thousands of US companies, each a $10k–$100k/yr spender once the category matures. Even a 1% wedge means hundreds of customers, or seven to eight figures of ARR.

How it works — architecture at a glance

┌──────────────┐       ┌──────────────────────┐       ┌───────────────┐
│  AI Agent    │──────▶│  TrustGate Backend   │──────▶│  Approval     │
│ (recommends) │       │  (4 parallel checks) │       │  Queue UI     │
└──────────────┘       │                      │       └───────────────┘
                       │  • Policy (Claude)   │               │
                       │  • Source (Tavily)   │               ▼
                       │  • Anomaly (stats)   │       ┌───────────────┐
                       │  • Confidence (LLM)  │       │  AP Manager   │
                       └──────────────────────┘       │  approves /   │
                                │                     │  overrides    │
                                ▼                     └───────────────┘
                       ┌──────────────────────┐               │
                       │  Append-only Audit   │◀──────────────┘
                       │  Log (JSON export)   │
                       └──────────────────────┘
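The "4 parallel checks" box can be sketched with `asyncio.gather`. The stub checks below stand in for the real guardrails, and the rule that a crashed check becomes a flag rather than a silent pass is an assumption about desired behavior, not a description of `main.py`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

Check = Callable[[dict], Awaitable[dict]]


async def stub_pass(invoice: dict) -> dict:
    """Placeholder for a real guardrail; always passes."""
    await asyncio.sleep(0)
    return {"status": "pass", "reason": "stub", "evidence": []}


DEFAULT_CHECKS: Dict[str, Check] = {
    "policy": stub_pass,
    "source": stub_pass,
    "anomaly": stub_pass,
    "confidence": stub_pass,
}


async def run_guardrails(invoice: dict, checks: Optional[Dict[str, Check]] = None) -> List[dict]:
    """Run every check concurrently and return one verdict per check."""
    checks = DEFAULT_CHECKS if checks is None else checks
    results = await asyncio.gather(
        *(fn(invoice) for fn in checks.values()),
        return_exceptions=True,  # one failing check must not sink the others
    )
    verdicts = []
    for name, result in zip(checks, results):
        if isinstance(result, Exception):
            # A crashed check is surfaced for review instead of passing silently.
            result = {"status": "flag", "reason": f"{name} check failed: {result}", "evidence": []}
        verdicts.append({"check": name, **result})
    return verdicts
```

Because the checks are awaited together, total latency is roughly that of the slowest check rather than the sum of all four.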

Stack

Frontend: Lovable → React + Tailwind (reference frontend: a single static HTML file with Tailwind CDN + vanilla JS)
Backend: Python FastAPI (single main.py, ~40 lines of async orchestration)
LLM: Anthropic Claude (claude-sonnet-4-6)
Search: Tavily API
Storage: in-memory + JSON file audit log (intentional — no DB fights at hackathon scope)

Repo layout

.
├── README.md          ← you are here
├── SPEC.md            ← full product + build spec
├── policy.md          ← invoice approval policy (fed to the Policy guardrail)
├── pyproject.toml     ← Python deps (managed by uv)
├── .env.example       ← copy to .env and fill in keys
├── src/
│   ├── main.py        ← FastAPI app + routes
│   ├── llm.py         ← Lava-proxied Claude helper
│   ├── guardrails.py  ← 4 guardrail checks (parallel)
│   ├── seed.py        ← 5 demo invoices
│   └── models.py      ← Pydantic schemas
└── hackathon-info.pdf ← vibeFORWARD brief

For the full build plan, seed-data design, API surface, UI spec, demo script, and judge-specific landing points, see SPEC.md.

Run it locally

Prerequisites

Python 3.11+

uv — fast Python package manager

curl -LsSf https://astral.sh/uv/install.sh | sh   # macOS/Linux
# or: brew install uv

Setup (one-time)

git clone https://github.com/naga-k/trustgate
cd trustgate

# install deps into an isolated .venv
uv sync

# copy the env template and fill in your keys
cp .env.example .env
# edit .env: LAVA_API_KEY is required, TAVILY_API_KEY is optional (falls back to heuristic)

Run the whole app

uv run uvicorn src.main:app --reload --port 8787

One process serves everything:

http://localhost:8787/ — the app (reference frontend, served from frontend/index.html)
http://localhost:8787/docs — auto-generated OpenAPI UI, every route clickable

The reference frontend is intentionally a single static HTML file (Tailwind CDN + vanilla JS) so the whole system runs from one command with no build step, no node_modules, no deploy dependency. Teammates can replace it with a richer frontend (Lovable, Next.js, etc.) and just point at the same API.

Smoke test

# health
curl http://localhost:8787/api/health

# list all 5 seed invoices with their AI recommendations
curl http://localhost:8787/api/invoices | jq

# run all 4 guardrails on the vendor-impersonation invoice
curl -X POST http://localhost:8787/api/invoices/INV-8822/run-guardrails | jq

API routes

Route	Purpose
`GET /api/health`	liveness check
`GET /api/invoices`	list all invoices with AI rec + latest verdicts
`GET /api/invoices/{id}`	full detail for one invoice
`POST /api/invoices/{id}/run-guardrails`	fire all 4 checks in parallel
`POST /api/invoices/{id}/decide`	record a human approve/review/block decision
`GET /api/audit-log`	append-only verdict + decision log

Environment variables

Var	Required?	What
`LAVA_API_KEY`	yes	Lava gateway key (`aks_live_...` or `aks_test_...`). All Claude traffic routes through Lava.
`TAVILY_API_KEY`	optional	Tavily search key for live vendor-legitimacy verification. Without it, Source Verification falls back to a TLD heuristic.
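A minimal sketch of how the required and optional keys might be read at startup. The `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the values from `.env` have already been loaded into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    lava_api_key: str
    tavily_api_key: Optional[str]  # None means: use the TLD-heuristic fallback


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Fail fast on the required key; treat a blank optional key as absent."""
    env = os.environ if env is None else env
    lava = env.get("LAVA_API_KEY", "").strip()
    if not lava:
        raise RuntimeError("LAVA_API_KEY is required; copy .env.example to .env and fill it in")
    tavily = env.get("TAVILY_API_KEY", "").strip() or None
    return Settings(lava_api_key=lava, tavily_api_key=tavily)
```

Failing at startup gives a clearer error than a rejected request on the first guardrail run.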

Team setup

After cloning:

uv sync                     # install deps
cp .env.example .env        # then paste your Lava key (and Tavily if you have one)
uv run uvicorn src.main:app --reload --port 8787

That's it. No global Python install, no pip install juggling, no venv activation ceremony — uv handles all of it.

For the frontend (Lovable-generated React app), see the frontend/ directory once it's committed, or point your Lovable project's API base URL at http://localhost:8787.

Team

Built at vibeFORWARD NYC.

License

MIT — see LICENSE (to be added).
