AI-powered due diligence copilot for venture capital investors. Upload any startup's pitch deck and get a confidence-scored investment analysis with verified claims, competitive intelligence, and a voice-narrated deal memo.
- Overview
- Key Features
- Tech Stack
- Project Structure
- Setup Instructions
- Environment Variables
- How It Works
- API Endpoints
- Telemetry
- Deployment
- Troubleshooting
- Credits
- License
DealGraph takes a pitch deck (PDF or text), extracts every verifiable claim, and routes each claim to the right verification engine — graph database, live web search, or LLM judgment. The result is a confidence-weighted investment score, a structured memo, and a voice briefing.
| Agent | Role |
|---|---|
| ClaimExtractor | Pulls every verifiable claim from the pitch deck |
| ClaimRouter | Classifies claims: factual_static, factual_dynamic, qualitative, unverifiable |
| GraphResolver | Verifies static facts against the Memgraph knowledge graph (optional) |
| WebResolver | Verifies dynamic facts via Tavily web search |
| LLMJudge | Assesses qualitative claims using LLM reasoning |
| EvidenceNormalizer | Standardizes all evidence with source, freshness, and confidence |
| DealScorer | Confidence-weighted scoring across 5 dimensions |
| MemoWriter | Generates investment memo + 60-90s voice briefing |
- Works for any company — no pre-seeded data required; web search verifies claims in real time
- Claim-level routing — each claim goes to the right resolver (graph, web, LLM, or flagged)
- Confidence-aware scoring — verified claims count more; contradicted claims reduce scores
- Competitive landscape graph — D3 force-directed visualization of competitors
- Voice deal memos — AI-narrated audio briefings via edge-tts
- CopilotKit chat — conversational follow-up with generative UI cards
- Memgraph optional — graph DB enriches results when connected, not required
- Multi-provider LLM — supports Groq, Ollama, Together.ai, and OpenAI
- Backend: Python + FastAPI + Strands Agents
- Frontend: Next.js 15 + React + Tailwind CSS + CopilotKit
- LLM: Groq (Llama 3.3 70B) / Ollama / Together.ai / OpenAI
- Web Search: Tavily (free tier: 1000 searches/month)
- Graph DB: Memgraph (optional, Bolt-compatible)
- TTS: edge-tts (free Microsoft Edge voices)
- Visualization: D3.js force-directed graph
dealgraph/
├── backend/
│   ├── main.py  # FastAPI app, endpoints, CopilotKit handler
│   ├── model_config.py  # LLM provider factory (Groq/Ollama/Together/OpenAI)
│   ├── requirements.txt
│   ├── agents/
│   │   ├── orchestrator.py  # Pipeline: Extract → Route → Resolve → Normalize → Score → Memo
│   │   ├── claim_extractor.py  # Step 1: Extract claims from pitch deck
│   │   ├── claim_router.py  # Step 2: Classify claims into 4 categories
│   │   ├── llm_judge.py  # Step 3c: Assess qualitative claims
│   │   ├── evidence_normalizer.py  # Step 4: Standardize evidence (source, confidence)
│   │   ├── deal_scorer.py  # Step 5: Confidence-weighted scoring
│   │   ├── memo_writer.py  # Step 6: Investment memo + voice briefing
│   │   └── shared_state.py  # Per-request state isolation (contextvars)
│   └── tools/
│       ├── graph_resolver.py  # Step 3a: Verify against Memgraph
│       ├── web_resolver.py  # Step 3b: Verify via Tavily web search
│       ├── neo4j_tools.py  # Memgraph connection + Cypher queries
│       ├── minimax_tts.py  # edge-tts audio generation
│       └── deck_parser.py  # PDF text extraction
├── frontend/
│   ├── src/
│   │   ├── app/
│   │   │   ├── page.tsx  # Dashboard (upload, analyze, results)
│   │   │   └── chat/page.tsx  # Full-page CopilotKit chat
│   │   ├── components/
│   │   │   ├── cards/  # Shared generative UI cards
│   │   │   │   ├── CompetitorCard.tsx
│   │   │   │   ├── FounderCard.tsx
│   │   │   │   ├── MarketCard.tsx
│   │   │   │   └── DealSummaryCard.tsx
│   │   │   ├── DeckUpload.tsx
│   │   │   ├── DealScorecard.tsx
│   │   │   ├── CompetitiveGraph.tsx  # D3 force-directed graph
│   │   │   ├── ClaimTracker.tsx
│   │   │   ├── DealChat.tsx
│   │   │   └── CopilotPopupChat.tsx
│   │   └── lib/
│   │       ├── types.ts
│   │       ├── api.ts
│   │       └── utils.ts
│   └── package.json
├── docs/
│   └── memgraph-railway-option-b.md  # Memgraph on Railway guide
├── docker-compose.yml  # Memgraph container (optional)
├── .env.example
└── README.md
- Python 3.11+
- Node.js 18+
- Tavily API key (free; available at tavily.com)
- Groq API key (free)
git clone <repository-url>
cd dealgraph
cp .env.example backend/.env
Edit backend/.env and set at minimum:
LLM_PROVIDER=groq
GROQ_API_KEY=gsk_your_key_here
TAVILY_API_KEY=tvly-your_key_here
cd backend
python -m venv venv
# Windows: venv\Scripts\activate
# macOS/Linux: source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000
cd frontend
npm install
npm run dev

| Service | URL |
|---|---|
| Frontend | http://localhost:3000 |
| Backend API | http://localhost:8000 |
| API Documentation | http://localhost:8000/docs |
For graph-backed verification of known companies:
docker compose up memgraph -d
python seed_memgraph.py
Then add to backend/.env:
MEMGRAPH_URI=bolt://localhost:7687
| Variable | Required | Description |
|---|---|---|
| `LLM_PROVIDER` | Yes | groq, ollama, together, or openai |
| `GROQ_API_KEY` | If Groq | Groq Cloud API key |
| `TAVILY_API_KEY` | Yes | Tavily web search API key |
| `MEMGRAPH_URI` | No | Bolt URI for Memgraph (e.g. bolt://localhost:7687) |
| `EDGE_TTS_VOICE` | No | TTS voice (default: en-US-GuyNeural) |
| `NEXT_PUBLIC_API_URL` | No | Backend URL for frontend (default: http://localhost:8000) |
| Telemetry | Optional observability | |
| `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` | No | SigNoz: OTLP endpoint (e.g. https://ingest.us.signoz.cloud) |
| `OTEL_EXPORTER_OTLP_HEADERS` | No | SigNoz: `signoz-ingestion-key=<your-ingestion-key>` |
| `LANGFUSE_SECRET_KEY` | No | Langfuse: LLM tracing (Groq/Together/OpenAI) |
| `LANGFUSE_PUBLIC_KEY` | No | Langfuse: public key |
| `LANGFUSE_BASE_URL` | No | Langfuse server (default: https://cloud.langfuse.com) |
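
The required and conditional variables above can be checked before starting the backend. The following is a minimal sketch, not the project's own startup code: `missing_settings` and its constants are hypothetical names, and only the Groq key is checked per provider because this README does not document key names for the other providers.

```python
# Hypothetical pre-flight check; not part of the DealGraph codebase.
REQUIRED = ("LLM_PROVIDER", "TAVILY_API_KEY")
KNOWN_PROVIDERS = ("groq", "ollama", "together", "openai")
# Only the Groq key name is documented in this README, so other providers
# are deliberately left out of this mapping.
PROVIDER_KEYS = {"groq": "GROQ_API_KEY"}


def missing_settings(env):
    """Return a list of configuration problems; an empty list means none found."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    provider = env.get("LLM_PROVIDER", "").lower()
    if provider and provider not in KNOWN_PROVIDERS:
        problems.append(f"LLM_PROVIDER={provider!r} is not a supported provider")
    key = PROVIDER_KEYS.get(provider)
    if key and not env.get(key):
        problems.append(f"{key} is required when LLM_PROVIDER={provider}")
    return problems
```

Passing `os.environ` (or a dict parsed from backend/.env) to `missing_settings` returns the problems to print before launching uvicorn.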
PitchDeck (PDF/text)
|
v
ClaimExtractor — pulls 5-15 verifiable claims
|
v
ClaimRouter — classifies each claim
|
+-- factual_static --> GraphResolver --> Evidence
| (Memgraph, if connected)
|
+-- factual_dynamic --> WebResolver --> Evidence
| (Tavily live search)
|
+-- qualitative --> LLMJudge --> Evidence
| (LLM reasoning)
|
+-- unverifiable --> Flag
|
v
EvidenceNormalizer — adds: source, freshness, confidence (0.0-1.0)
|
v
DealScorer — confidence-weighted scoring
| Team 30% | Market 25% | Traction 20%
| Competition 15% | Financials 10%
v
MemoWriter — investment memo + 60-90s voice briefing
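
The routing step in the diagram above amounts to a dispatch on the claim category. The sketch below illustrates that idea under stated assumptions: the `Evidence` fields, the resolver stubs, and the `resolve` function are hypothetical stand-ins, not the actual agents in backend/agents or backend/tools. The fallback from graph to web search follows the troubleshooting note that the pipeline uses web search when Memgraph is not configured.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    claim: str
    status: str        # "verified", "unverified", "contradicted", or "flagged"
    source: str        # which resolver produced this evidence
    confidence: float  # 0.0-1.0, as described for EvidenceNormalizer


# Stub resolvers: the real ones would query Memgraph, Tavily, or an LLM.
def graph_resolver(claim):
    return Evidence(claim, "unverified", "graph", 0.0)


def web_resolver(claim):
    return Evidence(claim, "unverified", "web", 0.0)


def llm_judge(claim):
    return Evidence(claim, "unverified", "llm", 0.0)


RESOLVERS = {
    "factual_static": graph_resolver,
    "factual_dynamic": web_resolver,
    "qualitative": llm_judge,
}


def resolve(claim, category, graph_connected=False):
    """Send a classified claim to the resolver for its category."""
    if category == "unverifiable":
        # Flagged claims skip verification; 0.0 here is a placeholder value.
        return Evidence(claim, "flagged", "none", 0.0)
    if category == "factual_static" and not graph_connected:
        # Without Memgraph, static facts fall back to web search.
        category = "factual_dynamic"
    if category not in RESOLVERS:
        raise ValueError(f"unknown claim category: {category!r}")
    return RESOLVERS[category](claim)
```

Keeping the category-to-resolver mapping in one table means a new claim category only needs a new entry rather than another branch.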
| Evidence Status | Confidence | Effect on Score |
|---|---|---|
| Verified | > 0.7 | Full weight |
| Verified | 0.4 - 0.7 | Proportional weight |
| Unverified | Low | Docked 1-2 points for lack of transparency |
| Contradicted | High | Actively reduces score (2-4 range) |
| Flagged | N/A | Noted but not penalized (projections) |
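
The dimension weights and the first two rows of the table can be expressed as a short calculation. This is a sketch under assumptions, not the DealScorer implementation: it assumes each dimension is scored on a 0-10 scale, treats verified evidence below 0.4 confidence as contributing nothing (the table does not cover that case), and does not model the penalties for unverified or contradicted claims.

```python
# Dimension weights as listed in the pipeline diagram.
WEIGHTS = {
    "team": 0.30,
    "market": 0.25,
    "traction": 0.20,
    "competition": 0.15,
    "financials": 0.10,
}


def overall_score(dimension_scores):
    """Weighted sum of per-dimension scores (assumed to share one scale)."""
    missing = set(WEIGHTS) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


def evidence_weight(status, confidence):
    """How much a piece of evidence counts toward its dimension."""
    if status != "verified":
        return 0.0          # penalties for other statuses are not modeled here
    if confidence > 0.7:
        return 1.0          # full weight
    if confidence >= 0.4:
        return confidence   # proportional weight
    return 0.0              # assumption: the table does not define this case
```

For example, scores of 8 for team, 6 for market, 7 for traction, 5 for competition, and 4 for financials combine to 2.4 + 1.5 + 1.4 + 0.75 + 0.4 = 6.45.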
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check (includes memgraph: ok / disabled / error) |
| `/api/analyze` | POST | Run full analysis pipeline |
| `/api/extract-pdf` | POST | Extract text from uploaded PDF |
| `/api/audio/{file}` | GET | Serve generated audio files |
| `/copilotkit` | POST | CopilotKit AG-UI streaming endpoint |
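
A quick way to confirm the backend is up is to call the health endpoint. The sketch below uses only the standard library; it assumes the response is a JSON object with a top-level `memgraph` field, which this README implies but does not spell out. `fetch_health` and `describe_memgraph` are hypothetical helper names. Request bodies for `/api/analyze` are not documented here, so that endpoint is left out; the interactive docs at http://localhost:8000/docs show its schema.

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:8000", timeout=5.0):
    """GET /api/health and return the decoded JSON body."""
    url = f"{base_url.rstrip('/')}/api/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def describe_memgraph(health):
    """Turn the assumed 'memgraph' field into a readable message."""
    state = health.get("memgraph", "unknown")
    messages = {
        "ok": "graph verification available",
        "disabled": "graph verification off; web search is used instead",
        "error": "Memgraph reported an error",
    }
    return messages.get(state, f"unrecognized memgraph state: {state!r}")
```

With the backend running locally, `describe_memgraph(fetch_health())` summarizes whether graph-backed verification is active.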
DealGraph supports two optional observability integrations:
- SigNoz: application performance monitoring and distributed tracing (OpenTelemetry). Set `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` and optionally `OTEL_EXPORTER_OTLP_HEADERS` (e.g. `signoz-ingestion-key=<key>` for SigNoz Cloud). FastAPI requests are auto-instrumented.
- Langfuse: LLM observability (prompts, completions, token usage, latency). Set `LANGFUSE_SECRET_KEY`, `LANGFUSE_PUBLIC_KEY`, and optionally `LANGFUSE_BASE_URL` (e.g. `https://eu.cloud.langfuse.com`). The backend patches the OpenAI-compatible client at startup, so all Groq, Together, and OpenAI calls are traced. Ollama runs are not traced.
Both are no-ops when the corresponding env vars are unset. See .env.example for full variable names.
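
The opt-in behaviour described above can be pictured as a check on the environment. This is an illustrative sketch only: `enabled_integrations` and `parse_otlp_headers` are hypothetical helpers, not functions from the backend. The header format (comma-separated `key=value` pairs) is the standard OpenTelemetry convention for `OTEL_EXPORTER_OTLP_HEADERS`.

```python
def parse_otlp_headers(raw):
    """Parse 'k1=v1,k2=v2' into a dict, skipping empty or malformed pairs."""
    headers = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers


def enabled_integrations(env):
    """List the observability integrations that would be active for this env."""
    enabled = []
    if env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"):
        enabled.append("signoz")
    has_keys = env.get("LANGFUSE_SECRET_KEY") and env.get("LANGFUSE_PUBLIC_KEY")
    # Ollama calls bypass the patched OpenAI-compatible client, so they are not traced.
    if has_keys and env.get("LLM_PROVIDER", "").lower() != "ollama":
        enabled.append("langfuse")
    return enabled
```

An empty environment yields an empty list, matching the no-op behaviour when the variables are unset.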
See the Railway deployment guide for step-by-step instructions, including optional Memgraph and seed-on-start. Key environment variables:
LLM_PROVIDER=groq
GROQ_API_KEY=<your key>
TAVILY_API_KEY=<your key>
CORS_ORIGINS=https://your-frontend.vercel.app
# Optional: Memgraph (same project) — set after adding Memgraph service
MEMGRAPH_URI=bolt://memgraph.railway.internal:7687
- Import the repo on Vercel
- Set root directory to `frontend`
- Add environment variable: `NEXT_PUBLIC_API_URL=https://your-backend.railway.app`
- Deploy
Problem: TAVILY_API_KEY not set — web verification returns empty results
Solution: Get a free key at tavily.com and add it to backend/.env (or to the backend's environment variables in production).
Problem: LLM returns malformed JSON — claims/scores are empty
Solution: Switch to a larger model. Groq's llama-3.3-70b-versatile is recommended for reliable structured output.
Problem: Memgraph connection refused
Solution: Memgraph is optional. Remove MEMGRAPH_URI from backend/.env to skip graph queries entirely. The pipeline will use web search instead.
Problem: CORS errors in production
Solution: Set CORS_ORIGINS=https://your-frontend-domain.com in the backend environment variables.
Problem: Audio not playing
Solution: Ensure edge-tts is installed (pip install edge-tts). The backend generates a fallback audio file if TTS fails.
Problem: SigNoz shows 401 or no traces
Solution: Check OTEL_EXPORTER_OTLP_HEADERS is set to signoz-ingestion-key=<your-ingestion-key> (from SigNoz Cloud → Settings → Ingestion). Use the correct region endpoint (e.g. https://ingest.us.signoz.cloud for US).
Problem: Langfuse shows no data
Solution: Set LANGFUSE_SECRET_KEY and LANGFUSE_PUBLIC_KEY on the backend (e.g. Railway Variables). Use an LLM provider that goes through the patched client (Groq, Together, or OpenAI — not Ollama). Redeploy and run an analysis; check deploy logs for Langfuse: openai module patched for LLM tracing.
DealGraph was built as part of the AWS x Anthropic x Datadog GenAI Hackathon, hosted by B.E.L.L.E Community on Friday, February 20 at the AWS Builder Loft, San Francisco.
Originally built with (hackathon stack): Amazon Bedrock, Strands Agents, Datadog observability (dashboards, LLM Observability, Datadog MCP), Neo4j (graph database), MiniMax for TTS, CopilotKit, and TestSprite (automated tests), per the hackathon's core infrastructure requirements. The agent pipeline and claim-routed verification design came from this build.
Current stack: The codebase has since been extended to support multiple LLM providers (Groq, Ollama, Together.ai, OpenAI), Tavily for web verification, optional Memgraph for the knowledge graph, edge-tts for voice memos, and optional SigNoz / Langfuse for observability, with Strands Agents remaining at the core of the pipeline.
MIT License