DealGraph

AI-powered due diligence copilot for venture capital investors. Upload any startup's pitch deck and get a confidence-scored investment analysis with verified claims, competitive intelligence, and a voice-narrated deal memo.

Overview

DealGraph takes a pitch deck (PDF or text), extracts every verifiable claim, and routes each claim to the right verification engine — graph database, live web search, or LLM judgment. The result is a confidence-weighted investment score, a structured memo, and a voice briefing.

Agent	Role
ClaimExtractor	Pulls every verifiable claim from the pitch deck
ClaimRouter	Classifies claims: factual_static, factual_dynamic, qualitative, unverifiable
GraphResolver	Verifies static facts against the Memgraph knowledge graph (optional)
WebResolver	Verifies dynamic facts via Tavily web search
LLMJudge	Assesses qualitative claims using LLM reasoning
EvidenceNormalizer	Standardizes all evidence with source, freshness, and confidence
DealScorer	Confidence-weighted scoring across 5 dimensions
MemoWriter	Generates investment memo + 60-90s voice briefing

Key Features

Works for any company — no pre-seeded data required; web search verifies claims in real time
Claim-level routing — each claim goes to the right resolver (graph, web, LLM, or flagged)
Confidence-aware scoring — verified claims count more; contradicted claims reduce scores
Competitive landscape graph — D3 force-directed visualization of competitors
Voice deal memos — AI-narrated audio briefings via edge-tts
CopilotKit chat — conversational follow-up with generative UI cards
Memgraph optional — graph DB enriches results when connected, not required
Multi-provider LLM — supports Groq, Ollama, Together.ai, and OpenAI

Tech Stack

Backend: Python + FastAPI + Strands Agents
Frontend: Next.js 15 + React + Tailwind CSS + CopilotKit
LLM: Groq (Llama 3.3 70B) / Ollama / Together.ai / OpenAI
Web Search: Tavily (free tier: 1000 searches/month)
Graph DB: Memgraph (optional, Bolt-compatible)
TTS: edge-tts (free Microsoft Edge voices)
Visualization: D3.js force-directed graph

Project Structure

dealgraph/
├── backend/
│   ├── main.py                         # FastAPI app, endpoints, CopilotKit handler
│   ├── model_config.py                 # LLM provider factory (Groq/Ollama/Together/OpenAI)
│   ├── requirements.txt
│   ├── agents/
│   │   ├── orchestrator.py             # Pipeline: Extract → Route → Resolve → Normalize → Score → Memo
│   │   ├── claim_extractor.py          # Step 1: Extract claims from pitch deck
│   │   ├── claim_router.py             # Step 2: Classify claims into 4 categories
│   │   ├── llm_judge.py               # Step 3c: Assess qualitative claims
│   │   ├── evidence_normalizer.py      # Step 4: Standardize evidence (source, confidence)
│   │   ├── deal_scorer.py              # Step 5: Confidence-weighted scoring
│   │   ├── memo_writer.py             # Step 6: Investment memo + voice briefing
│   │   └── shared_state.py            # Per-request state isolation (contextvars)
│   └── tools/
│       ├── graph_resolver.py           # Step 3a: Verify against Memgraph
│       ├── web_resolver.py             # Step 3b: Verify via Tavily web search
│       ├── neo4j_tools.py              # Memgraph connection + Cypher queries
│       ├── minimax_tts.py              # edge-tts audio generation
│       └── deck_parser.py              # PDF text extraction
├── frontend/
│   ├── src/
│   │   ├── app/
│   │   │   ├── page.tsx                # Dashboard (upload, analyze, results)
│   │   │   └── chat/page.tsx           # Full-page CopilotKit chat
│   │   ├── components/
│   │   │   ├── cards/                  # Shared generative UI cards
│   │   │   │   ├── CompetitorCard.tsx
│   │   │   │   ├── FounderCard.tsx
│   │   │   │   ├── MarketCard.tsx
│   │   │   │   └── DealSummaryCard.tsx
│   │   │   ├── DeckUpload.tsx
│   │   │   ├── DealScorecard.tsx
│   │   │   ├── CompetitiveGraph.tsx    # D3 force-directed graph
│   │   │   ├── ClaimTracker.tsx
│   │   │   ├── DealChat.tsx
│   │   │   └── CopilotPopupChat.tsx
│   │   └── lib/
│   │       ├── types.ts
│   │       ├── api.ts
│   │       └── utils.ts
│   └── package.json
├── docs/
│   └── memgraph-railway-option-b.md    # Memgraph on Railway guide
├── docker-compose.yml                  # Memgraph container (optional)
├── .env.example
└── README.md

Setup Instructions

Prerequisites

Python 3.11+ — Download
Node.js 18+ — Download
Tavily API key (free) — Get key
Groq API key (free) — Get key

Step 1: Clone and Configure

git clone <repository-url>
cd dealgraph
cp .env.example backend/.env

Edit backend/.env — set at minimum:

LLM_PROVIDER=groq
GROQ_API_KEY=gsk_your_key_here
TAVILY_API_KEY=tvly-your_key_here

Step 2: Backend

cd backend
python -m venv venv
# Windows: venv\Scripts\activate
# macOS/Linux: source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000

Step 3: Frontend

cd frontend
npm install
npm run dev

Step 4: Access

Service	URL
Frontend	http://localhost:3000
Backend API	http://localhost:8000
API Documentation	http://localhost:8000/docs

Optional: Memgraph

For graph-backed verification of known companies:

docker compose up memgraph -d
python seed_memgraph.py

Then add to backend/.env:

MEMGRAPH_URI=bolt://localhost:7687

Environment Variables

Variable	Required	Description
`LLM_PROVIDER`	Yes	`groq`, `ollama`, `together`, or `openai`
`GROQ_API_KEY`	If Groq	Groq Cloud API key
`TAVILY_API_KEY`	Yes	Tavily web search API key
`MEMGRAPH_URI`	No	Bolt URI for Memgraph (e.g. `bolt://localhost:7687`)
`EDGE_TTS_VOICE`	No	TTS voice (default: `en-US-GuyNeural`)
`NEXT_PUBLIC_API_URL`	No	Backend URL for frontend (default: `http://localhost:8000`)
Telemetry		Optional observability
`OTEL_EXPORTER_OTLP_TRACES_ENDPOINT`	No	SigNoz: OTLP endpoint (e.g. `https://ingest.us.signoz.cloud`)
`OTEL_EXPORTER_OTLP_HEADERS`	No	SigNoz: `signoz-ingestion-key=<your-ingestion-key>`
`LANGFUSE_SECRET_KEY`	No	Langfuse: LLM tracing (Groq/Together/OpenAI)
`LANGFUSE_PUBLIC_KEY`	No	Langfuse: public key
`LANGFUSE_BASE_URL`	No	Langfuse server (default: `https://cloud.langfuse.com`)

How It Works

PitchDeck (PDF/text)
       |
       v
 ClaimExtractor         — pulls 5-15 verifiable claims
       |
       v
   ClaimRouter           — classifies each claim
       |
       +-- factual_static  --> GraphResolver --> Evidence
       |                        (Memgraph, if connected)
       |
       +-- factual_dynamic --> WebResolver   --> Evidence
       |                        (Tavily live search)
       |
       +-- qualitative     --> LLMJudge     --> Evidence
       |                        (LLM reasoning)
       |
       +-- unverifiable    --> Flag
       |
       v
 EvidenceNormalizer      — adds: source, freshness, confidence (0.0-1.0)
       |
       v
    DealScorer           — confidence-weighted scoring
       |                    Team 30% | Market 25% | Traction 20%
       |                    Competition 15% | Financials 10%
       v
    MemoWriter           — investment memo + 60-90s voice briefing

Scoring Logic

Evidence Status	Confidence	Effect on Score
Verified	> 0.7	Full weight
Verified	0.4 - 0.7	Proportional weight
Unverified	Low	Docked 1-2 points for lack of transparency
Contradicted	High	Actively reduces score (2-4 range)
Flagged	N/A	Noted but not penalized (projections)

API Endpoints

Endpoint	Method	Description
`/api/health`	GET	Health check (includes `memgraph`: ok / disabled / error)
`/api/analyze`	POST	Run full analysis pipeline
`/api/extract-pdf`	POST	Extract text from uploaded PDF
`/api/audio/{file}`	GET	Serve generated audio files
`/copilotkit`	POST	CopilotKit AG-UI streaming endpoint

Telemetry (optional)

DealGraph supports two optional observability integrations:

SigNoz — Application performance and distributed tracing (OpenTelemetry). Set OTEL_EXPORTER_OTLP_TRACES_ENDPOINT and optionally OTEL_EXPORTER_OTLP_HEADERS (e.g. signoz-ingestion-key=<key> for SigNoz Cloud). FastAPI requests are auto-instrumented.
Langfuse — LLM observability (prompts, completions, token usage, latency). Set LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, and optionally LANGFUSE_BASE_URL (e.g. https://eu.cloud.langfuse.com). The backend patches the OpenAI-compatible client at startup, so all Groq, Together, and OpenAI calls are traced. Ollama runs are not traced.

Both are no-ops when the corresponding env vars are unset. See .env.example for full variable names.

Deployment

Backend on Railway

See the Railway deployment guide for step-by-step instructions, including optional Memgraph and seed-on-start. Key environment variables:

LLM_PROVIDER=groq
GROQ_API_KEY=<your key>
TAVILY_API_KEY=<your key>
CORS_ORIGINS=https://your-frontend.vercel.app
# Optional: Memgraph (same project) — set after adding Memgraph service
MEMGRAPH_URI=bolt://memgraph.railway.internal:7687

Frontend on Vercel

Import the repo on Vercel
Set root directory to frontend
Add environment variable: NEXT_PUBLIC_API_URL=https://your-backend.railway.app
Deploy

Troubleshooting

Problem: TAVILY_API_KEY not set — web verification returns empty results

Solution: Get a free key at tavily.com and add it to .env

Problem: LLM returns malformed JSON — claims/scores are empty

Solution: Switch to a larger model. Groq's llama-3.3-70b-versatile is recommended for reliable structured output.

Problem: Memgraph connection refused

Solution: Memgraph is optional. Remove MEMGRAPH_URI from .env to skip graph queries entirely. The pipeline will use web search instead.

Problem: CORS errors in production

Solution: Set CORS_ORIGINS=https://your-frontend-domain.com in the backend environment variables.

Problem: Audio not playing

Solution: Ensure edge-tts is installed (pip install edge-tts). The backend generates a fallback audio file if TTS fails.

Problem: SigNoz shows 401 or no traces

Solution: Check OTEL_EXPORTER_OTLP_HEADERS is set to signoz-ingestion-key=<your-ingestion-key> (from SigNoz Cloud → Settings → Ingestion). Use the correct region endpoint (e.g. https://ingest.us.signoz.cloud for US).

Problem: Langfuse shows no data

Solution: Set LANGFUSE_SECRET_KEY and LANGFUSE_PUBLIC_KEY on the backend (e.g. Railway Variables). Use an LLM provider that goes through the patched client (Groq, Together, or OpenAI — not Ollama). Redeploy and run an analysis; check deploy logs for Langfuse: openai module patched for LLM tracing.

Credits

DealGraph was built as part of the AWS x Anthropic x Datadog GenAI Hackathon, hosted by B.E.L.L.E Community on Friday, February 20 at the AWS Builder Loft, San Francisco.

Originally built with (hackathon stack): Amazon Bedrock, Strands Agents, Datadog observability (dashboards, LLM Observability, Datadog MCP), Neo4j (graph database), MiniMax for TTS, CopilotKit and TestSprite (automated tests) per the hackathon’s core infrastructure requirements. The agent pipeline and claim-routed verification design came from this build.

Current stack: The codebase has since been extended to support multiple LLM providers (Groq, Ollama, Together.ai, OpenAI), Tavily for web verification, optional Memgraph for the knowledge graph, edge-tts for voice memos, and optional SigNoz / Langfuse for observability, with Strands Agents remaining at the core of the pipeline.

License

MIT License

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DealGraph

Table of Contents

Overview

Key Features

Tech Stack

Project Structure

Setup Instructions

Prerequisites

Step 1: Clone and Configure

Step 2: Backend

Step 3: Frontend

Step 4: Access

Optional: Memgraph

Environment Variables

How It Works

Scoring Logic

API Endpoints

Telemetry (optional)

Deployment

Backend on Railway

Frontend on Vercel

Troubleshooting

Credits

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
backend		backend
docs		docs
frontend		frontend
memgraph		memgraph
shared		shared
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
docker-compose.yml		docker-compose.yml

Folders and files

Latest commit

History

Repository files navigation

DealGraph

Table of Contents

Overview

Key Features

Tech Stack

Project Structure

Setup Instructions

Prerequisites

Step 1: Clone and Configure

Step 2: Backend

Step 3: Frontend

Step 4: Access

Optional: Memgraph

Environment Variables

How It Works

Scoring Logic

API Endpoints

Telemetry (optional)

Deployment

Backend on Railway

Frontend on Vercel

Troubleshooting

Credits

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages