Inspiration
Intake screening is where a law firm wins or loses before a single brief is filed. Every call must be checked for conflicts, statute-of-limitations risk, and economic viability, often under time pressure and with incomplete information. Today that work is manual, inconsistent, and buried in unstructured notes.
We built Donna because intake screeners do repeatable, high-stakes work that AI can accelerate, but the final call must stay with an attorney. Donna automates everything up to accept/reject, then pauses for human review. No legal advice to callers, no autonomous decisions.
What it does
Donna turns an intake call into a structured screening recommendation:
- Capture: guided intake chat or pasted transcript
- Screen: a LangGraph pipeline extracts facts, checks gates, and scores the matter
- Review: an attorney sees the recommendation, rationale, comparables, and source transcript, then takes or passes (with optional override)
Hard conflicts and expired SOL short-circuit early, because there is no point valuing a matter the firm cannot ethically or legally take. Everything else flows through comparable-matter retrieval and expected-value scoring before pausing at human_review.
How we built it
Backend: LangGraph + Claude
Frontend: Next.js review UI
The web/ app mirrors the pipeline API shape (start_intake / resume_with_decision) with a three-phase flow: Intake → Processing → Attorney Review. The review dashboard surfaces legal gates, EV breakdown, comparable matters, and the source transcript side-by-side so every recommendation is auditable.
Closing the loop (offline)
Profitability arrives months after intake, so Donna does not learn online. A batch reconcile step compares predicted vs. realized economics per cohort. Recalibration stays as a human decision.
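The batch reconcile step could look roughly like the sketch below. It is illustrative only: the `ClosedMatter` record, the cohort key format, and the `reconcile` function are hypothetical names, and the report only summarizes the gap between predicted and realized value. It does not adjust any threshold, since recalibration is left to a person.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ClosedMatter:
    """A matter whose economics are now known (hypothetical record shape)."""
    cohort: str            # e.g. "auto/CA"; the key format is an assumption
    predicted_ev: float    # expected value recorded at intake
    realized_value: float  # net value observed after the matter closed


def reconcile(matters: list[ClosedMatter]) -> dict[str, dict[str, float]]:
    """Summarize predicted vs. realized economics for each cohort."""
    by_cohort: dict[str, list[ClosedMatter]] = defaultdict(list)
    for matter in matters:
        by_cohort[matter.cohort].append(matter)

    report: dict[str, dict[str, float]] = {}
    for cohort, rows in by_cohort.items():
        predicted = mean(r.predicted_ev for r in rows)
        realized = mean(r.realized_value for r in rows)
        report[cohort] = {
            "n": len(rows),
            "mean_predicted": predicted,
            "mean_realized": realized,
            # Positive bias means intake was too optimistic for this cohort.
            "bias": predicted - realized,
        }
    return report
```

Because the output is a plain report, a partner can read the per-cohort bias and decide whether a threshold should move.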
Challenges we ran into
Drawing the UPL (unauthorized practice of law) line. The LLM extracts facts and explains reasoning; it never gives callers legal advice and never makes the final accept/reject call. Legally load-bearing checks (conflicts, SOL, EV threshold) are deterministic and inspectable.
When to stop vs. when to score. A hard conflict or expired SOL must halt before valuation — but borderline SOL cases need a human even if EV looks good. Routing logic in the graph had to encode that cleanly.
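A minimal sketch of that routing, written as plain functions rather than actual graph wiring. Only `human_review` is a name from the project; `retrieve_comparables`, the `Sol` states, and the `needs_attorney` label are assumptions made for illustration, as is the choice to send short-circuited matters straight to review.

```python
from dataclasses import dataclass
from enum import Enum


class Sol(Enum):
    CLEAR = "clear"
    BORDERLINE = "borderline"
    EXPIRED = "expired"


@dataclass(frozen=True)
class Gates:
    hard_conflict: bool
    sol: Sol


def next_node(gates: Gates) -> str:
    """Pick the node that follows the gate checks."""
    if gates.hard_conflict or gates.sol is Sol.EXPIRED:
        # Short-circuit: skip valuation entirely.
        return "human_review"
    return "retrieve_comparables"  # hypothetical node name


def recommendation(gates: Gates, ev: float, ev_threshold: float) -> str:
    """Deterministic recommendation shown to the attorney."""
    if gates.hard_conflict or gates.sol is Sol.EXPIRED:
        return "pass"
    if gates.sol is Sol.BORDERLINE:
        # A strong EV must not hide limitations risk from the reviewer.
        return "needs_attorney"
    return "take" if ev >= ev_threshold else "pass"
```

Keeping the stop decision (`next_node`) separate from the scoring decision (`recommendation`) means a borderline SOL case is still valued but is never presented as a clean accept.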
Entity matching without embeddings. Conflicts are too important to fuzzy-match with vectors. We built normalized substring + token-superset matching so "Meridian Health Systems" in a transcript hits an indexed client reliably.
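One way to sketch normalized substring plus token-superset matching is shown below. The normalization rules, the list of ignored corporate suffixes, and the function names are assumptions for illustration; the project's actual matcher may differ.

```python
import re

# Assumed set of corporate suffixes to ignore when comparing names.
_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "company", "corporation"}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def name_tokens(name: str) -> frozenset[str]:
    """Distinct name tokens with corporate suffixes removed."""
    return frozenset(t for t in normalize(name).split() if t not in _SUFFIXES)


def matches(mention: str, client: str) -> bool:
    """True if a transcript mention plausibly refers to an indexed client."""
    m, c = normalize(mention), normalize(client)
    if not m or not c:
        return False
    # Substring check, padded so that only whole words match.
    if f" {c} " in f" {m} ":
        return True
    # Token-superset check: the mention contains every client token.
    mention_tokens, client_tokens = name_tokens(mention), name_tokens(client)
    return bool(client_tokens) and mention_tokens >= client_tokens


def find_conflicts(mentions: list[str], clients: list[str]) -> list[tuple[str, str]]:
    """All (mention, client) pairs that match."""
    return [(m, c) for m in mentions for c in clients if matches(m, c)]
```

For example, `matches("Meridian Health Systems", "Meridian Health Systems, Inc.")` succeeds through the token check even though the suffix prevents a direct substring hit.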
Sparse comparable cohorts. k-NN over (claim_type, jurisdiction) sometimes returns thin cohorts; the retriever widens automatically, and an empty index fails loud rather than fabricating numbers.
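The widening and fail-loud behaviour can be illustrated with the simplified stand-in below. It uses exact filters instead of nearest-neighbour search, and the `Matter` shape, `min_size` parameter, tier labels, and `EmptyIndexError` are all hypothetical.

```python
from dataclasses import dataclass


class EmptyIndexError(RuntimeError):
    """Raised when there are no historical matters to compare against."""


@dataclass(frozen=True)
class Matter:
    claim_type: str
    jurisdiction: str
    realized_fee: float


def retrieve_cohort(
    index: list[Matter], claim_type: str, jurisdiction: str, min_size: int = 3
) -> tuple[str, list[Matter]]:
    """Return (tier_label, cohort), widening until the cohort is big enough."""
    if not index:
        # Fail loud: never estimate from nothing.
        raise EmptyIndexError("matter index is empty; refusing to estimate")

    tiers = [
        ("claim_type+jurisdiction",
         lambda m: m.claim_type == claim_type and m.jurisdiction == jurisdiction),
        ("claim_type", lambda m: m.claim_type == claim_type),
        ("all", lambda m: True),
    ]
    for label, keep in tiers:
        cohort = [m for m in index if keep(m)]
        if len(cohort) >= min_size:
            return label, cohort

    # The whole index is smaller than min_size: return it, clearly labelled.
    return "all_thin", list(index)
```

Returning the tier label alongside the cohort lets the review UI show how far the search had to widen, so a thin or loosely matched cohort is visible to the attorney.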
Delayed labels. You cannot train on profitability at intake time. We designed for offline reconciliation instead of pretending the model can self-correct in production.
Stateful human-in-the-loop. LangGraph's interrupt / Command(resume=...) pattern required careful API design so the frontend, CLI, and future HTTP layer all share the same pause-resume contract.
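The shared pause-resume contract can be sketched without LangGraph at all. The class below is a library-free illustration, not the project's code: it keeps the start_intake / resume_with_decision names, while the `IntakeSession` class, the in-memory store, the returned dictionary fields, and the injected `screen` callable are assumptions.

```python
import uuid
from typing import Callable


class IntakeSession:
    """In-memory stand-in for a paused, resumable screening run."""

    def __init__(self, screen: Callable[[str], str]) -> None:
        # `screen` stands in for the screening pipeline; it returns "take" or "pass".
        self._screen = screen
        self._paused: dict[str, str] = {}  # thread_id -> recommendation

    def start_intake(self, transcript: str) -> dict:
        """Run screening, then pause and wait for an attorney."""
        if not transcript.strip():
            raise ValueError("transcript is empty")
        thread_id = uuid.uuid4().hex
        recommendation = self._screen(transcript)
        self._paused[thread_id] = recommendation
        return {
            "thread_id": thread_id,
            "status": "awaiting_review",
            "recommendation": recommendation,
        }

    def resume_with_decision(self, thread_id: str, decision: str) -> dict:
        """Record the attorney's decision for a paused intake."""
        if decision not in ("take", "pass"):
            raise ValueError(f"unknown decision: {decision!r}")
        if thread_id not in self._paused:
            raise KeyError(f"no paused intake for thread {thread_id!r}")
        recommendation = self._paused.pop(thread_id)
        return {
            "thread_id": thread_id,
            "status": "decided",
            "decision": decision,
            "overridden": decision != recommendation,
        }
```

Any caller, whether UI, CLI, or HTTP handler, only needs the `thread_id` returned by the first call to resume the same intake later.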
Accomplishments that we're proud of
- A complete intake screening pipeline from transcript to attorney decision, with sample cases for clean accept, hard conflict, and expired SOL paths
- Hybrid architecture: AI for language, code for math and legal gates
- A review UI that makes every recommendation traceable: transcript, gates, EV components, and comparables in one view
- An honest learning design: predicted vs. realized reconciliation without unsafe online feedback loops
- Guided intake mode that compiles a structured, auditable transcript before screening even begins
What we learned
Intake AI is not a chatbot problem; it is a workflow orchestration problem with strict accountability requirements. The highest-leverage design choice was separating explanation (LLM) from decision inputs (deterministic code). Attorneys trust Donna when they can see exactly why a matter scored the way it did.
We also learned that fail-loud beats fail-soft in legal tooling: missing SOL data or an empty matter index should raise an error, not silently guess. And in contingency-fee personal injury (PI) work, firm economics are naturally expressed as expected value over a cohort, so comparables-based EV is more defensible than asking a model to invent a dollar figure.
What's next for Donna
- Wire the live backend: HTTP layer around agent/pipeline.py, replacing mock data in web/src/lib/api/intake.ts
- Durable graph state: swap in SqliteSaver so intakes survive process restarts and can be served from an API
- Production conflict index: integrate with firm CRM / matter-management systems instead of a static JSON seed
- Expand SOL coverage: more jurisdictions, discovery-date accrual rules, and maintained rule tables
- Cohort calibration dashboard: surface trainer reconciliation output so partners can adjust EV thresholds with data, not gut feel
- Real-time intake: stream guided intake and pipeline progress for live call-center use
Built With
- api
- chromadb
- css
- langgraph
- next.js
- openai
- pydantic
- python
- react
- sentence-transformers
- tailwind
- typescript