Digital Twin for Lung Cancer Treatment Planning
TumorTwin is a hackathon prototype that converts a chest CT scan into:
- a segmented 3D tumor mesh
- similar historical cases from a synthetic cohort
- cohort-level statistics
- an LLM-generated treatment plan
- live clinical trials and literature via browser agents
The system is designed as a modular clinical AI pipeline with a React frontend and FastAPI backend.
Client
↓
FastAPI Backend
├── DICOM Parsing
├── Preprocessing
├── Segmentation
├── 3D Mesh Generation
├── Tumor Encoding
├── Similar Case Retrieval
├── Cohort Statistics
├── Evidence Summary
├── Treatment Plan Generation
└── Browser Agents (Trials + Literature)
↓
In-Memory Session Store
↓
Frontend Polling + Dashboard Rendering
Stages 1–9 run sequentially. Trial and literature search run asynchronously after plan generation.
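The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual runner: the stage names are borrowed from the API endpoint list, the `session` dict layout and the `handlers` mapping are hypothetical, and the `"running"` and `"failed"` status values are assumptions (only `"completed"` appears in the sample status object).

```python
import asyncio
from datetime import datetime, timezone

# Stage names mirror the API endpoints; the runner itself is hypothetical.
SEQUENTIAL_STAGES = [
    "dicom", "preprocess", "segmentation", "mesh", "embedding",
    "similar", "statistics", "summary", "treatment-plan",
]
ASYNC_STAGES = ["trials", "literature"]


def _now() -> str:
    """UTC timestamp in the same format as the sample status object."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


async def run_stage(name, handler, session) -> bool:
    """Run one stage and record its status entry on the session dict."""
    status = {"name": name, "status": "running", "started_at": _now(),
              "finished_at": None, "error": None}
    session["stages"][name] = status
    try:
        session["results"][name] = await handler(session)
        status["status"] = "completed"
    except Exception as exc:  # keep the failure visible to pollers
        status["status"] = "failed"
        status["error"] = str(exc)
    status["finished_at"] = _now()
    return status["status"] == "completed"


async def run_pipeline(session, handlers):
    """Run stages 1-9 in order, then both search agents concurrently."""
    for name in SEQUENTIAL_STAGES:
        if not await run_stage(name, handlers[name], session):
            return session  # stop at the first failed stage
    await asyncio.gather(
        *(run_stage(name, handlers[name], session) for name in ASYNC_STAGES)
    )
    return session
```

Here the two agent stages are awaited with `asyncio.gather` for simplicity. A server would more likely launch them as background tasks so the treatment plan is visible before the searches finish.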
tumortwin/
├── tumortwin-backend/
│ ├── app/
│ │ ├── main.py
│ │ ├── store.py
│ │ ├── fhir_mock.py
│ │ ├── routes/
│ │ ├── pipeline/
│ │ └── agents/
│ ├── data/
│ ├── scripts/
│ └── requirements.txt
│
└── tumortwin-frontend/
├── src/
│ ├── api/
│ ├── views/
│ ├── components/
│ ├── hooks/
│ └── mocks/
└── public/
- API: FastAPI, Uvicorn
- Imaging: pydicom, SimpleITK
- Segmentation / Mesh: scikit-image, trimesh
- Feature Extraction: pyradiomics
- Retrieval: ChromaDB
- LLM: OpenAI / Claude
- Web Search: browser-use
- Accepts a ZIP of .dcm files
- Reads and stacks slices into a 3D volume
- Extracts metadata (age, sex, spacing, institution)
- Converts raw voxels to Hounsfield Units
- Resamples to isotropic spacing
- Clips + normalizes lung window
- Crops to lung region
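The Hounsfield conversion and lung-window steps above could look something like the sketch below. It uses plain Python lists to stay dependency-free; a real pipeline would almost certainly vectorize this over the full volume. The linear rescale follows the DICOM RescaleSlope / RescaleIntercept convention, while the default intercept and the window of center -600 HU and width 1500 HU are assumed, commonly used values rather than settings taken from this project.

```python
def to_hounsfield(raw_values, slope=1.0, intercept=-1024.0):
    """Apply the DICOM linear rescale: HU = raw * slope + intercept."""
    return [v * slope + intercept for v in raw_values]


def window_normalize(hu_values, center=-600.0, width=1500.0):
    """Clip to [center - width/2, center + width/2], then scale to [0, 1]."""
    lo = center - width / 2.0
    hi = center + width / 2.0
    return [(min(max(v, lo), hi) - lo) / (hi - lo) for v in hu_values]


# Example: three raw voxel values from a scanner using the assumed defaults.
hu = to_hounsfield([0, 1024, 2024])
normalized = window_normalize(hu)
```

With these defaults the window spans -1350 HU to 150 HU, so air-like voxels map near 0 and soft tissue or denser material saturates at 1.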
- Primary: segmentation model
- Fallback: ROI / bounding-box flow
- Produces binary tumor mask
- Runs marching cubes on tumor mask
- Smooths + decimates mesh
- Exports .glb for frontend rendering
- Computes tumor geometry metrics
- Extracts radiomics features
- L2-normalizes embedding vector
- Queries ChromaDB using cosine similarity
- Returns top-K synthetic historical cases
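To make the retrieval step concrete, here is a brute-force sketch of L2 normalization followed by cosine-similarity ranking. It is a stand-in for the ChromaDB query, not a use of the ChromaDB API, and the function names and the `cases` mapping are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_similar(query, cases, k=5):
    """Return up to k (case_id, similarity) pairs, most similar first."""
    q = l2_normalize(query)
    scored = []
    for case_id, vec in cases.items():
        v = l2_normalize(vec)
        scored.append((case_id, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D embeddings; real radiomics vectors would be much longer.
cases = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
matches = top_k_similar([2.0, 0.0], cases, k=2)
```

Because both sides are unit-length, the dot product is the cosine similarity, which is why normalizing the embedding before querying matters.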
- Computes weighted distributions across retrieved cases:
- histology
- mutations
- stage
- treatment pathways
- outcomes
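One plausible reading of "weighted distributions" is that each retrieved case contributes its similarity score as a weight. The source does not say what the weights are, so treat that choice, along with the `similarity` key and the field names below, as assumptions made for illustration.

```python
from collections import defaultdict


def weighted_distribution(cases, field):
    """Share of total similarity weight held by each value of `field`."""
    totals = defaultdict(float)
    for case in cases:
        totals[case[field]] += case["similarity"]
    weight_sum = sum(totals.values())
    if weight_sum == 0:
        return {}
    return {value: weight / weight_sum for value, weight in totals.items()}


# Illustrative retrieved cases; values are made up for the example.
retrieved = [
    {"histology": "adenocarcinoma", "similarity": 0.9},
    {"histology": "squamous cell carcinoma", "similarity": 0.6},
    {"histology": "adenocarcinoma", "similarity": 0.5},
]
histology_dist = weighted_distribution(retrieved, "histology")
```

The same helper could be called once per field (histology, mutations, stage, and so on) to build the full cohort summary.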
- LLM summarizes retrieved cohort evidence
- Uses only retrieved case data
- LLM generates structured treatment recommendation:
- first-line therapy
- alternatives
- biomarker suggestions
- follow-up plan
- Browser agents search:
- ClinicalTrials.gov
- PubMed
- Returns structured, display-ready results
- Cached fallback supported
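The cached fallback mentioned above might be wired along these lines. This is only a sketch: `live_search` stands for whatever callable wraps the browser agent, `cache` is a plain dict keyed by query, and the `source` field in the return value is a hypothetical addition so the caller can tell which path was taken.

```python
def search_with_fallback(live_search, cache, query):
    """Try the live agent first; fall back to cached results on failure."""
    try:
        results = live_search(query)
    except Exception:
        results = None  # any agent failure falls through to the cache
    if results:
        cache[query] = results  # refresh the cache on every live success
        return {"source": "live", "results": results}
    return {"source": "cache", "results": cache.get(query, [])}
```

An empty live result is treated the same as a failure here, which keeps the dashboard populated but may hide a legitimately empty search; whether that trade-off is right depends on the demo setting.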
- Framework: React + Vite
- Styling: Tailwind CSS
- 3D Rendering: Three.js / React Three Fiber
- Charts: Recharts
- State: Zustand
- HTTP: Axios
- Upload CT ZIP
- Optional MRN input
- Starts new session
- Polls backend stage status
- Shows stage-by-stage progress
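The polling behaviour can be illustrated with a small loop such as the one below, written as a script-style client rather than browser code. It assumes the status endpoint returns a list of stage objects, each with a `status` field, and that `"completed"` and `"failed"` are the terminal values; neither the response shape nor the `"failed"` value is confirmed by the source. `fetch_status` is a hypothetical callable that performs the HTTP request.

```python
import time

# Assumed terminal states; only "completed" appears in the sample object.
TERMINAL = {"completed", "failed"}


def poll_until_done(fetch_status, interval_s=1.0, timeout_s=300.0,
                    sleep=time.sleep):
    """Call fetch_status() until every stage is terminal or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        stages = fetch_status()
        if stages and all(s["status"] in TERMINAL for s in stages):
            return stages
        if time.monotonic() >= deadline:
            raise TimeoutError("pipeline did not finish in time")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.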
- 3D tumor viewer
- Similar case cards
- Cohort stats
- Evidence summary
- Treatment plan
- Clinical trials
- Supporting literature
Base path: /api/v1
POST /sessions
GET /sessions/{id}/status
GET /sessions/{id}/dicom
GET /sessions/{id}/fhir
GET /sessions/{id}/preprocess
GET /sessions/{id}/segmentation
GET /sessions/{id}/mesh
GET /sessions/{id}/embedding
GET /sessions/{id}/similar
GET /sessions/{id}/statistics
GET /sessions/{id}/summary
GET /sessions/{id}/treatment-plan
GET /sessions/{id}/trials
GET /sessions/{id}/literature
GET /mesh/{session_id}.glb
{
  "name": "segmentation",
  "status": "completed",
  "started_at": "2026-04-05T10:01:23Z",
  "finished_at": "2026-04-05T10:01:45Z",
  "error": null
}
TumorTwin uses a seeded synthetic lung cancer casebase for retrieval.
Each case includes:
- demographics
- histology
- mutation profile
- TNM stage
- treatment history
- survival / response outcome
- embedding vector
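A record with the fields listed above could be generated reproducibly along the following lines. This is not the contents of scripts/seed_cases.py: every key name, category list, and value range here is hypothetical, and using a fixed random seed is an assumption about how the casebase might be kept reproducible.

```python
import math
import random

# Illustrative category lists; the real casebase may use different values.
HISTOLOGIES = ["adenocarcinoma", "squamous cell carcinoma", "small cell"]
MUTATIONS = ["EGFR", "KRAS", "ALK", "none"]
STAGES = ["I", "II", "III", "IV"]
TREATMENTS = ["surgery", "chemoradiation", "targeted therapy", "immunotherapy"]
RESPONSES = ["complete", "partial", "stable", "progressive"]


def make_case(rng, index, dim=8):
    """Build one synthetic case with a unit-length embedding vector."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return {
        "case_id": f"SYN-{index:04d}",
        "demographics": {"age": rng.randint(40, 85),
                         "sex": rng.choice(["F", "M"])},
        "histology": rng.choice(HISTOLOGIES),
        "mutation_profile": rng.choice(MUTATIONS),
        "tnm_stage": rng.choice(STAGES),
        "treatment_history": [rng.choice(TREATMENTS)],
        "outcome": {"response": rng.choice(RESPONSES),
                    "survival_months": rng.randint(3, 60)},
        "embedding": [x / norm for x in vec],
    }


def generate_cases(n, seed=0):
    """Same seed, same cases: a dedicated Random keeps runs reproducible."""
    rng = random.Random(seed)
    return [make_case(rng, i) for i in range(n)]
```

Using a dedicated `random.Random(seed)` instance rather than the module-level functions avoids interference from other code that touches the global random state.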
Synthetic cases are stored in:
data/synthetic_cases.json
Generate with:
python scripts/seed_cases.py
cd tumortwin-backend
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
playwright install
python scripts/seed_cases.py
uvicorn app.main:app --reload --port 8000
cd tumortwin-frontend
npm install
npm run dev
The frontend can be developed against mocked API responses in src/mocks/ and later switched to the live backend without component changes.
- In-memory session store for simplicity and speed
- ChromaDB in-process to avoid infra overhead
- Radiomics features instead of GPU-heavy learned embeddings
- GLB mesh output for fast browser rendering
- Mock-first frontend integration for parallel development
- Browser-based retrieval for real-time trial / literature enrichment
- Srinivas Sriram: Backend
- Krish Kankure: Backend
- Aaditya Pillai: Frontend
- Varun Sinha: Frontend