TumorTwin 🫁

Digital Twin for Lung Cancer Treatment Planning

TumorTwin is a hackathon prototype that converts a chest CT scan into:

a segmented 3D tumor mesh
similar historical cases from a synthetic cohort
cohort-level statistics
an LLM-generated treatment plan
live clinical trials and literature via browser agents

The system is designed as a modular clinical AI pipeline with a React frontend and FastAPI backend.

Architecture

Client
  ↓
FastAPI Backend
  ├── DICOM Parsing
  ├── Preprocessing
  ├── Segmentation
  ├── 3D Mesh Generation
  ├── Tumor Encoding
  ├── Similar Case Retrieval
  ├── Cohort Statistics
  ├── Evidence Summary
  ├── Treatment Plan Generation
  └── Browser Agents (Trials + Literature)
  ↓
In-Memory Session Store
  ↓
Frontend Polling + Dashboard Rendering

Stages 1–9 run sequentially. The trial and literature searches (stage 10) run asynchronously after plan generation.
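
The ordering described above can be sketched with asyncio. This is a minimal illustration, not the project's orchestration code: the stage names, the `run_stage` placeholder, and the dict-based session are all assumptions made for the example.

```python
import asyncio

# Hypothetical stage names mirroring the diagram; the real modules may differ.
SEQUENTIAL_STAGES = [
    "dicom", "preprocess", "segmentation", "mesh", "embedding",
    "similar", "statistics", "summary", "treatment_plan",
]


async def run_stage(name: str, session: dict) -> None:
    """Placeholder for real stage work; records the stage status in the session."""
    session[name] = "running"
    await asyncio.sleep(0)  # stand-in for actual processing
    session[name] = "completed"


async def run_pipeline(session: dict) -> None:
    # Stages 1-9: each stage starts only after the previous one has finished.
    for name in SEQUENTIAL_STAGES:
        await run_stage(name, session)
    # Stage 10: the two searches run concurrently once the plan exists.
    await asyncio.gather(
        run_stage("trials", session),
        run_stage("literature", session),
    )


def run_demo() -> dict:
    """Run the sketch end to end and return the final stage statuses."""
    session: dict = {}
    asyncio.run(run_pipeline(session))
    return session
```

In this sketch the caller waits for the searches to finish. A backend that returns the plan first and lets the searches complete in the background would schedule them as tasks instead of awaiting them.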

Repository Structure

tumortwin/
├── tumortwin-backend/
│   ├── app/
│   │   ├── main.py
│   │   ├── store.py
│   │   ├── fhir_mock.py
│   │   ├── routes/
│   │   ├── pipeline/
│   │   └── agents/
│   ├── data/
│   ├── scripts/
│   └── requirements.txt
│
└── tumortwin-frontend/
    ├── src/
    │   ├── api/
    │   ├── views/
    │   ├── components/
    │   ├── hooks/
    │   └── mocks/
    └── public/

Backend

Stack

API: FastAPI, Uvicorn
Imaging: pydicom, SimpleITK
Segmentation / Mesh: scikit-image, trimesh
Feature Extraction: pyradiomics
Retrieval: ChromaDB
LLM: OpenAI / Claude
Web Search: browser-use

Pipeline

1. DICOM Parsing

Accepts ZIP of .dcm files
Reads and stacks slices into a 3D volume
Extracts metadata (age, sex, spacing, institution)

2. Preprocessing

Converts raw voxels to Hounsfield Units
Resamples to isotropic spacing
Clips intensities to the lung window and normalizes them
Crops to lung region
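
As a rough illustration of the Hounsfield conversion and windowing steps, the sketch below applies the standard DICOM linear rescale (raw value times RescaleSlope plus RescaleIntercept) and then clips and scales to [0, 1]. The window bounds of -1000 to 400 HU are an assumption for the example; the values the project uses are not stated here. Plain lists keep the example dependency-free; real volumes would normally be NumPy arrays.

```python
from typing import List, Sequence

# Assumed window bounds in Hounsfield Units (not taken from the project).
HU_MIN, HU_MAX = -1000.0, 400.0


def to_hounsfield(raw: Sequence[float], slope: float, intercept: float) -> List[float]:
    """Apply the DICOM linear rescale: HU = raw * RescaleSlope + RescaleIntercept."""
    return [value * slope + intercept for value in raw]


def window_normalize(
    hu: Sequence[float], lo: float = HU_MIN, hi: float = HU_MAX
) -> List[float]:
    """Clip each value to [lo, hi], then scale linearly to [0, 1]."""
    span = hi - lo
    return [(min(max(value, lo), hi) - lo) / span for value in hu]
```

For example, `window_normalize(to_hounsfield([0, 1024, 3000], 1.0, -1024.0))` maps air-like voxels to 0.0 and dense bone-like voxels to 1.0.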

3. Tumor Segmentation

Primary: segmentation model
Fallback: ROI / bounding-box flow
Produces binary tumor mask

4. Mesh Generation

Runs marching cubes on tumor mask
Smooths + decimates mesh
Exports .glb for frontend rendering
Computes tumor geometry metrics
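
The specific geometry metrics are not listed here. As a hedged example, two common ones can be derived from the binary mask alone: volume (voxel count times voxel size) and the diameter of a sphere with the same volume. Both function names are hypothetical.

```python
import math
from typing import Tuple


def tumor_volume_mm3(voxel_count: int, spacing_mm: Tuple[float, float, float]) -> float:
    """Volume of the mask: number of tumor voxels times the volume of one voxel."""
    sx, sy, sz = spacing_mm
    return voxel_count * sx * sy * sz


def equivalent_diameter_mm(volume_mm3: float) -> float:
    """Diameter of a sphere with the same volume: d = (6V / pi) ** (1/3)."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)
```

After isotropic resampling the three spacing values are equal, so the volume reduces to the voxel count times the cube of the spacing.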

5. Tumor Encoding

Extracts radiomics features
L2-normalizes embedding vector
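
L2 normalization divides the feature vector by its Euclidean length so that every embedding has unit norm. A minimal sketch, assuming the radiomics features have already been flattened into a list of floats (the function name is illustrative):

```python
import math
from typing import List, Sequence


def l2_normalize(features: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0.0:
        return [0.0 for _ in features]
    return [x / norm for x in features]
```

With unit-length vectors, cosine similarity reduces to a plain dot product, which is convenient for the retrieval stage.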

6. Similar Case Retrieval

Queries ChromaDB using cosine similarity
Returns top-K synthetic historical cases
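
To show what the retrieval step computes, here is a brute-force cosine top-K in plain Python. It is a stand-in for illustration only and does not use the ChromaDB API; the case IDs and the dict layout are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_cases(
    query: Sequence[float], cases: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k (case_id, similarity) pairs most similar to the query, best first."""
    scored = [(case_id, cosine_similarity(query, vec)) for case_id, vec in cases.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

A vector database does the same ranking with an index instead of a full scan, which matters once the casebase grows beyond a few thousand entries.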

7. Cohort Statistics

Computes weighted distributions across retrieved cases:
- histology
- mutations
- stage
- treatment pathways
- outcomes
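
One plausible reading of "weighted" is that each retrieved case contributes in proportion to its similarity score. The sketch below assumes that weighting and a hypothetical case layout (a dict with a `similarity` key plus categorical fields); the project's actual weighting scheme may differ.

```python
from collections import defaultdict
from typing import Dict, List


def weighted_distribution(cases: List[dict], field: str) -> Dict[str, float]:
    """Share of total similarity weight held by each value of `field`."""
    totals: Dict[str, float] = defaultdict(float)
    for case in cases:
        totals[case[field]] += case["similarity"]
    weight_sum = sum(totals.values())
    if weight_sum == 0.0:
        return {}
    return {label: weight / weight_sum for label, weight in totals.items()}
```

The same function can be called once per field (histology, stage, and so on) to build the statistics shown on the dashboard.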

8. Evidence Summary

LLM summarizes retrieved cohort evidence
Uses only retrieved case data

9. Treatment Plan

LLM generates structured treatment recommendation:
- first-line therapy
- alternatives
- biomarker suggestions
- follow-up plan

10. Trials + Literature

Browser agents search:
- ClinicalTrials.gov
- PubMed
Returns structured, display-ready results
Cached fallback supported

Frontend

Stack

Framework: React + Vite
Styling: Tailwind CSS
3D Rendering: Three.js / React Three Fiber
Charts: Recharts
State: Zustand
HTTP: Axios

Views

Upload

Upload CT ZIP
Optional MRN input
Starts new session

Processing

Polls backend stage status
Shows stage-by-stage progress

Dashboard

3D tumor viewer
Similar case cards
Cohort stats
Evidence summary
Treatment plan
Clinical trials
Supporting literature

API

Base path: /api/v1

Core Endpoints

POST   /sessions
GET    /sessions/{id}/status
GET    /sessions/{id}/dicom
GET    /sessions/{id}/fhir
GET    /sessions/{id}/preprocess
GET    /sessions/{id}/segmentation
GET    /sessions/{id}/mesh
GET    /sessions/{id}/embedding
GET    /sessions/{id}/similar
GET    /sessions/{id}/statistics
GET    /sessions/{id}/summary
GET    /sessions/{id}/treatment-plan
GET    /sessions/{id}/trials
GET    /sessions/{id}/literature
GET    /mesh/{session_id}.glb

Status Response

{
  "name": "segmentation",
  "status": "completed",
  "started_at": "2026-04-05T10:01:23Z",
  "finished_at": "2026-04-05T10:01:45Z",
  "error": null
}
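
A client might parse a stage object like the one above as follows. This is a sketch using only the standard library; the `StageStatus` class and helper names are hypothetical, and only the fields shown in the example are assumed.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp with a trailing 'Z'; pass None through."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class StageStatus:
    name: str
    status: str
    started_at: Optional[datetime]
    finished_at: Optional[datetime]
    error: Optional[str]

    @property
    def duration_seconds(self) -> Optional[float]:
        """Elapsed time for the stage, or None while it is still running."""
        if self.started_at is None or self.finished_at is None:
            return None
        return (self.finished_at - self.started_at).total_seconds()


def parse_stage_status(payload: str) -> StageStatus:
    """Build a StageStatus from the JSON text of a single stage object."""
    data = json.loads(payload)
    return StageStatus(
        name=data["name"],
        status=data["status"],
        started_at=_parse_ts(data.get("started_at")),
        finished_at=_parse_ts(data.get("finished_at")),
        error=data.get("error"),
    )
```

For the example payload, `duration_seconds` evaluates to 22.0.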

Synthetic Data

TumorTwin uses a seeded synthetic lung cancer casebase for retrieval.

Each case includes:

demographics
histology
mutation profile
TNM stage
treatment history
survival / response outcome
embedding vector

Synthetic cases are stored in:

data/synthetic_cases.json

Generate with:

python scripts/seed_cases.py

Running the Project

Backend

cd tumortwin-backend
python -m venv venv
source venv/bin/activate   # or venv\Scripts\activate on Windows
pip install -r requirements.txt
playwright install
python scripts/seed_cases.py
uvicorn app.main:app --reload --port 8000

Frontend

cd tumortwin-frontend
npm install
npm run dev

Mock Mode

The frontend can be developed against mocked API responses in src/mocks/ and later switched to the live backend without component changes.

Design Decisions

In-memory session store for simplicity and speed
ChromaDB in-process to avoid infra overhead
Radiomics features instead of GPU-heavy learned embeddings
GLB mesh output for fast browser rendering
Mock-first frontend integration for parallel development
Browser-based retrieval for real-time trial / literature enrichment

Contributions

Srinivas Sriram: Backend
Krish Kankure: Backend
Aaditya Pillai: Frontend
Varun Sinha: Frontend
