Atharva-Jayappa/atlas-lahacks

ATLAS

The on-prem agentic operating layer for the emergency department. LA Hacks 2026 · UCLA · Track: Catalyst for Care

Five specialist AI agents on a $4K box in the hospital, plus a patient-side companion in your pocket. Hospital records never leave the building. Patient-side records never leave the patient's phone. Every claim has a receipt. Every receipt is doctor-correctable.


1. Why we built this

US emergency departments are the most chaotic, highest-stakes coordination environments in healthcare — and the numbers aren't opinion:

| Metric | Reality |
| --- | --- |
| Doctor time on documentation | ~2 hours of every 4-hour shift |
| Median ED length of stay | 4–6 hours; 12+ at high-volume hospitals |
| ED boarding (waiting for an inpatient bed) | 36% of admissions board >4 hours |
| Documentation-related burnout in EM physicians | Highest of any specialty |
| Annual US healthcare spend on coordination failures | ~$200B |
| Annual US prior-auth waste | ~$25B |

These aren't separate problems. They're the same problem from different angles: the ED has rich data and brilliant clinicians, but the data doesn't reach the clinicians at the moment they need it.

Existing "ambient AI" scribes (Abridge, Suki, Nuance DAX) all send encounter audio to vendor cloud infrastructure, which makes them unusable for hospitals that legally or culturally cannot send PHI to the cloud. Epic and Cerner ship AI on a 7-year roadmap. Until 2025, sustained 70B-class inference cost $30K+ in GPUs. The ASUS Ascent GX10 at $4K is genuinely new hardware economics, and it opened the window for ATLAS.


2. The impact

ATLAS targets four ED pain points simultaneously, on one substrate:

  1. Documentation burden. Scribe writes the H&P note from live encounter audio, structured and ready to sign — saving ~2 hours of every 4-hour shift. Burnout reduction is real and measurable; staff retention ROI is the line CIOs care about.
  2. Inter-department coordination. Quartermaster watches imaging, labs, and pharmacy queues, predicts bottlenecks from historical wait times, and pings the right department before a patient is stuck waiting.
  3. Discharge prior auth. Advocate builds the PA packet during the visit so the patient leaves the ED with it already submitted. Standard wait of 18 days collapses to ~6 hours. $25B/year of industry waste, addressed visit-by-visit.
  4. Handoff quality. Conductor turns the full encounter context into a structured SBAR handoff at shift change — eliminating the verbal-handoff patient-safety hazard.

And underneath all four: PHI never leaves the building. Cloud AI products send patient data to a vendor's cloud servers. ATLAS runs entirely on a $4K box on the hospital floor, with zero outbound calls during inference. The same engine extends to ICU coordination, OR scheduling, and ambulatory clinics; the ED is the slice that makes the demo undeniable.

On the patient side, the Companion app mirrors this guarantee: encounter audio, paper records, and insurance documents never leave the patient's phone. The hospital and the patient hold parallel ledgers of the same encounter, and neither ledger ever crosses the firewall to the cloud.


3. Tech stack

3.1 ASUS Ascent GX10 — on-prem local inference

The GX10 is the heart of ATLAS. Its 128 GB of unified memory holds Qwen3-32B (heavy reasoning), Qwen3-8B (high-frequency coordination), Whisper-large-v3 (STT), and Qwen3-Embedding-8B simultaneously, and its petaflop-class low-precision throughput (the vendor quotes up to 1 petaFLOP at FP4) supports 5 concurrent agent streams.
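
As a rough check on the memory claim, the sketch below adds up weight-only footprints. The parameter counts are nominal round figures, FP16 for Whisper and the embedding model is an assumption (section 3.4 only states FP16 for the two Qwen3 chat models), and KV cache and activations are ignored, so treat the result as a back-of-envelope estimate rather than a measurement.

```python
# Back-of-envelope weight memory for the four co-resident models.
# (nominal params in billions, bytes per parameter); all figures are assumptions.
MODELS = {
    "Qwen3-32B": (32.0, 2),           # FP16
    "Qwen3-8B": (8.0, 2),             # FP16
    "Whisper-large-v3": (1.55, 2),    # assumed FP16
    "Qwen3-Embedding-8B": (8.0, 2),   # assumed FP16
}
UNIFIED_MEMORY_GB = 128.0


def weight_gb(params_billion: float, bytes_per_param: int) -> float:
    """Billions of params times bytes per param gives gigabytes of weights."""
    return params_billion * bytes_per_param


total_gb = sum(weight_gb(p, b) for p, b in MODELS.values())
headroom_gb = UNIFIED_MEMORY_GB - total_gb

print(f"weights ~ {total_gb:.1f} GB, headroom ~ {headroom_gb:.1f} GB")
```

Under these assumptions the weights alone take roughly 99 GB, which leaves under 30 GB for KV cache, activations, and the OS. That is tight for five concurrent streams and is one reason to consider lower-precision weights.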

Why on-prem matters — the HIPAA story:

  • HIPAA's Privacy Rule and Security Rule treat encounter audio, vitals, chart text, lab values, and imaging as Protected Health Information (PHI). Every cloud transmission requires a Business Associate Agreement and adds an audit surface and a breach-disclosure obligation.
  • ATLAS runs entirely on the GX10 on the hospital LAN. Zero outbound calls during inference. No BAAs to negotiate, no audit surface to defend, no cloud breach pathway.
  • State-level legislation banning cloud transmission of certain PHI categories is proliferating in 2026; on-prem is the only future-proof posture for many systems.
  • Network-resilient by design: ED internet is famously flaky, and an ambulance bay can't depend on a working uplink. ATLAS doesn't.

Inference runtime: vLLM primary (excellent Qwen3 + Blackwell ARM CUDA support), Ollama backup for bulletproof demo recovery. The orchestrator abstracts model backends behind a single ATLAS_LLM_BACKEND flag — vllm, ollama, or mock — with zero code or prompt changes between them.
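
A minimal sketch of how a single flag could select the backend. Only the `ATLAS_LLM_BACKEND` name and its three values come from this README; the function names, the registry, and the `mock` default are hypothetical, and the `vllm` and `ollama` entries are stubs rather than real client calls.

```python
import os
from typing import Callable, Dict, Mapping, Optional

LLMCall = Callable[[str], str]


def _mock_backend(prompt: str) -> str:
    """Deterministic stand-in for CI: echoes the prompt."""
    return f"[mock] {prompt}"


def _not_wired(name: str) -> LLMCall:
    """Placeholder for a real client; this sketch does not call vLLM or Ollama."""
    def call(prompt: str) -> str:
        raise RuntimeError(f"{name} backend is not wired in this sketch")
    return call


BACKENDS: Dict[str, LLMCall] = {
    "vllm": _not_wired("vllm"),
    "ollama": _not_wired("ollama"),
    "mock": _mock_backend,
}


def get_backend(env: Optional[Mapping[str, str]] = None) -> LLMCall:
    """Resolve the backend from ATLAS_LLM_BACKEND (default 'mock' is an assumption)."""
    env = os.environ if env is None else env
    name = env.get("ATLAS_LLM_BACKEND", "mock").lower()
    if name not in BACKENDS:
        raise ValueError(f"unknown ATLAS_LLM_BACKEND: {name!r}")
    return BACKENDS[name]


llm = get_backend({"ATLAS_LLM_BACKEND": "mock"})
print(llm("Bed 4 status?"))
```

Because agents only ever see a callable that maps a prompt to text, swapping the runtime does not touch prompts or agent code.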

3.2 Fetch.ai — OmegaClaw + Agentverse + ASI:One

We use Fetch.ai's stack at two layers, satisfying the full challenge spec:

Layer 1 — ATLAS via Agentverse uAgent + OmegaClaw Telegram bot (PHI-side):

  • One uAgent registered: atlas-companion
  • OmegaClaw is the attending physician's chat front-end via Telegram
  • Attending texts: "Bed 4 status?" → ASI:One reasons over the intent → routes to the atlas-companion uAgent → uAgent calls our internal orchestrator on the GX10 → returns a synthesized, PHI-redacted status line
  • PHI never touches ASI:One. ASI:One only sees the routing intent and the natural-language response after our local agents have prepared it. Patient identifiers, vitals, and chart text stay on-prem.

Layer 2 — Pulse via ASI:One direct (cloud literature side, no PHI):

  • Four specialist Agentverse uAgents: pubmed-fetcher, fda-alerts, guidelines-watch, differential-educator
  • ASI:One reasons over the doctor's literature query → routes to the right specialist → returns curated answer with citations
  • Zero patient data ever flows here — strict architectural firewall (separate routes, separate sessions, separate auth scopes, no PHI fields in the request schema)

Registered Agentverse profiles:

| uAgent | Profile URL |
| --- | --- |
| pubmed-fetcher | https://agentverse.ai/agents/agent1q0mwpar22cw5fs2p0awxud0a8jsz8p8jxqjrrqdnumef9mduxm5ks5x2naf |
| fda-alerts | https://agentverse.ai/agents/agent1qfa97j3uyct9894cv4q9f2525hj9cjvzzn474d40t4sn5jyrh9f5xu45uej |
| guidelines-watch | https://agentverse.ai/agents/agent1qfwqlzxrquexkc4fafcvyu9c3z7a6srat0c8vv9g75fcwwepspvvjsz0rjf |
| differential-educator | https://agentverse.ai/agents/agent1q08z8uy4vhpk98plsdy5jjpkn7ddud42l5usl4nqcrplgjfnrwc6yn3js8j |
| atlas-companion | https://agentverse.ai/agents/agent1qwpr622maahvh7jp4u4cjhtp33azrjlhtp2tpdfpgxph0625cnvysx9ngcl |

3.3 ZETIC Melange — the on-device Patient Companion

The Companion app is a Kotlin Android app powered by the ZETIC Melange SDK, which dispatches inference automatically to NPU hardware (Qualcomm HTP/DSP, Google Tensor, MediaTek APU) without per-vendor code.

| Role | Model | Notes |
| --- | --- | --- |
| Summarization + document extraction + insurance chat | Gemma 3 4B Instruct (Melange-supported) | All three text-generation features |
| LLM fallback | LiquidAI LFM2.5 1.2B Instruct (Melange-supported) | Drop-in if Gemma 3 4B can't sustain TTFT |
| STT (encounter audio) | Whisper-tiny (ONNX) | Multilingual ambient encounter audio |
| Embeddings (RAG) | all-MiniLM-L6-v2 (ONNX) | 384-dim, ~25 MB |
| OCR | Google ML Kit | Native Android, on-device |
| Triage acuity classifier | Distilled BERT-class (ONNX) | Sub-second on-device |

3.4 Backend stack

| Layer | Tech |
| --- | --- |
| Orchestrator + agents | FastAPI + Python, premise/evidence-binding contract enforced at the orchestrator |
| Database | Postgres + TimescaleDB + pgvector (encounters, orders, results, PA policies, agent_traces audit log, clinical taxonomy) |
| Speech-to-text | Whisper-large-v3 on the GX10 |
| Heavy reasoning agents | Qwen3-32B, FP16 (Scribe note structuring, Advocate, Conductor, Reality Check) |
| High-frequency agent | Qwen3-8B, FP16 (Quartermaster) |
| Web dashboard | React + Vite + shadcn/ui ("Show Our Work" UI) |
| Mobile companion | Kotlin + Jetpack Compose + ZETIC Melange |
| Cloud literature (Pulse) | Fetch.ai ASI:One + Agentverse uAgents |

All clinical grounding (HPO, ICD-10-CM, CPT, RxNorm, LOINC, curated PA policies) is loaded locally — no external API calls during inference.


4. The five agents

ATLAS isn't a single chatbot wearing a healthcare wig. It's five specialist agents that genuinely cooperate, each scoped to a real ED workflow.

4.1 Scribe — real-time clinical documentation

Whisper-large-v3 streams the encounter audio; Qwen3-32B (non-thinking mode for speed) structures it into a SOAP note, extracts orders, drafts Rx, and emits structured facts. Every fact must reference a transcript timestamp — hallucinations are caught by the contract, not by hope.
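
One way the timestamp requirement could be checked is sketched below. The `Fact` shape and the `split_grounded` helper are hypothetical; the only rule taken from the text is that a fact without a transcript timestamp is not accepted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    text: str
    transcript_ts: Optional[float]  # seconds from start of audio; None means ungrounded


def split_grounded(facts: List[Fact], audio_len_s: float) -> Tuple[List[Fact], List[Fact]]:
    """Accept facts whose timestamp falls inside the recording; reject the rest."""
    accepted: List[Fact] = []
    rejected: List[Fact] = []
    for fact in facts:
        grounded = (
            fact.transcript_ts is not None
            and 0.0 <= fact.transcript_ts <= audio_len_s
        )
        (accepted if grounded else rejected).append(fact)
    return accepted, rejected


facts = [
    Fact("Headache began three days ago", 12.4),
    Fact("No known drug allergies", None),        # no timestamp: rejected
    Fact("Pain is behind the left eye", 950.0),   # beyond the audio length: rejected
]
accepted, rejected = split_grounded(facts, audio_len_s=600.0)
print(len(accepted), "accepted,", len(rejected), "rejected")
```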

4.2 Quartermaster — inter-department coordination

Qwen3-8B watches the order/results event bus, cross-references department capacity and historical wait times, and surfaces predicted bottlenecks before the patient is stuck. "Last 3 chest-pain CT-PEs at this hour waited >60min — proactively escalating to radiology."
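
The quoted escalation can be illustrated with a plain rule. This is a simplified stand-in for what Quartermaster does with Qwen3-8B, and the threshold, lookback, and function name are assumptions chosen to match the example message.

```python
from typing import Sequence


def should_escalate(
    recent_waits_min: Sequence[float],
    threshold_min: float = 60.0,
    lookback: int = 3,
) -> bool:
    """True when each of the last `lookback` comparable orders waited past the threshold."""
    window = list(recent_waits_min)[-lookback:]
    return len(window) == lookback and all(w > threshold_min for w in window)


# Wait times (minutes) for earlier chest-pain CT-PE orders at this hour, oldest first.
history = [45.0, 62.0, 75.0, 90.0]
if should_escalate(history):
    print("Proactively escalating to radiology")
```

Requiring a full window avoids escalating on one or two slow orders when there is not yet enough history.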

4.3 Advocate — discharge prior auth automation (centerpiece)

Qwen3-32B in thinking mode matches chart values to insurer policy criteria (BCBS / Aetna / Cigna / UHC), assembles the PA packet, computes an approval probability, and exposes the full premise chain. The doctor can click any premise, mark it "incorrect," and Advocate regenerates live in under 3 seconds. This is meant to stop the "patient leaves the ED, waits 18 days for the pharmacy" scenario.
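
The premise chain and the correction loop can be sketched without a model. Everything here is illustrative: the criteria, thresholds, and chart keys are invented and not taken from any insurer policy, and the score is simply the fraction of criteria met, a placeholder for the approval probability Advocate computes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Criterion:
    id: str
    description: str
    check: Callable[[Dict[str, float]], bool]


def build_premises(
    chart: Dict[str, float],
    criteria: List[Criterion],
    marked_incorrect: FrozenSet[str] = frozenset(),
) -> List[dict]:
    """One premise per criterion; a doctor-marked premise counts as not met."""
    premises = []
    for c in criteria:
        overridden = c.id in marked_incorrect
        premises.append({
            "id": c.id,
            "claim": c.description,
            "met": c.check(chart) and not overridden,
            "doctor_overridden": overridden,
        })
    return premises


def approval_score(premises: List[dict]) -> float:
    """Placeholder score: share of premises currently met."""
    return sum(p["met"] for p in premises) / len(premises) if premises else 0.0


# Hypothetical criteria and chart values, for illustration only.
criteria = [
    Criterion("c1", "Attack frequency meets policy minimum", lambda ch: ch["attacks_per_week"] >= 5),
    Criterion("c2", "Prior therapy trial documented", lambda ch: ch["failed_prior_therapies"] >= 1),
]
chart = {"attacks_per_week": 8, "failed_prior_therapies": 2}

before = approval_score(build_premises(chart, criteria))
after = approval_score(build_premises(chart, criteria, frozenset({"c2"})))
print(before, after)
```

Marking a premise incorrect and calling `build_premises` again mirrors the "click, correct, regenerate" flow: the chain is rebuilt from the chart plus the doctor's overrides.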

4.4 Conductor — disposition + handoff generation

Qwen3-32B turns the full encounter context into a structured SBAR handoff at shift change or admission, plus discharge instructions and follow-up orders. The verbal-handoff safety hazard goes away.

4.5 Reality Check — adversarial auditor

Qwen3-32B in thinking mode reviews every output of the other four agents, flags hallucinations, surfaces missing evidence, catches contradictions, and overrides confidence scores. If Scribe's diagnosis disagrees with Advocate's policy match, Reality Check pauses the chain and asks the human. This is the agent compliance teams care about.
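
The pause-and-ask behaviour for a diagnosis mismatch could look like the sketch below. The function name, return shape, and string comparison are assumptions; the real agent reasons with Qwen3-32B rather than comparing labels.

```python
def reality_check(scribe_dx: str, advocate_dx: str) -> dict:
    """Continue when both agents agree on the diagnosis, otherwise pause for a human."""
    if scribe_dx.strip().lower() == advocate_dx.strip().lower():
        return {"action": "continue", "flags": []}
    return {
        "action": "pause_for_human",
        "flags": [f"diagnosis mismatch: Scribe={scribe_dx!r} vs Advocate={advocate_dx!r}"],
    }


print(reality_check("Cluster headache", "cluster headache"))
print(reality_check("Cluster headache", "Migraine"))
```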

4.6 "Show Our Work" — the killer feature

Every claim from every agent is a JSON object with mandatory evidence_refs, policy_refs, premise_chain, and confidence. The orchestrator rejects any output missing them. The dashboard surfaces all of it: 📋 Evidence · 🧠 Premise chain · ❓ What we don't know · 🚩 Reality Check flags · ⏮️ Replay · ✏️ Correct premise. Compliance gets an audit trail. Joint Commission gets reproducibility. Doctors get the ability to challenge any output and watch it correct itself live.
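
A minimal sketch of the rejection rule follows. The four field names come from the paragraph above; the exception class, the function name, and the 0 to 1 range for `confidence` are assumptions.

```python
REQUIRED_FIELDS = ("evidence_refs", "policy_refs", "premise_chain", "confidence")


class ContractViolation(ValueError):
    """Raised when an agent output is missing a mandatory field."""


def enforce_contract(claim: dict) -> dict:
    """Return the claim unchanged if it satisfies the contract, else raise."""
    missing = [name for name in REQUIRED_FIELDS if name not in claim]
    if missing:
        raise ContractViolation(f"missing fields: {missing}")
    conf = claim["confidence"]
    is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
    if not is_number or not 0.0 <= conf <= 1.0:  # range is an assumption
        raise ContractViolation("confidence must be a number in [0, 1]")
    return claim


good = {
    "text": "Patient meets step-therapy criterion",
    "evidence_refs": ["transcript@12.4s"],
    "policy_refs": ["policy-doc-1"],
    "premise_chain": ["c1", "c2"],
    "confidence": 0.82,
}
enforce_contract(good)

try:
    enforce_contract({"text": "unsupported claim", "confidence": 0.9})
except ContractViolation as err:
    print("rejected:", err)
```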


5. Pulse — the cloud-side literature companion

Pulse is a separate product surface that lives next to ATLAS but never touches patient data. It's the doctor's literature companion, powered by Fetch.ai's ASI:One.

5.1 Pulse chatbot interface

A dedicated tab in the doctor's web dashboard, plus a standalone Telegram bot. The chat UI looks like a clinical-grade ChatGPT — but every answer arrives with explicit citations from PubMed / FDA / published guidelines, and the network tab shows traffic on /pulse/*, never /api/*. The channel separation is the firewall's UX surface.

5.2 Pulse functionalities

  • PubMed search with structured summary: "Summarize the latest CGRP biologic guidelines for cluster headache (2025)" routes via ASI:One to the pubmed-fetcher Agentverse uAgent
  • FDA drug recall + safety alert feed: the fda-alerts agent surfaces active recalls and black-box warnings relevant to a query
  • "What's new this week" digests: guidelines-watch curates ACC, AHA, NEJM, Lancet updates by specialty
  • Hypothetical differential education: differential-educator answers "what's on the differential for X presentation in Y demographic" in general terms, never tied to a real patient
  • Drug interaction explainer: built on public pharmacology, not patient meds

5.3 What Pulse does NOT do (the firewall)

  • Never receives patient identifiers, vitals, lab values, or chart text
  • Never queries about specific patients ("the patient in bed 4")
  • Never shares session state, cookies, or auth scopes with ATLAS
  • The Pulse request schema has no PHI fields, so a caller has no structured slot through which to pass patient data
  • ATLAS runs on localhost; Pulse is the only outbound surface

If a doctor asks a "patient-flavored" question, the UI nudges them to re-phrase it as a hypothetical: "Pulse can't see your patient. Want to ask about general management of [condition]?" Two-tier privacy (PHI on-prem, public literature in the cloud) is exactly how hospital compliance teams think about clinical AI.
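
The "no PHI fields" rule can be sketched as a closed request schema that rejects any key it does not know. The field names `query` and `specialty` are hypothetical. A closed schema only removes structured slots; it cannot inspect free text, which is why the re-phrase nudge above still matters.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class PulseQuery:
    query: str                       # general literature question
    specialty: Optional[str] = None  # optional topical filter


ALLOWED_KEYS = {f.name for f in fields(PulseQuery)}


def parse_pulse_request(payload: dict) -> PulseQuery:
    """Build a PulseQuery, rejecting any key outside the closed schema."""
    extra = set(payload) - ALLOWED_KEYS
    if extra:
        raise ValueError(f"unexpected fields rejected: {sorted(extra)}")
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    return PulseQuery(**payload)


print(parse_pulse_request({"query": "CGRP biologics for cluster headache"}))

try:
    parse_pulse_request({"query": "dosing guidance", "mrn": "12345"})
except ValueError as err:
    print("rejected:", err)
```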


6. The Patient Companion app

The Kotlin Android Companion is the patient-side dual of ATLAS Advocate: where Advocate handles prior auth for the doctor at discharge, the Companion app helps the patient navigate everything that lands on their side of that same insurance flow. Two main features anchor the demo.

6.1 Encounter summarization (transcript → plain English)

Patient taps record at the start of the visit (with explicit on-device consent prompt). The full pipeline runs locally:

  1. Whisper-tiny streams encounter audio to a transcript on the phone
  2. The on-device LLM (Gemma 3 4B Instruct via ZETIC Melange) produces a structured patient_summary JSON: chief complaint, plain-English summary, diagnoses, medications with dose + instructions, follow-ups
  3. The summary view renders for the patient — "You have cluster headache. Start Emgality 240mg starter pack. Follow up with neurology in 7 days."
  4. Audio never leaves the phone. Even ATLAS doesn't see the patient-side recording. The hospital ledger and the patient ledger are intentionally parallel.
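
For illustration, the `patient_summary` object from step 2 might be validated like this before rendering. The key names are hypothetical (the README lists the contents but not the schema), the chief complaint value is made up, and the remaining values reuse the example from step 3.

```python
import json

REQUIRED_KEYS = {"chief_complaint", "plain_english_summary", "diagnoses", "medications", "follow_ups"}
REQUIRED_MED_KEYS = {"name", "dose", "instructions"}


def parse_patient_summary(raw: str) -> dict:
    """Parse the LLM's JSON output and check that the expected sections are present."""
    summary = json.loads(raw)
    missing = REQUIRED_KEYS - summary.keys()
    if missing:
        raise ValueError(f"summary missing keys: {sorted(missing)}")
    for med in summary["medications"]:
        if not REQUIRED_MED_KEYS <= med.keys():
            raise ValueError(f"medication entry incomplete: {med}")
    return summary


raw_output = json.dumps({
    "chief_complaint": "Severe one-sided headache",  # illustrative value
    "plain_english_summary": "You have cluster headache.",
    "diagnoses": ["cluster headache"],
    "medications": [
        {"name": "Emgality", "dose": "240mg", "instructions": "Start the starter pack"}
    ],
    "follow_ups": ["Neurology in 7 days"],
})
summary = parse_patient_summary(raw_output)
print(summary["plain_english_summary"])
```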

6.2 Insurance assistant — on-device RAG (the Advocate mirror)

Patient chats in plain English: "Is Emgality covered? What's my copay? What does this denial letter mean?" The pipeline:

  1. The patient's Summary of Benefits and Coverage (SBC), Explanations of Benefits (EOB), denial letters, and current encounter summary are chunked and embedded with all-MiniLM-L6-v2 (384-dim ONNX)
  2. Query embedding → brute-force cosine retrieval over a Kotlin list of ~hundreds of chunks (microseconds; no ChromaDB — see §10.7 of the System Overview for why ChromaDB doesn't fit on Android)
  3. Top-k chunks → on-device LLM produces an answer with mandatory chunk citations inline
  4. Citation-or-refuse contract: if the retrieved chunks don't actually support the answer, the assistant refuses rather than fabricates — "I can't find that in your documents — call your insurer."
  5. Every reply ends with a fixed disclaimer: "Educational only; verify with your insurer."
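
The retrieval and citation-or-refuse steps above can be sketched as follows. It is written in Python for consistency with the backend, whereas the app itself is Kotlin. The toy 3-dimensional vectors stand in for 384-dim MiniLM embeddings, the refusal wording is lightly paraphrased, and the similarity threshold is an assumed proxy for "the chunks support the answer".

```python
import math
from typing import List, Sequence, Tuple

REFUSAL = "I can't find that in your documents. Call your insurer."
DISCLAIMER = "Educational only; verify with your insurer."


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec, chunks: List[Tuple[str, Sequence[float]]], k: int = 3):
    """Brute-force scan: score every chunk, return the k best as (score, chunk_id)."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunks]
    scored.sort(reverse=True)
    return scored[:k]


def answer_or_refuse(query_vec, chunks, draft_answer: str, min_score: float = 0.5) -> str:
    """Cite supporting chunks, or refuse when none clears the (assumed) threshold."""
    hits = [(s, cid) for s, cid in top_k(query_vec, chunks) if s >= min_score]
    if not hits:
        return f"{REFUSAL} {DISCLAIMER}"
    cites = ", ".join(f"[{cid}]" for _, cid in hits)
    return f"{draft_answer} {cites} {DISCLAIMER}"


chunks = [("sbc-1", [1.0, 0.0, 0.0]), ("eob-2", [0.0, 1.0, 0.0])]
print(answer_or_refuse([1.0, 0.0, 0.0], chunks, "Covered."))
print(answer_or_refuse([0.0, 0.0, 1.0], chunks, "Covered."))
```

With only a few hundred chunks, a linear scan like `top_k` stays cheap, which is the argument for skipping a vector database on the phone.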

The Companion app also includes a document vault (camera + ML Kit OCR for paper reports, structured field extraction by the on-device LLM) and a light intake screen (distilled BERT acuity classifier; pings the GX10 with a minimal intake bundle so ATLAS knows the patient is en route).


7. Repo layout

backend/        FastAPI orchestrator + 5 agents (Scribe / Quartermaster / Advocate / Conductor / Reality Check)
backend/db      Postgres + TimescaleDB + pgvector schema
backend/pulse   Fetch.ai ASI:One literature service (PHI-firewalled)
backend/scripts seed, prewarm, smoke, record_replay, eval_rc
web/            React + shadcn ED dashboard ("Show Our Work" UI)
mobile/         Kotlin Patient Companion app (ZETIC Melange on-device)

8. Quickstart

# 0. db
docker compose up -d db

# 1. backend
cd backend
pip install -r requirements.txt
cp .env.example .env   # then fill ASI_ONE_API_KEY, AGENTVERSE_API_KEY, TELEGRAM_BOT_TOKEN
python scripts/seed.py

# 2. orchestrator on :8000  (Vite dev proxy maps /api → :8000)
ATLAS_LLM_BACKEND=ollama uvicorn orchestrator:app --reload --port 8000

# 3. Pulse on :8002 (separate process — that IS the firewall surface)
#    Pulse uses ASI:One; ATLAS_LLM_BACKEND is ignored here.
uvicorn pulse.server:app --reload --port 8002

# 4. web on :5173
cd ../web && npm i && npm run dev

ATLAS_LLM_BACKEND selects the runtime: vllm (production, GX10), ollama (laptop dev / demo backup), or mock (CI). The orchestrator contract is identical across backends — agent prompts and structured outputs are unchanged.


9. What ATLAS is, and is not

✅ Decision support for clinicians · workflow automation · documentation assistance · coordination orchestration · administrative form generation

❌ Not a medical device · not autonomous (no auto-execution without clinician sign-off) · not a replacement for clinical judgment · not a PHI cloud relay

ATLAS is designed to fit the non-device Clinical Decision Support (CDS) criteria of the 21st Century Cures Act: used by healthcare professionals, evidence-based, traceable, and structurally reviewable via "Show Our Work." We're the workflow layer, not the medicine.
