Inspiration

Reading CT and X-ray studies often means scrolling through hundreds of slices to find small, high-stakes pathologies. Many hospitals can’t use cloud APIs for PHI, and hosting giant ViTs locally is impractical. I wanted a tool that:

  • runs fully on CPU, on-prem,
  • gives interpretable, per-finding evidence (not just a score),
  • and can be tuned to a hospital’s own data.

To the best of my knowledge (Aug 10, 2025), no one had shown a CPU-only, on-device Convolutional Sparse Autoencoder (Conv-SAE) doing unsupervised, evidence-grounded localization in radiology with single-latent ablation masks. That gap, and clinicians’ need for trustworthy visual evidence, sparked RadiSpect.


What it does

RadiSpect helps clinicians see where to look and why:

1) Safe Spans (peer review of an existing report):
Click a finding in the report -> see the Conv-SAE’s per-latent mask aligned to that finding. High-activation latents not covered in the text are highlighted as potential misses.

2) Clinician Assist (draft a new report):
Shows top-activated latents with overlays. Accept the useful ones, ignore false positives, and use the masks as evidence breadcrumbs.

3) Report Cross-Check (QA):
If a report is written, RadiSpect flags strongly activated latents that the text didn’t mention (an on-device second set of eyes).

How we built it

Model. A compact Convolutional Sparse Autoencoder trained unsupervised on images; a sparsity penalty on the latents encourages each one to capture a distinct visual concept.
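A minimal PyTorch sketch of such a Conv-SAE, assuming an L1 activation penalty for sparsity; the layer widths and kernel sizes here are illustrative, not the project's actual configuration:

```python
import torch
import torch.nn as nn

class ConvSAE(nn.Module):
    """Compact convolutional sparse autoencoder (layer sizes are illustrative)."""
    def __init__(self, n_latents: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=4, stride=2, padding=1),  # halves H, W
            nn.ReLU(),
            nn.Conv2d(32, n_latents, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),  # non-negative latents make "ablate to zero" well-defined
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(n_latents, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # reconstructions in [0, 1], like normalized X-rays
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def sae_loss(x, x_hat, z, l1_weight: float = 1e-3):
    """Reconstruction error plus an L1 penalty pushing latents toward sparsity."""
    return nn.functional.mse_loss(x_hat, x) + l1_weight * z.abs().mean()
```

Keeping the latents spatial (a channel map rather than a flat vector) is what lets each latent's ablation produce a localized evidence mask later.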

Evidence maps via single-latent ablation. For latent \(z_i\):

$$ \hat{x} = g_\phi(z),\qquad \hat{x}^{(i=0)} = g_\phi\big([z_1,\dots,0,\dots,z_n]\big),\quad \Delta_i = \big| \hat{x} - \hat{x}^{(i=0)} \big|,\quad M_i = \mathrm{Thresh}(\Delta_i) $$

We overlay \(M_i\) on the original image to show what changes when only that concept is removed.
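The ablation step maps directly to code. A minimal sketch, assuming a model that exposes `encoder` and `decoder` submodules; using a percentile cutoff for \(\mathrm{Thresh}\) is an illustrative choice, not necessarily the project's rule:

```python
import torch

@torch.no_grad()
def ablation_mask(model, x, latent_idx: int, percentile: float = 95.0):
    """Evidence map M_i: zero out latent channel i and diff the reconstructions."""
    z = model.encoder(x)
    x_hat = model.decoder(z)               # reconstruction with all latents

    z_ablated = z.clone()
    z_ablated[:, latent_idx] = 0.0         # remove concept i only
    x_hat_ablated = model.decoder(z_ablated)

    delta = (x_hat - x_hat_ablated).abs()  # Δ_i
    thresh = torch.quantile(delta.flatten(), percentile / 100.0)
    return (delta >= thresh).float()       # M_i = Thresh(Δ_i)
```

The mask has the same spatial shape as the input, so it can be alpha-blended over the original image as an overlay.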

Label mining (v0). For each latent \(i\), collect reports from images with high \(z_i\), extract frequent phrases to propose a short label (optionally refined later).
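A rough version of this mining step, assuming each latent already has the list of report strings from its top-activating images; the bigram-count heuristic is purely illustrative:

```python
import re
from collections import Counter

def propose_label(reports: list[str], top_k: int = 3) -> str:
    """Propose a short label from frequent bigrams in high-activation reports."""
    counts = Counter()
    for report in reports:
        words = re.findall(r"[a-z]+", report.lower())
        counts.update(zip(words, words[1:]))  # count adjacent word pairs
    phrases = [" ".join(bigram) for bigram, _ in counts.most_common(top_k)]
    return "; ".join(phrases)
```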

Quality gates. We keep masks that meet energy/consistency checks (e.g., mask energy \(E_i=\sum M_i\); basic monotonicity under increased ablation strength).
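One way these gates might look in code. The energy threshold and the two ablation strengths are hypothetical placeholders, and this sketch reads "monotonicity" as: the reconstruction difference should grow as the latent is ablated more strongly:

```python
import torch

@torch.no_grad()
def passes_gates(model, x, latent_idx: int, min_energy: float = 10.0) -> bool:
    """Keep a latent only if its ablation effect is large enough and grows
    monotonically with ablation strength (illustrative gate choices)."""
    z = model.encoder(x)
    x_hat = model.decoder(z)

    def delta_energy(strength: float) -> float:
        z_mod = z.clone()
        z_mod[:, latent_idx] *= (1.0 - strength)  # strength=1 is full ablation
        return (x_hat - model.decoder(z_mod)).abs().sum().item()

    energies = [delta_energy(s) for s in (0.5, 1.0)]
    monotone = energies[0] <= energies[1]
    return energies[1] >= min_energy and monotone
```

Latents that fail either gate simply produce no overlay, which keeps to the "no mask -> no claim" principle.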

Data & stack.

  • Dataset: IU X-Ray (paired images + reports).
  • Runtime: Python + PyTorch (CPU), Streamlit viewer.

  • Why CPU-only? To honor privacy constraints and prove edge feasibility; many radiologists work without GPUs, and the tool should be built for that reality.

Challenges we ran into

  • CT scarcity & time: Open CT datasets with aligned reports are scarce; training pipelines are heavier and unlikely to succeed within the timeframe of this hackathon. I built the X-ray proof of concept first to validate compute and workflow.
  • Unsupervised noise: Without labels, some masks bleed into irrelevant areas. I use energy thresholds, simple consistency checks, and human judgment in the loop.
  • Threshold tuning: I iterated on per-latent scaling and percentile rules to keep overlays stable.
  • Label fidelity: Phrase mining can be noisy. I kept labels short, factual, and tied to top-activator galleries rather than over-promising.

  • Provable privacy: Judges want proof, so I run in offline mode, show “Local Mode: ON (0 API calls)”, and keep models small.

Accomplishments that we're proud of

  • End-to-end on device: Training + inference on CPU, no cloud, no GPUs.
  • 1:1 provenance: Every accepted finding is backed by a single-latent ablation mask.
  • Clinically aligned UX: Three flows that map to real tasks: peer review, assistive drafting, and QA.
  • Lightweight & fast: Small Conv-SAE with interactive overlays on commodity hardware.
  • Clear scope of novelty: First (to my knowledge) to combine an on-device Conv-SAE with unsupervised evidence maps for radiology localization.

What we learned

  • Interpretability drives adoption. Visual evidence beats a single probability score when clinicians need to trust a tool.
  • Unsupervised doesn't mean unreliable. With sparsity + careful ablations, evidence maps can be useful even without segmentation labels.
  • Human-in-the-loop matters. False positives are manageable when the UI lets clinicians quickly accept/ignore overlays.
  • Compute realism helps. Proving CPU-only viability makes deployment conversations (privacy, security, cost) much easier.
  • Metrics need to match the task. Energy/coverage and “no mask -> no claim” are better aligned to evidence-grounded assistance than generic accuracy.

What’s next for RadiSpect

  • CT extension (retrain, same method):
    Train on HU-windowed slices or short 2.5D stacks; aggregate per-slice masks (e.g., max-energy) to surface top frames. No architecture change required.
  • Packaging for hospitals: One-click local install; logs proving offline mode; admin knob for on-prem fine-tuning.
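The per-slice aggregation described for the CT extension can be as simple as ranking slices by evidence-mask energy, assuming per-slice masks are already computed (the function name and shapes here are illustrative):

```python
import numpy as np

def top_slices(masks: np.ndarray, k: int = 3) -> list[int]:
    """Rank CT slices by evidence-mask energy; masks has shape (n_slices, H, W)."""
    energy = masks.reshape(masks.shape[0], -1).sum(axis=1)  # E per slice
    return np.argsort(energy)[::-1][:k].tolist()            # highest energy first
```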

Built With

  • convolutional-sparse-autoencoder
  • python
  • streamlit
  • torch