RetinaScan: AI-Powered Retinal Disease Risk Detection
Inspiration
Early-stage eye conditions such as Diabetic Retinopathy (DR) typically present no symptoms. They progress silently, often going entirely undetected until the damage is irreversible. By the time someone notices blurred vision, floaters, or permanent vision loss, it is usually too late to restore what has been lost.
But this silent progression is not limited to the eyes. Chronic, life-altering diseases such as hypertension and even Alzheimer's disease can develop quietly in the body for years. The challenge is that our current gold-standard diagnostics — such as \$2,000+ MRI scans or invasive laboratory panels — are often too expensive, too slow, and too inaccessible for routine, large-scale screening.
This raises a critical question:
$$\text{Can we detect systemic disease earlier using a simple, non-invasive retinal image?}$$
The retina is the only place in the human body where microvascular structures can be observed non-invasively. Research has shown that the optic disc, retinal vessels, and foveal region encode subtle biomarkers that correlate with systemic disease states. If these vascular changes can be quantified and modeled, then early disease risk may be estimated before symptoms appear — and before the damage becomes irreversible.
RetinaScan was built around this idea: transforming a single retinal fundus image into actionable, AI-driven disease risk insights — making early detection scalable, affordable, and accessible to anyone with a smartphone or low-cost fundus camera.
What We Built
RetinaScan is a full-stack web application that accepts a retinal fundus image and returns a structured clinical risk report powered by GPT-4o Vision. The pipeline works as follows:
- A user uploads a retinal fundus photograph through the web interface
- The image is sent to OpenAI's GPT-4o Vision API with a carefully engineered clinical prompt
- The model analyzes the image for biomarkers including vessel tortuosity, optic disc pallor, microaneurysms, hemorrhages, hard exudates, and arteriovenous nicking
- A structured risk report is returned covering Diabetic Retinopathy, Hypertensive Retinopathy, Glaucoma, and Age-Related Macular Degeneration (AMD)
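The pipeline above can be sketched as a small helper that assembles the GPT-4o Vision request. The helper name, prompt text, and parameters here are illustrative assumptions rather than the production code:

```typescript
// Hypothetical helper sketching step 2 of the pipeline: wrapping a
// base64-encoded fundus photo and the clinical prompt into the
// chat-completions payload shape that GPT-4o Vision accepts.
type VisionMessage = {
  role: "user";
  content: (
    | { type: "text"; text: string }
    | { type: "image_url"; image_url: { url: string } }
  )[];
};

function buildVisionRequest(base64Image: string, clinicalPrompt: string) {
  const messages: VisionMessage[] = [
    {
      role: "user",
      content: [
        { type: "text", text: clinicalPrompt },
        {
          // Images are passed to the API as a data URL.
          type: "image_url",
          image_url: { url: `data:image/jpeg;base64,${base64Image}` },
        },
      ],
    },
  ];
  return { model: "gpt-4o", messages, max_tokens: 1024 };
}
```

A serverless API route would pass this object to the OpenAI client and return the model's structured report to the frontend.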
The risk score for each condition is modeled as a weighted combination of detected biomarker signals. For a condition $C$ with $n$ contributing biomarkers $b_1, b_2, \ldots, b_n$ and corresponding weights $w_1, w_2, \ldots, w_n$:
$$R(C) = \frac{\sum_{i=1}^{n} w_i \cdot b_i}{\sum_{i=1}^{n} w_i} \times 100$$
where $b_i \in [0, 1]$ encodes the presence and severity of biomarker $i$, and $R(C) \in [0, 100]$ is the normalized risk percentage for condition $C$.
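As a concrete sketch of this formula (the biomarker names and weights below are illustrative, not the app's actual values):

```typescript
// Minimal implementation of the weighted risk score R(C) above.
interface Biomarker {
  value: number;  // b_i in [0, 1]: presence/severity of the biomarker
  weight: number; // w_i: relative contribution to this condition
}

// R(C) = (sum of w_i * b_i / sum of w_i) * 100
function riskScore(biomarkers: Biomarker[]): number {
  const totalWeight = biomarkers.reduce((s, b) => s + b.weight, 0);
  if (totalWeight === 0) return 0; // no biomarkers -> no measurable risk
  const weighted = biomarkers.reduce((s, b) => s + b.weight * b.value, 0);
  return (weighted / totalWeight) * 100;
}

// A hypothetical DR profile: microaneurysms strongly present,
// hemorrhages mild, hard exudates absent.
const drRisk = riskScore([
  { value: 0.9, weight: 3 }, // microaneurysms
  { value: 0.4, weight: 2 }, // hemorrhages
  { value: 0.0, weight: 1 }, // hard exudates
]);
```

Because the score is normalized by the total weight, it stays in [0, 100] regardless of how many biomarkers contribute.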
How We Built It
| Layer | Technology |
|---|---|
| Frontend | Next.js 14, Tailwind CSS, TypeScript |
| AI Backend | OpenAI GPT-4o Vision API |
| Image Handling | Browser FileReader API → Base64 → multipart POST |
| Deployment | Vercel (serverless, edge-optimized) |
| Environment | Vercel Environment Variables (production secrets) |
We deliberately kept the architecture lean and serverless. There are no databases, no user accounts, and no stored images — every analysis is stateless and ephemeral, which means zero patient data is ever persisted, an important ethical consideration when working with medical imagery.
The prompt engineering was the most technically intensive part of the project. We iterated through over a dozen prompt versions to get GPT-4o to return structured, clinically grounded output rather than generic disclaimers. The final prompt instructs the model to act as an ophthalmology screening assistant, enumerate specific findings, assign confidence levels, and format output in a consistent schema our frontend could parse reliably.
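The "consistent schema our frontend could parse reliably" step can be sketched as a defensive parser. The report shape and field names below are hypothetical assumptions; models also sometimes wrap JSON output in markdown fences, which the parser strips first:

```typescript
// Hypothetical shape of the structured report the prompt asks GPT-4o to
// emit; field names here are illustrative, not the exact production schema.
interface Finding {
  biomarker: string;
  present: boolean;
  confidence: "low" | "medium" | "high";
}

interface RiskReport {
  findings: Finding[];
  risks: Record<string, number>; // condition name -> R(C) in [0, 100]
  imageQualityIssues: string[];
}

function parseReport(raw: string): RiskReport | null {
  // Strip an optional markdown code fence around the JSON body.
  const stripped = raw
    .replace(/^\s*`{3}(?:json)?\s*/, "")
    .replace(/`{3}\s*$/, "")
    .trim();
  try {
    const data = JSON.parse(stripped);
    // Validate only the fields the frontend actually relies on.
    if (!Array.isArray(data.findings) || data.risks == null) return null;
    return data as RiskReport;
  } catch {
    return null; // malformed output falls back to an error state in the UI
  }
}
```

Returning `null` instead of throwing lets the frontend render a "please retry" state rather than crashing on a malformed model response.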
Challenges We Faced
1. The Hugging Face API retirement. Our original pipeline used a legacy Hugging Face inference endpoint for retinal classification. Mid-build, we discovered the endpoint had been deprecated. We pivoted entirely to GPT-4o Vision within a couple of hours — a forced architectural change that ultimately produced far richer, more interpretable output.
2. Environment variable management on Vercel. Our `.env.local` file is correctly gitignored, but this meant our API key was absent from the production deployment. We solved this by provisioning `OPENAI_API_KEY` directly through Vercel's environment variable system, keeping secrets out of version control entirely.
3. Git author mismatch on deployment. The Vercel CLI rejected our deployment because the local Git author identity didn't match the authenticated Vercel account. We resolved this by correcting the Git author config and amending the commit to update the author metadata before redeploying.
4. Prompt hallucination and over-confidence. Early versions of our prompt caused GPT-4o to return highly confident diagnoses on obviously synthetic or low-quality test images. We added explicit uncertainty scaffolding to the prompt — instructing the model to flag image quality issues and express findings probabilistically rather than definitively. This made the output more honest and more clinically appropriate.
5. Scope of claims vs. scope of a hackathon. The hardest non-technical challenge was being honest about what this tool is and isn't. RetinaScan is a screening aid, not a diagnostic device. We made sure every layer of the UI communicates this clearly. Building something that gestures at real clinical utility without overclaiming is a design discipline we take seriously.
What We Learned
- The retina really is a window into systemic health. The academic literature on retinal biomarkers for cardiovascular and neurological disease is deeper and more compelling than we expected going in.
- Prompt engineering is software engineering. Structured output, uncertainty quantification, and clinical framing all required the same rigor as writing production code.
- Stateless, privacy-first architecture is not just ethical — it's simpler. By never storing images, we eliminated an entire class of security and compliance concerns before they could arise.
- Pivoting fast is a superpower. Losing our primary model API mid-hackathon was stressful, but shipping a better product because of it was a reminder that constraints drive creativity.
What's Next
$$\text{Risk}(t) = R_0 \cdot e^{\lambda t}$$
If biomarker severity compounds over time at rate $\lambda$, then detecting disease at $t_0$ rather than $t_1 > t_0$ means intervening when accumulated risk is lower by a factor of $e^{\lambda (t_1 - t_0)}$, a gap that widens exponentially with every year of delay. This is the core public health argument for RetinaScan at scale.
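A quick worked example of this compounding argument, using a purely illustrative growth rate:

```typescript
// Ratio of accumulated risk between a late detection at t1 and an early
// detection at t0, under the exponential model Risk(t) = R0 * e^(lambda*t).
// The lambda value below is illustrative, not an empirical estimate.
function riskRatio(lambda: number, t0: number, t1: number): number {
  return Math.exp(lambda * (t1 - t0));
}

// At a hypothetical lambda of 0.3 per year, a five-year screening delay
// multiplies accumulated risk by e^1.5, roughly 4.5x.
const fiveYearDelay = riskRatio(0.3, 0, 5);
```

The absolute numbers matter less than the shape: under any positive $\lambda$, the cost of delayed screening grows exponentially, not linearly.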
Future directions include fine-tuning a vision model on labeled fundus datasets (EyePACS, APTOS 2019), integrating longitudinal tracking so patients can monitor change over time, and exploring deployment on low-cost smartphone-attached fundus cameras for community health clinics in underserved regions.
The eye is not just a sensory organ. It is a diagnostic surface — and we've barely begun to read it.
Built at CareTechHacks 2026.
Built With
- html5
- lucide
- next.js
- node.js
- openai-gpt-4o-vision-api
- react
- retfound
- shadcn/ui
- sharp
- tailwind-css
- typescript
- vercel