Inspiration

Heart disease remains the leading cause of death worldwide, yet many risk prediction tools are overly simplistic, biased, or difficult to interpret. We wanted to build something that prioritizes public health impact and responsible AI. Instead of chasing high accuracy alone, we focused on transparency, calibration, and ethical risk communication, ensuring that users receive realistic, explainable insights rather than misleading probabilities.

What it does

RiskRight is an explainable heart disease risk assessment tool built on validated UCI clinical data. Users input health information such as age, BMI-related indicators, blood pressure, cholesterol, and lifestyle factors. The system then provides:

- A cohort-based model probability
- A population-adjusted risk score
- A clear risk band (Low / Moderate / High)
- A binary risk flag (Higher / Lower risk)
- Feature-level explanations and actionable recommendations

The goal is not diagnosis, but early awareness and informed decision-making.
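The band and flag outputs described above amount to a simple threshold mapping over the calibrated probability. A minimal sketch, with illustrative cutoffs (0.10 and 0.25 are assumptions for this example, not RiskRight's actual thresholds):

```python
def risk_band(prob: float, moderate_cut: float = 0.10, high_cut: float = 0.25):
    """Map a calibrated probability to a risk band and a binary flag.

    The cutoffs are illustrative placeholders; a real deployment would
    choose conservative thresholds validated against clinical guidance.
    """
    if prob >= high_cut:
        band = "High"
    elif prob >= moderate_cut:
        band = "Moderate"
    else:
        band = "Low"
    flag = "Higher risk" if prob >= moderate_cut else "Lower risk"
    return band, flag
```

Keeping the band logic as a pure function like this makes it trivial to audit and to adjust thresholds without touching the model.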

How we built it

Dataset Evaluation & Selection

- Initially tested a synthetic health dataset but identified severe class imbalance.
- Rejected a secondary dataset due to duplicate entries and credibility concerns.
- Ultimately selected the validated UCI Heart Disease dataset.

Data Preprocessing

- Removed duplicate rows to prevent metric inflation.
- Verified class balance and feature consistency.
- Engineered a clean training pipeline.

Modeling & Calibration

- Trained a supervised classification model.
- Implemented population-adjusted probability scaling.
- Created conservative risk bands for safer interpretation.

Explainability & Interface

- Added feature importance insights.
- Built a Streamlit app for real-time interaction.
- Designed user-friendly language and risk messaging.
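The deduplication step in the preprocessing stage matters because exact duplicate rows can land the same patient in both the train and test splits, inflating every metric. In practice pandas' `DataFrame.drop_duplicates()` does this in one call; a dependency-free sketch of the same idea:

```python
def drop_duplicate_rows(rows):
    """Keep only the first occurrence of each exact row, preserving order.

    Duplicates that survive into a train/test split leak information
    across the split and falsely boost evaluation metrics.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # rows must be hashable to dedupe; tuples are
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Running this before the split (rather than after) is what prevents the leakage.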

Challenges we ran into

- Severe class imbalance in our first dataset, leading to skewed predictions.
- Duplicate row inflation in an alternative dataset, falsely boosting performance.
- Overconfident probabilities for healthy users.
- Translating technical risk outputs into responsible, non-alarming language.

Balancing model accuracy with ethical communication was one of our biggest challenges.
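One standard remedy for overconfident probabilities is prior correction: rescale the model's odds from the training prevalence (UCI-style clinical cohorts are often near-balanced) down to a realistic population prevalence. This is a sketch of how such a population-adjusted layer can work, not necessarily RiskRight's exact formula:

```python
def adjust_to_population(p, train_prev, pop_prev):
    """Rescale a model probability on the odds scale so it reflects the
    population prevalence rather than the (often higher) cohort prevalence.
    """
    if p <= 0.0 or p >= 1.0:
        return p  # boundary probabilities have no finite odds to rescale
    # adjusted odds = model odds * (population prior odds / training prior odds)
    odds = (p / (1 - p)) * (pop_prev / (1 - pop_prev)) * ((1 - train_prev) / train_prev)
    return odds / (1 + odds)
```

With an illustrative 50% cohort prevalence and 10% population prevalence, a raw model probability of 0.5 shrinks to 0.1, which is exactly the kind of tempering that keeps healthy users from seeing alarming numbers.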

Accomplishments that we're proud of

- Identifying and correcting dataset bias instead of ignoring inflated metrics.
- Implementing automatic deduplication in the pipeline.
- Building a population-adjusted calibration layer.
- Creating an explainable, beginner-friendly interface.
- Delivering a full production-ready Streamlit app in hackathon time.

We didn't just build a classifier; we built a responsible health risk framework.

What we learned

- Clean data matters more than complex models.
- Metrics can be misleading without auditing for imbalance and duplication.
- Calibration is crucial in healthcare predictions.
- Responsible AI requires thoughtful communication, not just technical skill.

Most importantly, we learned that accuracy alone does not equal trust.

What's next for RiskRight

- Advanced calibration methods (e.g., isotonic regression).
- Personalized lifestyle intervention recommendations.
- A mobile-friendly deployment for broader accessibility.
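For context on the first item: isotonic regression fits a nondecreasing step function from model scores to observed outcome rates using the pool-adjacent-violators algorithm. In practice scikit-learn's `IsotonicRegression` (or `CalibratedClassifierCV(method="isotonic")`) is the usual route; this dependency-free sketch shows the core merging step:

```python
def pav_calibrate(scores, labels):
    """Pool-adjacent-violators: returns (sorted_scores, calibrated_probs),
    a nondecreasing mapping from model score to empirical probability."""
    pairs = sorted(zip(scores, labels))
    xs = [s for s, _ in pairs]
    stack = []  # each block is [label_sum, count]; block mean = calibrated prob
    for _, y in pairs:
        stack.append([float(y), 1])
        # merge while the previous block's mean exceeds the current one's
        # (compare means by cross-multiplying to avoid division)
        while len(stack) > 1 and stack[-2][0] * stack[-1][1] > stack[-1][0] * stack[-2][1]:
            s, n = stack.pop()
            stack[-1][0] += s
            stack[-1][1] += n
    probs = []
    for s, n in stack:
        probs.extend([s / n] * n)
    return xs, probs
```

One caveat worth noting for a health app: isotonic regression can overfit on small datasets like UCI Heart Disease, so it is typically paired with cross-validated fitting.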
