EquiHER: AI for Gender-Equal Healthcare
Overview
EquiHER is an AI-powered diagnostic risk flagging system designed to support fairer medical decision-making.
Women are often diagnosed later or incorrectly because symptoms can present differently and many medical datasets historically contain male-dominant patient data.
EquiHER is designed as a clinical support tool, highlighting cases where additional diagnostic attention may be beneficial.
Inspiration
Women frequently experience delayed or incorrect diagnoses for conditions such as:
- cardiovascular disease
- autoimmune disorders
- neurological disorders
One contributing factor is that many medical datasets historically focused on male populations.
This project explores how machine learning can help identify cases where diagnostic oversight risk may be higher.
What It Does
EquiHER analyzes 15 clinical variables and predicts whether a patient may be at higher risk of diagnostic oversight.
If the predicted probability exceeds a threshold:
$$ P(\text{risk}) > \tau $$
the system flags the case for additional clinical review.
The current model achieves approximately 84–87% validation accuracy on the synthetic dataset described under Data Preparation.
Model Architecture
The neural network follows a fully connected architecture:
$$ 15 \rightarrow 128 \rightarrow 64 \rightarrow 32 \rightarrow 1 $$
Each hidden layer applies the transformation:
$$ h^{(l)} = \sigma(W^{(l)}h^{(l-1)} + b^{(l)}) $$
Where:
- \(h^{(l)}\) represents the activation of layer \(l\), with \(h^{(0)} = x\) the input vector
- \(W^{(l)}\) represents the weight matrix
- \(b^{(l)}\) represents the bias vector
- \(\sigma\) represents the activation function (ReLU)
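The layer equation above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the helper names (`init_params`, `affine`, `relu`, `forward_logit`) and the random initialization scheme are assumptions, and only the layer sizes and the ReLU activation come from the description above.

```python
import random

LAYER_SIZES = [15, 128, 64, 32, 1]  # 15 -> 128 -> 64 -> 32 -> 1, as described above


def init_params(sizes, seed=0):
    """Small random weights and zero biases (initialization scheme is an assumption)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((W, b))
    return params


def affine(W, b, h):
    """Compute W h + b for one layer."""
    return [sum(w * x for w, x in zip(row, h)) + b_j for row, b_j in zip(W, b)]


def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, x) for x in v]


def forward_logit(params, x):
    """Apply ReLU hidden layers, then a final linear layer; returns the raw score z."""
    h = x
    for W, b in params[:-1]:
        h = relu(affine(W, b, h))
    W, b = params[-1]
    return affine(W, b, h)[0]
```

The function returns the raw score of the final linear layer rather than a probability, so the sigmoid step described under Prediction Function can be applied separately.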
Prediction Function
The final output is converted into a probability using the sigmoid function:
$$ \hat{y} = \frac{1}{1 + e^{-z}} $$
where \(z\) is the output of the final linear layer applied to the last hidden activation \(h\):
$$ z = Wh + b $$
This produces a probability between 0 and 1.
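As a small sketch of the sigmoid step, the function below evaluates the formula in a way that avoids overflow for large negative inputs. The branching on the sign of `z` is a common numerical precaution added here as an assumption; it is not something the project description specifies.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), written to avoid overflow in exp()."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, the equivalent form e^{z} / (1 + e^{z}) keeps exp() small.
    e = math.exp(z)
    return e / (1.0 + e)
```

A score of `z = 0` maps to 0.5, and larger scores map to probabilities closer to 1.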
Training Objective
The model is trained using binary cross-entropy loss:
$$ L = -\frac{1}{n} \sum_{i=1}^{n} \left[y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)\right] $$
Where:
- \(y_i\) is the true label
- \(\hat{y}_i\) is the predicted probability
- \(n\) is the number of samples
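The loss above can be computed directly from a list of labels and predicted probabilities. The sketch below is illustrative only: the function name `binary_cross_entropy` and the `eps` clipping (used to avoid taking the logarithm of zero) are assumptions, not details from the project.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over n samples.

    Predictions are clipped to [eps, 1 - eps] so log() stays finite
    (the clipping is an assumption added for numerical safety).
    """
    n = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / n
```

Confident, correct predictions give a loss near zero, while a prediction of 0.5 for every sample gives a loss of \(\log 2\).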
Optimization
Model parameters are updated using gradient descent:
$$ \theta_{t+1} = \theta_t - \eta \nabla L(\theta_t) $$
Where:
- \(\theta_t\) represents the model parameters at step \(t\)
- \(\eta\) represents the learning rate
- \(\nabla L(\theta_t)\) represents the gradient of the loss function evaluated at \(\theta_t\)
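To make the update rule concrete, here is a minimal sketch that applies it to a toy quadratic loss rather than to the network itself. The names `gd_step`, `toy_loss_grad`, and `run_toy_descent`, the loss \(L(\theta) = \sum_k (\theta_k - 3)^2\), and the learning rate of 0.1 are all illustrative assumptions.

```python
def gd_step(theta, grad, eta):
    """One gradient descent update: theta - eta * grad, element-wise."""
    return [t - eta * g for t, g in zip(theta, grad)]


def toy_loss_grad(theta):
    """Gradient of the toy loss L(theta) = sum_k (theta_k - 3)^2."""
    return [2.0 * (t - 3.0) for t in theta]


def run_toy_descent(theta, eta=0.1, steps=100):
    """Repeat the update rule for a fixed number of steps."""
    for _ in range(steps):
        theta = gd_step(theta, toy_loss_grad(theta), eta)
    return theta
```

Starting from any point, repeated updates move each parameter toward 3, the minimizer of the toy loss. In the real model the gradient would come from backpropagating the binary cross-entropy loss through the network.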
Data Preparation
Because real medical datasets are difficult to access due to privacy restrictions, a synthetic dataset was generated with realistic clinical ranges and correlations.
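One simple way to produce correlated synthetic features within fixed ranges is sketched below. It is not the generator used for this project: the function names, the use of a shared Gaussian draw to induce correlation, and any ranges passed in are hypothetical placeholders.

```python
import math
import random


def to_range(z, low, high):
    """Map a standard-normal draw so that +/-3 sd spans [low, high], then clip."""
    mid, half = (low + high) / 2.0, (high - low) / 2.0
    return min(high, max(low, mid + half * z / 3.0))


def synthetic_pair(n, rho, range_a, range_b, seed=0):
    """Generate n rows of two features with correlation close to rho.

    The second draw mixes the first draw with independent noise, which
    gives correlation rho before the values are rescaled and clipped.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        za = rng.gauss(0.0, 1.0)
        zb = rho * za + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        rows.append((to_range(za, *range_a), to_range(zb, *range_b)))
    return rows
```

For example, `synthetic_pair(1000, 0.6, (40, 90), (90, 180))` yields 1000 rows of two positively correlated values; the ranges here are placeholder numbers, not clinical reference values.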
Feature normalization was applied using:
$$ x' = \frac{x - \mu}{\sigma} $$
Where:
- \(\mu\) represents the feature mean
- \(\sigma\) represents the feature standard deviation (distinct from the activation function \(\sigma\) used under Model Architecture)
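The normalization formula can be sketched as a fit step and an apply step. The function names are hypothetical, and two choices here are assumptions rather than details from the project: the population standard deviation (dividing by \(n\)) is used, and a zero standard deviation falls back to 1.0 to avoid division by zero.

```python
import math


def fit_standardizer(rows):
    """Compute per-feature mean and standard deviation from a list of rows."""
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # Constant features have zero spread; fall back to 1.0 (assumption).
        stds.append(math.sqrt(var) or 1.0)
    return means, stds


def standardize(row, means, stds):
    """Apply x' = (x - mu) / sigma to each feature of one row."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]
```

In practice the means and standard deviations would typically be computed on the training split only and then reused for validation data and new patients.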
Challenges
Data Availability
Medical datasets are highly restricted due to patient privacy regulations, making synthetic data generation necessary.
Overfitting
Preventing overfitting required tuning hyperparameters such as:
- learning rate
- batch size
- model depth
Clinical Usability
The goal was to design a system that supports clinicians rather than replacing them.
Impact
EquiHER contributes toward the following United Nations Sustainable Development Goals:
SDG 3.8 — Universal Health Coverage
Improving healthcare quality through AI-assisted clinical decision support.
SDG 5 — Gender Equality
Reducing gender bias in medical decision-making.
Example Prediction Workflow
- Patient clinical values are entered
- Data is normalized
- The neural network computes:
$$ \hat{y} = f(x_1, x_2, \ldots, x_{15}) $$
- A diagnostic risk score is generated
If
$$ \hat{y} > 0.7 $$
the case is flagged for additional clinical review.
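The workflow above can be sketched end to end as a single function. Everything named here is hypothetical: `review_case`, the result keys, and especially `placeholder_model`, which stands in for the trained network and simply applies a sigmoid to the mean of the normalized features. Only the 15 inputs, the normalization step, and the 0.7 threshold come from the description above.

```python
import math

THRESHOLD = 0.7  # flagging threshold from the workflow above


def normalize(values, means, stds):
    """Apply z-score normalization with precomputed statistics."""
    return [(x - m) / s for x, m, s in zip(values, means, stds)]


def placeholder_model(features):
    """Hypothetical stand-in for the trained network (NOT the real model)."""
    z = sum(features) / len(features)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def review_case(values, means, stds, model=placeholder_model, threshold=THRESHOLD):
    """Normalize 15 clinical values, score them, and flag if score > threshold."""
    if len(values) != 15:
        raise ValueError("expected 15 clinical values")
    score = model(normalize(values, means, stds))
    return {"risk_score": score, "flagged": score > threshold}
```

Because the comparison is strict, a score of exactly 0.7 is not flagged, matching the inequality above. Swapping in the real network only requires passing a different `model` callable.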