🧩 Where it started

Alzheimer’s detection still relies heavily on expensive tools like MRI or PET scans.

EEG offers a cheaper, non-invasive alternative - but the signal is noisy, and datasets are small.

That’s what made this challenge interesting.


⚙️ What we built

We built a model that classifies EEG recordings as Alzheimer’s Disease (AD) or Cognitively Normal (CN).

Instead of jumping straight into deep learning, we focused on extracting meaningful patterns:

  • brain wave frequency bands (delta → gamma)
  • statistical and spectral features
  • clinically relevant ratios (like theta/alpha)
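
To make the feature idea concrete, here is a minimal sketch of relative band power and the theta/alpha ratio for a single-channel window. The band edges, the plain periodogram, and the names `BANDS`, `band_powers`, and `theta_alpha_ratio` are illustrative assumptions, not necessarily what our pipeline used.

```python
import numpy as np

# Conventional EEG band edges in Hz (assumed; exact cut-offs vary between studies).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_powers(signal, fs):
    """Relative power per band for one single-channel window (plain periodogram)."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove the DC offset
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2  # unnormalised power spectrum
    # Normalise by the power inside the overall 0.5-45 Hz range.
    total = psd[(freqs >= 0.5) & (freqs < 45.0)].sum()
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = psd[mask].sum() / total
    return powers


def theta_alpha_ratio(powers):
    """Ratio of theta to alpha power, computed from band_powers() output."""
    return powers["theta"] / powers["alpha"]


# Synthetic check: a 6 Hz (theta) tone at twice the amplitude of a 10 Hz (alpha) tone.
fs = 128
t = np.arange(0, 30, 1 / fs)
demo = 2 * np.sin(2 * np.pi * 6 * t) + np.sin(2 * np.pi * 10 * t)
demo_powers = band_powers(demo, fs)
demo_ratio = theta_alpha_ratio(demo_powers)
```

A real recording would likely call for a smoother spectral estimate (for example Welch averaging) and per-channel handling; this version only shows the shape of the computation.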

🏗️ How we approached it

With only ~38 training subjects, complexity wasn’t the priority - validity was.

So we:

  • segmented signals into 30 s sliding windows with 15 s overlap
  • used XGBoost for fast iteration and interpretability
  • enforced Leave-One-Subject-Out cross-validation (LOSO-CV), so windows from the same subject never land in both the training and test folds
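
The segmentation and LOSO steps above can be sketched roughly as follows. This assumes a 1-D signal per subject and one subject ID per window; `sliding_windows` and `loso_splits` are hypothetical helper names, and the model fit (XGBoost in our case) is left out so the snippet only depends on NumPy.

```python
import numpy as np


def sliding_windows(signal, fs, win_s=30.0, overlap_s=15.0):
    """Split a 1-D signal into fixed-length windows; a trailing partial window is dropped."""
    win = int(round(win_s * fs))
    step = int(round((win_s - overlap_s) * fs))
    if win <= 0 or step <= 0:
        raise ValueError("window must be positive and longer than the overlap")
    signal = np.asarray(signal)
    starts = range(0, signal.size - win + 1, step)
    return np.array([signal[s:s + win] for s in starts]).reshape(-1, win)


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield subject, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy example: 60 s of signal at 10 Hz gives windows starting at 0 s, 15 s and 30 s.
toy_signal = np.arange(600)
toy_windows = sliding_windows(toy_signal, fs=10)

# One subject ID per window; every fold holds out all windows of exactly one subject.
window_subjects = ["a", "a", "b", "c", "c", "c"]
folds = list(loso_splits(window_subjects))
```

Splitting by subject rather than by window is the point: overlapping windows from one recording are highly correlated, so a window-level split would leak the held-out subject into training. scikit-learn's `LeaveOneGroupOut` gives the same folds if you prefer a library splitter.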

Every prediction was reported at the subject level, not for individual signal snippets.
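
One way to get from window-level outputs to a subject-level call is to average the per-window AD probabilities and threshold once per subject. The sketch below assumes exactly that; the averaging rule, the 0.5 threshold, and the function names are illustrative (majority voting over windows is an equally plausible choice).

```python
from collections import defaultdict


def subject_level_predictions(subject_ids, window_probs, threshold=0.5):
    """Average per-window AD probabilities per subject, then threshold once (1 = AD, 0 = CN)."""
    grouped = defaultdict(list)
    for subject, prob in zip(subject_ids, window_probs):
        grouped[subject].append(float(prob))
    return {
        subject: int(sum(probs) / len(probs) >= threshold)
        for subject, probs in grouped.items()
    }


def subject_accuracy(predictions, labels):
    """Fraction of subjects whose aggregated prediction matches their label."""
    correct = sum(predictions[s] == labels[s] for s in predictions)
    return correct / len(predictions)


# Toy example: s1 averages 0.6 (called AD), s2 averages 0.3 (called CN).
ids = ["s1", "s1", "s1", "s2", "s2"]
probs = [0.9, 0.6, 0.3, 0.2, 0.4]
preds = subject_level_predictions(ids, probs)
acc = subject_accuracy(preds, {"s1": 1, "s2": 1})
```

Scoring per subject keeps the metric honest: a subject with many windows does not count more than one with few, and the number reflects how many people were classified correctly.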


⚠️ Where it got difficult

The hardest part wasn’t building the model - it was trusting the results.

  • extremely small dataset
  • high risk of overfitting
  • class imbalance
  • misleading validation if not done properly

We constantly had to ask:

“Is this real performance… or are we fooling ourselves?”


🏆 What we’re proud of

  • achieving ~81.6% subject-level accuracy on a very limited dataset
  • building a pipeline that avoids data leakage end-to-end
  • balancing performance with interpretability (important for clinical context)
  • exploring both classical ML and deep learning approaches

🧠 What we learned

  • evaluation strategy matters more than model choice on small datasets
  • deep learning isn’t always the answer - simpler models can outperform when data is limited
  • domain knowledge (EEG biomarkers) is critical for feature design
  • “good results” mean nothing without proper validation

🚀 What’s next for MLers

(Lasmar, Maésha, Richard, Ahmed)

  • scaling the approach with larger datasets
  • improving generalization and robustness
  • exploring hybrid models (XGBoost + deep learning ensembles)
  • pushing further into AI for healthcare applications
