This repository contains my coursework and hands-on projects completed as part of the eCornell Machine Learning Foundations course. It showcases the full Machine Learning Life Cycle using real-world datasets and Python-based tools.
The goal of this portfolio is to demonstrate proficiency in core machine learning workflows—ranging from data preprocessing to model deployment—while applying ethical and interpretable practices. Projects include work in classification, regression, and deep learning using libraries like pandas, scikit-learn, and Keras.
- Gain experience implementing end-to-end ML pipelines
- Practice comparing algorithms using metrics such as F1-score, accuracy, and RMSE
- Understand data preparation, model training, and evaluation in Python
- Build reproducible and interpretable ML solutions
- Apply algorithmic fairness and responsible AI principles
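As a rough illustration of the metrics named in the objectives above, the sketch below computes accuracy, F1-score, and RMSE in plain Python. The function names and the toy label lists are made up for this example and are not taken from the course notebooks; in practice the equivalent functions in scikit-learn's `sklearn.metrics` module would normally be used.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmse(y_true, y_pred):
    """Root mean squared error for regression targets."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data (illustrative only, not results from the projects).
labels_true = [1, 0, 1, 1, 0, 1]
labels_pred = [1, 0, 0, 1, 1, 1]
values_true = [3.0, 5.0, 2.0]
values_pred = [2.0, 5.0, 4.0]

print(round(accuracy(labels_true, labels_pred), 3))  # 0.667
print(round(f1_score(labels_true, labels_pred), 3))  # 0.75
print(round(rmse(values_true, values_pred), 3))      # 1.291
```

Accuracy and F1 apply to classification, while RMSE applies to regression, so which metric is appropriate depends on the project.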
Each project follows the machine learning life cycle:
- Business Understanding – Define the ML problem in context
- Data Understanding – Explore datasets using visualizations and summary stats
- Data Preparation – Clean, encode, and scale data
- Modeling – Train and evaluate models (e.g., Logistic Regression, k-NN, Decision Trees, CNN)
- Evaluation – Analyze performance using appropriate metrics
- Deployment (optional) – Save models using pickle for future inference
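The deployment step above can be sketched roughly as follows. This is a minimal example using only the standard library: the `model` dictionary and the `predict` helper are hypothetical stand-ins for a fitted model, and the file name `model.pkl` is an assumption. The same `pickle.dump` and `pickle.load` calls work on a fitted scikit-learn estimator.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: learned parameters in a dict.
model = {"weights": [0.4, -1.2, 0.7], "bias": 0.1}


def predict(params, features):
    """Linear decision rule: class 1 if the weighted sum is non-negative."""
    score = params["bias"] + sum(
        w * x for w, x in zip(params["weights"], features)
    )
    return 1 if score >= 0 else 0


# Save the model to disk (a temporary directory is used for the demo).
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later, for inference: load the model back and reuse it.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

sample = [1.0, 0.5, 2.0]
print(predict(restored, sample))  # 1
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.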
- Compared classification models (k-NN and Decision Trees) on Airbnb data, identifying k-NN as the most accurate, with 85% accuracy and low bias.
- Built a CNN for digit classification achieving over 98% accuracy on MNIST.
- Evaluated logistic regression models implemented from scratch and with scikit-learn.
- Performed model selection using cross-validation and hyperparameter tuning.
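The model comparison and tuning described above might look roughly like the sketch below. It uses scikit-learn's `GridSearchCV` on synthetic data from `make_classification`; the dataset, the parameter grids, and the `candidates` and `results` names are assumptions for illustration and do not reproduce the Airbnb experiments or their scores.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in dataset (not the course data).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Each candidate pairs an estimator with a small hyperparameter grid.
# k-NN is distance-based, so it is wrapped in a pipeline that scales features.
candidates = {
    "knn": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 11]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 8]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # 5-fold cross-validation over every combination in the grid.
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)
    print(name, search.best_params_, round(search.best_score_, 3))
```

Putting the scaler inside the pipeline means it is refit on each training fold, which avoids leaking information from the validation fold into preprocessing.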
Project notebooks include:
- Confusion matrices
- ROC curves
- Data distribution plots
- Accuracy/loss training curves (in CNN project)
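To show what sits behind the first two plot types listed above, here is a small plain-Python sketch that builds a confusion matrix and the points of an ROC curve, without the plotting step. The function names, the toy labels and scores, and the 0.5 decision threshold are illustrative assumptions; the notebooks themselves may rely on scikit-learn and matplotlib for this.

```python
def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def roc_points(y_true, scores):
    """(FPR, TPR) pairs, one per distinct threshold, strictest first.

    Assumes y_true contains at least one positive and one negative label.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy labels and predicted probabilities (illustrative only).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

matrix = confusion_matrix(y_true, y_pred)
points = roc_points(y_true, scores)
print(matrix)       # [[2, 0], [1, 1]]
print(points)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(points))  # 0.75
```

Plotting the `points` list with FPR on the x-axis and TPR on the y-axis gives the ROC curve; the confusion matrix corresponds to a single threshold on that curve.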
- Explore deployment using Flask or Streamlit
- Add more datasets to test scalability of workflows
- Perform explainability analysis using SHAP or LIME
- Refactor code into reusable functions or class structures
All code in this repository was written and documented by me, Camila Lightfoot, as part of the eCornell course. I applied principles learned in lectures and labs, and extended them to improve the robustness and interpretability of models.
To run these projects locally:
- Clone this repository:
git clone https://github.com/CamilaLightfoot/My-eCornell-Portfolio-ML-lifecycle-projects.git
cd My-eCornell-Portfolio-ML-lifecycle-projects
Option 1: Run in Jupyter Online (Recommended for Simplicity)
You can view and run this project directly in a web browser using Jupyter environments:
- Try it in JupyterLite or GitHub Codespaces
- Or visit Jupyter Notebook Viewer (NBViewer) to explore without setup
Option 2: Run in Google Colab
You can run this project directly in Google Colab without installing anything locally.
Camila Lightfoot
Computer Science Student at George Mason University
AI Studio Fellow | Break Through Tech
Connect on LinkedIn: https://www.linkedin.com/in/camilalightfoot/
GitHub Profile: https://github.com/CamilaLightfoot
This project is for educational purposes only. Not licensed for commercial use.