My eCornell Portfolio – ML Lifecycle Projects

This repository contains my coursework and hands-on projects completed as part of the eCornell Machine Learning Foundations course. It showcases the full Machine Learning Life Cycle using real-world datasets and Python-based tools.

Overview

The goal of this portfolio is to demonstrate proficiency in core machine learning workflows, from data preprocessing to model deployment, while applying ethical and interpretable practices. The projects cover classification, regression, and deep learning using pandas, scikit-learn, and Keras.

Objectives and Goals

  • Gain experience implementing end-to-end ML pipelines
  • Practice comparing algorithms using metrics such as F1-score, accuracy, and RMSE
  • Understand data preparation, model training, and evaluation in Python
  • Build reproducible and interpretable ML solutions
  • Apply algorithmic fairness and responsible AI principles
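For readers less familiar with the metrics named above, here is a minimal plain-Python sketch of how accuracy, F1-score, and RMSE are computed. It is illustrative only and is not taken from the project notebooks: the function names and the toy labels are assumptions made for this example. In practice, scikit-learn's `accuracy_score`, `f1_score`, and `mean_squared_error` would normally be used instead.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def rmse(y_true, y_pred):
    """Root mean squared error for regression targets."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Toy data, made up for illustration.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
clf_accuracy = accuracy(labels, preds)
clf_f1 = f1(labels, preds)
reg_rmse = rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
```

Accuracy suits balanced classification problems, F1-score is more informative when the positive class is rare, and RMSE applies to regression targets.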

Methodology

Each project follows the machine learning life cycle:

  1. Business Understanding – Define the ML problem in context
  2. Data Understanding – Explore datasets using visualizations and summary stats
  3. Data Preparation – Clean, encode, and scale data
  4. Modeling – Train and evaluate models (e.g., Logistic Regression, k-NN, Decision Trees, CNN)
  5. Evaluation – Analyze performance using appropriate metrics
  6. Deployment (optional) – Save models using pickle for future inference
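The optional deployment step can be sketched as follows. This is a minimal example under stated assumptions, not code from the notebooks: the artifact is a hypothetical dictionary of scaling statistics and linear coefficients standing in for a fitted model, and the feature names and numbers are invented. A fitted scikit-learn estimator can be saved and loaded with `pickle.dump` and `pickle.load` in the same way.

```python
import os
import pickle
import tempfile

# Hypothetical fitted artifact: scaling statistics from Data Preparation
# and coefficients from Modeling. All values are made up for illustration.
artifact = {
    "feature_names": ["price", "reviews"],
    "feature_min": [20.0, 0.0],
    "feature_max": [500.0, 300.0],
    "weights": [1.5, -0.8],
    "bias": 0.1,
}

def scale(row, mins, maxs):
    """Min-max scale one row of raw features into the [0, 1] range."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(row, mins, maxs)]

def predict(model, row):
    """Linear score on the scaled features, thresholded at zero."""
    scaled = scale(row, model["feature_min"], model["feature_max"])
    score = sum(w * x for w, x in zip(model["weights"], scaled)) + model["bias"]
    return 1 if score > 0 else 0

# Save the artifact to disk, then load it back as a later session would.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(artifact, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

sample = [260.0, 30.0]
prediction = predict(restored, sample)
```

Storing the preprocessing statistics together with the model keeps inference consistent with training. Pickle files can execute arbitrary code when loaded, so only unpickle files from a trusted source.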

Results & Key Findings

  • Compared k-NN and Decision Tree classifiers on Airbnb data; k-NN was the more accurate of the two, with 85% accuracy and low bias.
  • Built a CNN for digit classification achieving over 98% accuracy on MNIST.
  • Evaluated logistic regression models built from scratch and with scikit-learn.
  • Performed model selection using cross-validation and hyperparameter tuning.
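To illustrate what "from scratch" means for logistic regression, here is a minimal sketch using batch gradient descent in plain Python. It is an assumption-laden example rather than the notebook implementation: the function names, learning rate, epoch count, and one-feature toy dataset are all chosen for this illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log loss w.r.t. the linear score is (p - y).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n_samples for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b

def predict(X, w, b):
    """Predict class 1 when the estimated probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]

# Toy, linearly separable data with a single feature (made up).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
train_preds = predict(X, w, b)
```

On this toy data the learned decision boundary, `-b / w[0]`, falls between the two classes. scikit-learn's `LogisticRegression` adds regularization and faster solvers, so its coefficients will generally differ from an unregularized fit like this one.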

Visualizations

Project notebooks include:

  • Confusion matrices
  • ROC curves
  • Data distribution plots
  • Accuracy/loss training curves (in CNN project)
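As background for the ROC curves listed above, the following minimal sketch shows how the points of an ROC curve and its area (AUC) can be computed from predicted scores. It is written in plain Python for illustration and is not taken from the notebooks; the function names and the four-sample dataset are assumptions. scikit-learn provides `roc_curve` and `roc_auc_score` for real use.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    # Highest threshold first, so the curve runs from (0, 0) to (1, 1).
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Toy labels and classifier scores, made up for illustration.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
area = auc(curve)  # 0.75 for this toy data
```

Plotting the false positive rate against the true positive rate from `curve` gives the ROC curve; an area of 0.5 corresponds to random ranking and 1.0 to perfect ranking.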

Next Steps

  • Explore deployment using Flask or Streamlit
  • Add more datasets to test scalability of workflows
  • Perform explainability analysis using SHAP or LIME
  • Refactor code into reusable functions or class structures

Individual Contributions

All code in this repository was written and documented by me, Camila Lightfoot, as part of the eCornell course. I applied principles learned in lectures and labs, and extended them to improve the robustness and interpretability of models.

Installation & How to Run

To run these projects locally:

  1. Clone this repository:
    git clone https://github.com/CamilaLightfoot/My-eCornell-Portfolio-ML-lifecycle-projects.git
    cd My-eCornell-Portfolio-ML-lifecycle-projects

Try It Online

Option 1: Run in Jupyter Online (recommended for simplicity). You can view and run this project directly in a web browser using a hosted Jupyter environment.

Option 2: Run in Google Colab. You can run this project directly on Google Colab without installing anything locally.

Contact

Camila Lightfoot
Computer Science Student at George Mason University
AI Studio Fellow | Break Through Tech

Connect on LinkedIn: https://www.linkedin.com/in/camilalightfoot/

GitHub Profile: https://github.com/CamilaLightfoot

License

This project is for educational purposes only. Not licensed for commercial use.
