Inspiration

Our team chose this project because we were passionate about using data to help businesses make informed decisions that could drive real-world impact. This Data Challenge presented an exciting opportunity to apply data science techniques to a problem for Uber: predicting which new driver signups would go on to become active drivers.

What it does

The final deliverable of our project is a business-oriented presentation for the Uber Driver’s team. We identified key factors that influence whether a driver will go from signing up to completing their first trip. Our findings are designed to help Uber improve its signup process and increase driver activation rates by showing which areas to focus on.

How we built it

To build our predictive model, we applied machine learning and data preprocessing techniques. We began by cleaning and transforming the data, creating new features that better captured time-related events. Then, we selected and trained various models to identify the most influential factors in the signup-to-driver conversion process.
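As an illustration of the kind of time-related feature described above, here is a minimal sketch in Python. The field names (`signup_date`, `bgc_date` for a background-check date) and the derived feature names are hypothetical placeholders, not the actual columns of the challenge dataset.

```python
from datetime import date
from typing import Optional


def days_between(start: Optional[date], end: Optional[date]) -> Optional[int]:
    """Return whole days from start to end, or None if either date is missing."""
    if start is None or end is None:
        return None
    return (end - start).days


def add_time_features(record: dict) -> dict:
    """Return a copy of record with derived time-related features added.

    The keys used here are assumptions made for illustration only.
    """
    out = dict(record)
    bgc_date = record.get("bgc_date")
    # How long the signup took to reach the (hypothetical) background-check step.
    out["days_to_bgc"] = days_between(record["signup_date"], bgc_date)
    # Day of week of the signup, with Monday == 0.
    out["signup_weekday"] = record["signup_date"].weekday()
    # Whether the step happened at all; a missing date can itself be a signal.
    out["completed_bgc"] = bgc_date is not None
    return out


example = add_time_features(
    {"signup_date": date(2016, 1, 4), "bgc_date": date(2016, 1, 9)}
)
```

Turning raw dates into elapsed days and "did this step happen" flags gives a model numeric inputs it can use directly, and keeps missing dates as information rather than discarding those rows.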

Challenges we ran into

One of the major challenges we faced was the skewed nature of the dataset: around 90% of the data consisted of drivers who did not complete their first trip. This imbalance made it difficult to train useful models, because a model that always predicts "no first trip" would already be about 90% accurate while identifying no active drivers at all. To address this, we employed resampling techniques, threshold adjustments, and ensemble methods to improve model performance and reduce bias towards the majority (non-driver) class.
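To make the idea of threshold adjustment concrete, the sketch below picks the probability cutoff that maximizes F1 instead of using the default 0.5. The scores and labels are made-up toy values with roughly the 90/10 split described above, and the helper names are hypothetical; this is not the team's actual tuning code.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score of the positive class when predicting 1 for prob >= threshold."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, y_prob, candidates=None):
    """Return the candidate threshold with the highest F1 (first one on ties)."""
    if candidates is None:
        candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
    return max(candidates, key=lambda t: f1_at_threshold(y_true, y_prob, t))


# Toy scores: 18 non-drivers and 2 drivers, all scored below 0.5.
neg_probs = [0.02, 0.03, 0.05, 0.05, 0.08, 0.10, 0.10, 0.12, 0.15,
             0.15, 0.18, 0.20, 0.22, 0.25, 0.28, 0.30, 0.33, 0.42]
pos_probs = [0.38, 0.45]
y_prob = neg_probs + pos_probs
y_true = [0] * len(neg_probs) + [1] * len(pos_probs)

default_f1 = f1_at_threshold(y_true, y_prob, 0.5)
tuned = best_threshold(y_true, y_prob)
tuned_f1 = f1_at_threshold(y_true, y_prob, tuned)
```

On this toy data the default cutoff flags nobody as a likely driver, while a lower cutoff recovers both positives at the cost of one false positive. In practice the threshold should be chosen on a validation set that is separate from the final test data.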

Accomplishments that we're proud of

We successfully overcame the class imbalance challenge by implementing effective resampling strategies and adjusting thresholds, leading to robust model performance.

What we learned

  • We gained hands-on experience in handling imbalanced datasets, experimenting with techniques like SMOTE and threshold adjustments to improve model performance on the minority class.
  • We learned how to select and apply the right predictive models for binary classification tasks, specifically focusing on models like Logistic Regression and XGBoost.
  • We also improved our ability to present data-driven findings in a way that resonates with non-technical stakeholders, honing our skills in communicating complex analysis in a business context.
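The core idea behind SMOTE, mentioned above, is to create synthetic minority-class points by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a simplified, pure-Python illustration of that idea under assumed toy inputs; it is not the imbalanced-learn implementation, and the function name and parameters are hypothetical.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    minority: list of equal-length numeric tuples (needs at least 2 points).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: four points at the corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority_points, n_new=6)
```

Because each synthetic point lies between two real minority samples, the oversampled class fills in its own region of feature space rather than duplicating rows. Resampling of this kind should be applied to the training split only, so that evaluation still reflects the original class balance.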

What's next for Predicting Driver Signups

We plan to reflect on what we learned during this project and apply it in future endeavours.
