Inspiration
Our team chose this project because we are passionate about using data to help businesses make informed decisions with real-world impact. This Data Challenge gave us the opportunity to apply data science techniques to a concrete problem for Uber: predicting which new driver signups would go on to become active drivers.
What it does
The final deliverable of our project is a business-oriented presentation for the Uber Driver’s team. We’ve identified key factors that influence whether a driver will go from signing up to completing their first trip. Our findings are designed to help Uber improve their signup process and increase driver activation rates by providing actionable insights on what areas to focus on.
How we built it
To build our predictive model, we combined data preprocessing with machine learning. We began by cleaning and transforming the data and creating new features that better captured time-related events in the signup process. We then trained several classifiers, including Logistic Regression and XGBoost, to identify the most influential factors in the signup-to-driver conversion process.
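As a rough illustration of what a time-related feature can look like, the sketch below turns raw event dates into elapsed-day counts and completion flags. The field names (`signup_date`, `background_check_date`, `vehicle_added_date`) and the example record are hypothetical, chosen only for illustration; they are not the columns or values of the challenge dataset.

```python
from datetime import date
from typing import Optional


def days_between(start: Optional[date], end: Optional[date]) -> Optional[int]:
    """Elapsed days from start to end, or None when either event is missing."""
    if start is None or end is None:
        return None
    return (end - start).days


def time_features(record: dict) -> dict:
    """Derive time-based features from one signup record (hypothetical field names)."""
    signup = record.get("signup_date")
    bgc = record.get("background_check_date")
    vehicle = record.get("vehicle_added_date")
    return {
        # Day of week of the signup, Monday = 0
        "signup_weekday": signup.weekday() if signup else None,
        # How long each onboarding step took after signup
        "days_to_background_check": days_between(signup, bgc),
        "days_to_vehicle_added": days_between(signup, vehicle),
        # Whether the step happened at all is often informative on its own
        "completed_background_check": bgc is not None,
        "added_vehicle": vehicle is not None,
    }


# Made-up record: background check done five days after signup, no vehicle yet
example = {
    "signup_date": date(2016, 1, 4),
    "background_check_date": date(2016, 1, 9),
    "vehicle_added_date": None,
}
features = time_features(example)
```

Keeping missing steps as `None` together with a separate boolean flag lets a model distinguish "never happened" from "happened quickly", which a single numeric column filled with zeros would blur.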
Challenges we ran into
One of the major challenges we faced was the skewed nature of the dataset: around 90% of the signups were drivers who did not complete their first trip. This imbalance made it difficult to train accurate models, since a model can score well simply by predicting that nobody converts. To address this, we employed resampling techniques, threshold adjustments, and ensemble methods to improve model performance and reduce bias towards the non-driver class.
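Two of these ideas, resampling and threshold adjustment, can be sketched in a few lines of plain Python. This is a minimal sketch rather than our actual pipeline: it uses simple random oversampling instead of SMOTE, picks the threshold by F1 score (one reasonable criterion among several), and all function names and toy numbers are made up for illustration.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("both classes must be present")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


def f1_at_threshold(scores, labels, threshold):
    """F1 score when every score >= threshold is predicted as the positive class."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, labels, candidates=None):
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    if candidates is None:
        candidates = [i / 100 for i in range(5, 100, 5)]
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


# Toy data with a 9:1 imbalance, mirroring the rough shape of the problem
rows = [[i] for i in range(10)]
labels = [0] * 9 + [1]
balanced_rows, balanced_labels = oversample_minority(rows, labels)

# Made-up predicted probabilities; the single positive scores below 0.5
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.45, 0.40]
threshold = best_threshold(scores, labels)
```

In the toy example the default cut-off of 0.5 never predicts the positive class, while a lower threshold recovers it. In practice, resampling should be applied to the training split only and the threshold chosen on a held-out validation set, so that the evaluation still reflects the real class balance.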
Accomplishments that we're proud of
We successfully overcame the class imbalance challenge by implementing effective resampling strategies and adjusting thresholds, leading to robust model performance.
What we learned
- We gained hands-on experience in handling imbalanced datasets, experimenting with techniques like SMOTE and threshold adjustments to improve model performance on the minority class.
- We learned how to select and apply the right predictive models for binary classification tasks, specifically focusing on models like Logistic Regression and XGBoost.
- We also improved our ability to present data-driven findings in a way that resonates with non-technical stakeholders, honing our skills in communicating complex analysis in a business context.
What's next for Predicting Driver Signups
We plan to reflect on what we learned during this project and apply it in future endeavours.