Counting Cars: Predicting Vehicle Population Estimation

Inspiration

Understanding the future of transportation is crucial for sustainable energy planning, infrastructure development, and environmental impact assessment. Predicting vehicle populations with precision allows policymakers and industry leaders to make data-driven decisions about fuel demand, emissions, and urban mobility. We were inspired by the challenge of leveraging AI and data science to bridge the gap between historical vehicle trends and future projections.

What It Does

Counting Cars predicts the future vehicle population based on historical data from 2019 to 2024. Using machine learning, we analyze trends in vehicle categories, fuel types, and sustainability metrics to estimate the composition of vehicles on the road in 2025. This helps stakeholders make informed decisions regarding energy use, transportation policies, and infrastructure investments.

How We Built It

We combined the initial dataset provided by Chevron with oil and gas price data from other sources. Our approach included:

  • Data Preprocessing: We handled missing values using statistical imputation.
  • Feature Engineering/Synthesis: We one-hot encoded the columns we treated as categorical. We also created a new column, vehicleAge, by subtracting the model year from the inventory year. We applied ordinal encoding to the "number of vehicles registered at the same address" column.
  • Machine Learning Models: We first established baseline metrics with basic Ridge and Lasso models, then trained a LightGBM model and compared its metrics against those baselines. We used Bayesian optimization to tune the LightGBM hyperparameters. All models used an 80/20 train/test split of the training data. We then ran our model on all of the scoring data to compute RMSE metrics.
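
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not our actual pipeline: the toy rows, the column names other than vehicleAge (inventoryYear, modelYear, fuelType, vehiclesAtAddress, fuelPrice, population), the "4+" category, and the choice of median/mode imputation are all hypothetical stand-ins. Only the Ridge baseline is shown; the Lasso and LightGBM models and the Bayesian optimization step are omitted for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical toy rows; the real Chevron schema and values are not shown here.
raw = pd.DataFrame({
    "inventoryYear": [2019, 2020, 2021, 2022, 2023, 2024, 2021, 2022, 2023, 2024],
    "modelYear": [2015, 2018, 2020, 2019, 2021, 2022, 2010, 2016, 2023, 2020],
    "fuelType": ["Gasoline", "Diesel", "Electric", "Gasoline", None,
                 "Electric", "Diesel", "Gasoline", "Electric", "Gasoline"],
    "vehiclesAtAddress": ["1", "2", "3", "4+", "1", "2", "4+", "3", "1", "2"],
    "fuelPrice": [2.6, 2.2, 3.0, np.nan, 3.5, 3.3, 3.0, 4.0, 3.5, 3.3],
    "population": [900, 400, 150, 800, 300, 350, 120, 700, 500, 650],
})


def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Impute missing values, derive vehicleAge, and encode categoricals."""
    out = df.copy()

    # Statistical imputation (assumed here: median for numeric, mode for categorical).
    out["fuelPrice"] = out["fuelPrice"].fillna(out["fuelPrice"].median())
    out["fuelType"] = out["fuelType"].fillna(out["fuelType"].mode()[0])

    # vehicleAge = year of inventory minus model year.
    out["vehicleAge"] = out["inventoryYear"] - out["modelYear"]

    # Ordinal encoding: the categories have a natural order (labels are assumed).
    order = {"1": 0, "2": 1, "3": 2, "4+": 3}
    out["vehiclesAtAddress"] = out["vehiclesAtAddress"].map(order)

    # One-hot encoding for an unordered categorical column.
    return pd.get_dummies(out, columns=["fuelType"], dtype=float)


features = build_features(raw)
X = features.drop(columns=["population"])
y = features["population"]

# 80/20 train/test split, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Baseline Ridge regression, scored with RMSE on the held-out 20%.
baseline = Ridge(alpha=1.0).fit(X_train, y_train)
rmse = float(np.sqrt(mean_squared_error(y_test, baseline.predict(X_test))))
print(f"Baseline Ridge RMSE: {rmse:.2f}")
```

A stronger model would be scored with the same RMSE calculation on the same held-out rows, so that its number is directly comparable to the baseline.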

Challenges We Ran Into

  • Handling missing and inconsistent data across different years and categories.
  • Ensuring our predictions accounted for trends beyond the features given in the dataset.
  • Balancing model complexity with interpretability to make our insights actionable.

Accomplishments That We're Proud Of

  • Successfully integrating additional features into the data to enhance prediction accuracy and provide a more generalized output.
  • Visualizing key relationships to provide vehicle population insights.
  • Achieving high model accuracy through advanced machine learning techniques.

What We Learned

  • The importance of high-quality data preprocessing in predictive modeling.
  • How to integrate additional, more comprehensive data into vehicle population forecasting.
  • Optimizing model performance and exploring the capabilities of different models.

What's Next for Counting Cars

  • Expanding the dataset to include real-time vehicle registration data.
  • Incorporating more sustainability metrics, such as emissions and fuel efficiency trends.
  • Including the impact of policies and economic trends.
  • Deploying a public dashboard to visualize vehicle population trends and insights.
