Uber dataset with 5,000 rides.xlsx

lexsense · August 31, 2025 · Reasoning

The message you shared looks like a blog post excerpt (dated August 31, 2025, from “lexsense”) describing a synthetic or sample Uber dataset with 5,000 rides in .xlsx format (converted to .csv for the app). It includes typical columns like:

Timestamps (e.g., request/start/end times)
City
Distance
Fare
Driver/customer IDs
Payment type
Ratings
Ride status (e.g., completed, cancelled)

The post offers a pre-built Streamlit dashboard (uber_dashboard_streamlit.py) for interactive visualization, with features like filters, KPIs (total rides, revenue, avg fare/distance, completion rate, avg rating), time-based charts, payment breakdowns, city-level insights, peak hours, cancellations, and a driver leaderboard.
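The KPIs named above could be computed roughly as follows. This is a minimal sketch, not the code from uber_dashboard_streamlit.py: the column names (fare, distance, ratings, ride_status) and the status label "completed" are assumptions, and revenue is summed over all rows on the assumption that cancelled rides carry a fare of 0.

```python
import pandas as pd


def compute_kpis(df: pd.DataFrame) -> dict:
    """Compute headline KPIs from a rides table.

    Column names and the "completed" label are assumptions;
    rename them to match your file.
    """
    total = len(df)
    completed = (df["ride_status"].str.lower() == "completed").sum()
    return {
        "total_rides": total,
        "revenue": float(df["fare"].sum()),
        "avg_fare": float(df["fare"].mean()),
        "avg_distance": float(df["distance"].mean()),
        # Guard against an empty table after filtering
        "completion_rate": float(completed) / total if total else 0.0,
        "avg_rating": float(df["ratings"].mean()),  # missing ratings are skipped
    }


# Tiny made-up sample, only to show the shape of the result
sample = pd.DataFrame({
    "fare": [10.0, 20.0, 0.0, 30.0],
    "distance": [2.0, 4.0, 0.0, 6.0],
    "ratings": [5.0, 4.0, None, 3.0],
    "ride_status": ["completed", "completed", "cancelled", "completed"],
})
kpis = compute_kpis(sample)
```

Each value in the returned dict can be passed straight to a Streamlit metric card.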

Since I don’t have direct access to your local files or the exact dataset/Streamlit code, here’s how to proceed based on common practices and similar public resources:

If You’re Looking to Run or Recreate This Dashboard

Dataset Availability
No exact public match exists for a precisely “5,000 rides” Uber dataset with those exact columns (many public ones are either smaller personal exports, NYC-specific pickups, or larger fare-prediction sets). However, close alternatives include:

GeeksforGeeks sample UberDataset.csv (small, with START_DATE, END_DATE, CATEGORY, etc.) — download from their media link.
Kaggle datasets like “Uber Rides Dataset” by mubasherbajwa or “Uber Trips Dataset” by rohiteng (some have ~50k rows but include distance, fare, status, payment, timestamps, city-like info).
GitHub repos (e.g., Geo-y20/Uber-Rides-Data-Analysis) with UberDataset.csv containing ride details.

If your .xlsx is custom/synthetic, stick with it and convert to CSV via Excel/Pandas if needed:

   # pd.read_excel needs an Excel engine for .xlsx files: pip install openpyxl
   import pandas as pd

   df = pd.read_excel("Uber dataset with 5,000 rides.xlsx")
   df.to_csv("uber_data.csv", index=False)
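If you start from one of the public files instead, its headers will probably not match the names used in this post. The sketch below shows one way to normalize headers and report what is still missing. The RENAME mapping and the REQUIRED list are hypothetical and only illustrate the idea; adjust both to the file you actually have.

```python
import pandas as pd

# Hypothetical mapping from a public file's headers to the names used in this post
RENAME = {"start_date": "timestamp", "end_date": "end_time"}
# Assumed minimum set of columns the dashboard needs
REQUIRED = ["timestamp", "city", "distance", "fare", "ride_status"]


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Lower-case the headers, replace spaces with underscores, then rename."""
    out = df.copy()
    out.columns = [str(c).strip().lower().replace(" ", "_") for c in out.columns]
    return out.rename(columns=RENAME)


def missing_columns(df: pd.DataFrame) -> list:
    """Return the required columns that are absent from df."""
    return [c for c in REQUIRED if c not in df.columns]


# Made-up one-row frame standing in for a downloaded file
raw = pd.DataFrame({"START_DATE": ["2025-01-01 08:00"], "City": ["Austin"], "Fare": [12.5]})
clean = normalize_columns(raw)
print(missing_columns(clean))  # → ['distance', 'ride_status']
```

Running the check before launching the app gives a clearer message than a KeyError deep inside the dashboard.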

Running the Provided Streamlit App
Follow the instructions in the post:

Place uber_dashboard_streamlit.py and uber_data.csv in the same folder.
Install dependencies:
pip install streamlit pandas numpy matplotlib
Launch:
streamlit run uber_dashboard_streamlit.py
Access in browser (usually http://localhost:8501). Use the sidebar to filter by date, city, status, etc. If the CSV path is hardcoded, edit the script (look for pd.read_csv("uber_data.csv") or similar and adjust).
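Rather than editing a hardcoded path each time, the script could resolve the CSV location at startup. The helper below is a sketch under assumptions: the function name and the UBER_CSV_PATH environment variable are hypothetical and are not part of the original app.

```python
import os
from pathlib import Path


def resolve_csv_path(base_dir, default_name="uber_data.csv", env_var="UBER_CSV_PATH"):
    """Return the CSV path from the environment variable if it is set,
    otherwise default_name inside base_dir.

    env_var is a hypothetical name chosen for this example.
    """
    override = os.environ.get(env_var)
    if override:
        return Path(override).expanduser()
    return Path(base_dir) / default_name


# Inside the app, base_dir would typically be the script's own folder:
#     csv_path = resolve_csv_path(Path(__file__).parent)
#     df = pd.read_csv(csv_path)
```

With this in place, the default "same folder" layout keeps working, and a different file can be selected by setting the variable before calling streamlit run.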

Typical Code Structure for Such a Dashboard
If you don’t have the .py file or want to rebuild/enhance it, here’s a basic skeleton matching the described features:

   import streamlit as st
   import pandas as pd
   import numpy as np
   import matplotlib.pyplot as plt
   import plotly.express as px  # Used by the line chart below: pip install plotly

   st.set_page_config(page_title="Uber Rides Dashboard", layout="wide")

   # Load data
   @st.cache_data
   def load_data():
       return pd.read_csv("uber_data.csv")  # Update path if needed

   df = load_data()

   # Assume datetime parsing (adjust column names)
   df['timestamp'] = pd.to_datetime(df['timestamp'])  # or 'request_time', etc.
   df['date'] = df['timestamp'].dt.date
   df['hour'] = df['timestamp'].dt.hour

   # Sidebar filters
   st.sidebar.header("Filters")
   date_range = st.sidebar.date_input("Date Range", [df['date'].min(), df['date'].max()])
   cities = st.sidebar.multiselect("City", options=df['city'].unique(), default=df['city'].unique())
   statuses = st.sidebar.multiselect("Ride Status", options=df['ride_status'].unique(), default=df['ride_status'].unique())

   # date_input returns a 1-element tuple until both ends of the range are picked
   if len(date_range) != 2:
       st.stop()
   start_date, end_date = date_range

   # Apply filters (example)
   filtered_df = df[(df['date'].between(start_date, end_date)) &
                    (df['city'].isin(cities)) &
                    (df['ride_status'].isin(statuses))]

   # KPIs
   col1, col2, col3, col4 = st.columns(4)
   col1.metric("Total Rides", len(filtered_df))
   col2.metric("Total Revenue", f"${filtered_df['fare'].sum():,.2f}")
   col3.metric("Avg Fare", f"${filtered_df['fare'].mean():.2f}")
   col4.metric("Avg Rating", f"{filtered_df['ratings'].mean():.1f} ⭐")

   # Charts (expand as needed)
   st.subheader("Rides & Revenue Over Time")
   time_df = filtered_df.groupby('date').agg(revenue=('fare', 'sum'), rides=('ride_status', 'count')).reset_index()
   fig = px.line(time_df, x='date', y=['revenue', 'rides'], title="Trends")
   st.plotly_chart(fig)

   # More sections: Payment split (pie chart), Ratings hist, City bar, etc.
   # Driver leaderboard example
   st.subheader("Driver Leaderboard")
   driver_stats = filtered_df.groupby('driver_id').agg(
       trips=('ride_status', 'count'),
       earnings=('fare', 'sum'),
       avg_rating=('ratings', 'mean')
   ).sort_values('earnings', ascending=False).head(10)
   st.dataframe(driver_stats)

   # Add more: peak hours (bar chart on 'hour'), cancellations by city, etc.

Add plotly, seaborn, or altair for fancier visuals.
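The peak-hour and cancellation views mentioned in the skeleton's closing comment can be prepared with plain pandas before any charting. This is a sketch assuming the same column names as the skeleton (timestamp, city, ride_status) and a status label of "cancelled"; both may differ in your file.

```python
import pandas as pd


def rides_by_hour(df: pd.DataFrame) -> pd.Series:
    """Count rides per hour of day (0-23), keeping hours with no rides as 0."""
    hours = pd.to_datetime(df["timestamp"]).dt.hour
    return hours.value_counts().reindex(range(24), fill_value=0)


def cancellation_rate_by_city(df: pd.DataFrame) -> pd.Series:
    """Share of rides per city whose status is 'cancelled' (assumed label)."""
    cancelled = df["ride_status"].str.lower().eq("cancelled")
    return cancelled.groupby(df["city"]).mean().sort_values(ascending=False)


# Made-up rows, only to illustrate the output shape
rides = pd.DataFrame({
    "timestamp": ["2025-01-01 08:15", "2025-01-01 08:45", "2025-01-02 18:05"],
    "city": ["A", "A", "B"],
    "ride_status": ["completed", "cancelled", "completed"],
})
hourly = rides_by_hour(rides)
cancel = cancellation_rate_by_city(rides)
```

Both results are indexed Series, so in the app they can be handed to st.bar_chart directly.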

Enhancements You Mentioned

Geo Features + Maps — Yes, possible if your dataset includes pickup/dropoff lat/long (common in NYC Uber datasets). Use st.map() or folium/pydeck in Streamlit.
Convert to Plotly Dash — Dash is great for more complex callbacks. I can outline a Dash version if preferred (multi-page, advanced interactivity).
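As one example of a geo feature, the straight-line distance between pickup and dropoff can be derived with the haversine formula. The sketch below uses only the standard library; whether your dataset has coordinates at all, and what the columns are called, is an assumption to verify.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


# One degree of longitude along the equator is about 111.19 km
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Comparing this value with the recorded trip distance gives a rough detour ratio per ride. For plotting, st.map looks for columns named lat and lon (or latitude and longitude), so pickup coordinates would need renaming first.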

Files

▶️ Streamlit app: uber_dashboard_streamlit.py
📂 Dataset (5,000 rides): uber_data.csv

How to run

Put both files in the same folder.
In a terminal:
pip install streamlit pandas numpy matplotlib
streamlit run uber_dashboard_streamlit.py
If your CSV isn’t in the same folder, update the CSV path in the app’s sidebar.

What you get

Sidebar filters: date range, city, status, payment type, distance & fare sliders.
KPI cards: rides, revenue, avg fare, avg distance, completion rate, avg rating.
Charts: rides & revenue over time, payment split, ratings distribution, rides by city, revenue by city, peak hours, cancellations by city.
Driver leaderboard with trips, earnings, and avg rating.
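The sidebar filters listed above go beyond the date, city, and status filters in the skeleton. One way to keep them manageable is a single pure function that applies whichever filters are set. This is a sketch, not the app's actual code: the function name and the column names (payment_type, distance, fare) are assumptions.

```python
import pandas as pd


def apply_filters(df, cities=None, statuses=None, payment_types=None,
                  distance_range=None, fare_range=None):
    """Return the rows that satisfy every filter that is not None.

    Range filters are (low, high) tuples and are inclusive on both ends.
    """
    mask = pd.Series(True, index=df.index)
    if cities is not None:
        mask &= df["city"].isin(cities)
    if statuses is not None:
        mask &= df["ride_status"].isin(statuses)
    if payment_types is not None:
        mask &= df["payment_type"].isin(payment_types)
    if distance_range is not None:
        mask &= df["distance"].between(*distance_range)
    if fare_range is not None:
        mask &= df["fare"].between(*fare_range)
    return df[mask]


# Made-up rows for illustration
rides = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "ride_status": ["completed", "cancelled", "completed", "completed"],
    "payment_type": ["card", "cash", "card", "wallet"],
    "distance": [2.0, 5.0, 8.0, 12.0],
    "fare": [6.0, 11.0, 18.0, 30.0],
})
subset = apply_filters(rides, statuses=["completed"],
                       payment_types=["card", "wallet"], fare_range=(10, 40))
```

In the app, the range arguments would come from st.sidebar.slider called with a tuple as its value, which returns a (low, high) tuple. Keeping the filtering outside the Streamlit calls also makes it easy to unit test.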
