The message you shared looks like a blog post excerpt (dated August 31, 2025, from “lexsense”) describing a synthetic or sample Uber dataset with 5,000 rides in .xlsx format (converted to .csv for the app). It includes typical columns like:
- Timestamps (e.g., request/start/end times)
- City
- Distance
- Fare
- Driver/customer IDs
- Payment type
- Ratings
- Ride status (e.g., completed, cancelled)
The post offers a pre-built Streamlit dashboard (uber_dashboard_streamlit.py) for interactive visualization, with features like filters, KPIs (total rides, revenue, avg fare/distance, completion rate, avg rating), time-based charts, payment breakdowns, city-level insights, peak hours, cancellations, and a driver leaderboard.
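For readers who do not have the original file, a stand-in with the same kinds of columns can be generated. The following is only a sketch: the column names, city list, value ranges, and fare formula are all made up for illustration and are not taken from the post's dataset.

```python
import random
from datetime import datetime, timedelta

# Hypothetical column values; the real dataset may use different ones.
CITIES = ["New York", "Chicago", "San Francisco"]
PAYMENTS = ["card", "cash", "wallet"]
STATUSES = ["completed", "cancelled"]


def make_rides(n=5000, seed=0):
    """Return n synthetic ride records as a list of dicts."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    start = datetime(2025, 1, 1)
    rides = []
    for i in range(n):
        # Random request time within a 90-day window
        request = start + timedelta(minutes=rng.randrange(0, 60 * 24 * 90))
        status = rng.choices(STATUSES, weights=[0.85, 0.15])[0]
        completed = status == "completed"
        distance = round(rng.uniform(0.5, 25.0), 2) if completed else 0.0
        rides.append({
            "ride_id": i + 1,
            "timestamp": request.isoformat(sep=" "),
            "city": rng.choice(CITIES),
            "distance": distance,
            # Made-up fare rule: base charge plus a per-distance rate
            "fare": round(2.5 + 1.8 * distance, 2) if completed else 0.0,
            "driver_id": f"D{rng.randrange(1, 201):03d}",
            "customer_id": f"C{rng.randrange(1, 1501):04d}",
            "payment_type": rng.choice(PAYMENTS),
            "ratings": rng.randint(1, 5) if completed else None,
            "ride_status": status,
        })
    return rides


rides = make_rides()
print(len(rides), rides[0]["ride_status"])
```

The records can be written to uber_data.csv with csv.DictWriter from the standard library, or loaded directly with pd.DataFrame(rides).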
Since I don’t have direct access to your local files or the exact dataset/Streamlit code, here’s how to proceed based on common practices and similar public resources:
If You’re Looking to Run or Recreate This Dashboard
- Dataset Availability
No public dataset appears to match this description exactly (5,000 rides with these columns); many public ones are smaller personal exports, NYC-specific pickup logs, or larger fare-prediction sets. However, close alternatives include:
- GeeksforGeeks sample UberDataset.csv (small, with START_DATE, END_DATE, CATEGORY, etc.) — download from their media link.
- Kaggle datasets like “Uber Rides Dataset” by mubasherbajwa or “Uber Trips Dataset” by rohiteng (some have ~50k rows but include distance, fare, status, payment, timestamps, city-like info).
- GitHub repos (e.g., Geo-y20/Uber-Rides-Data-Analysis) with UberDataset.csv containing ride details.
If your .xlsx is custom/synthetic, stick with it and convert to CSV via Excel/Pandas if needed:
    import pandas as pd

    # Reading .xlsx files requires the openpyxl package (pip install openpyxl)
    df = pd.read_excel("Uber dataset with 5,000 rides.xlsx")
    df.to_csv("uber_data.csv", index=False)
- Running the Provided Streamlit App
Follow the instructions in the post:
- Place uber_dashboard_streamlit.py and uber_data.csv in the same folder.
- Install dependencies:
    pip install streamlit pandas numpy matplotlib
- Launch:
    streamlit run uber_dashboard_streamlit.py
- Access in browser (usually http://localhost:8501). Use the sidebar to filter by date, city, status, etc. If the CSV path is hardcoded, edit the script (look for pd.read_csv("uber_data.csv") or similar and adjust).
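If editing a hardcoded path becomes tedious, one option is to resolve the CSV location in a small helper. This is a sketch, not what the post's script does: the function name resolve_csv_path and the UBER_CSV environment variable are hypothetical.

```python
import os
from pathlib import Path


def resolve_csv_path(script_file, default_name="uber_data.csv", env_var="UBER_CSV"):
    """Pick the CSV path: environment variable first, else next to the script."""
    override = os.environ.get(env_var)
    if override:
        return Path(override).expanduser()
    # Fall back to a file sitting in the same folder as the script
    return Path(script_file).resolve().parent / default_name


# Inside the app this would be called as resolve_csv_path(__file__)
print(resolve_csv_path("uber_dashboard_streamlit.py"))
```

Resolving relative to the script rather than the working directory means the app finds the file no matter which folder streamlit run is launched from.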
- Typical Code Structure for Such a Dashboard
If you don’t have the .py file or want to rebuild/enhance it, here’s a basic skeleton matching the described features:
    import streamlit as st
    import pandas as pd
    import plotly.express as px  # Used for the trend chart: pip install plotly

    st.set_page_config(page_title="Uber Rides Dashboard", layout="wide")

    # Load data (cached so the CSV is not re-read on every rerun)
    @st.cache_data
    def load_data():
        return pd.read_csv("uber_data.csv")  # Update path if needed

    df = load_data()

    # Assume datetime parsing (adjust column names)
    df['timestamp'] = pd.to_datetime(df['timestamp'])  # or 'request_time', etc.
    df['date'] = df['timestamp'].dt.date
    df['hour'] = df['timestamp'].dt.hour

    # Sidebar filters
    st.sidebar.header("Filters")
    date_range = st.sidebar.date_input("Date Range", [df['date'].min(), df['date'].max()])
    cities = st.sidebar.multiselect("City", options=df['city'].unique(), default=df['city'].unique())
    statuses = st.sidebar.multiselect("Ride Status", options=df['ride_status'].unique(), default=df['ride_status'].unique())

    # date_input returns fewer than two dates while a range is still being picked
    if len(date_range) != 2:
        st.stop()
    start_date, end_date = date_range

    # Apply filters (example)
    filtered_df = df[
        df['date'].between(start_date, end_date)
        & df['city'].isin(cities)
        & df['ride_status'].isin(statuses)
    ]

    # KPIs
    col1, col2, col3, col4 = st.columns(4)
    col1.metric("Total Rides", len(filtered_df))
    col2.metric("Total Revenue", f"${filtered_df['fare'].sum():,.2f}")
    col3.metric("Avg Fare", f"${filtered_df['fare'].mean():.2f}")
    col4.metric("Avg Rating", f"{filtered_df['ratings'].mean():.1f} ⭐")

    # Charts (expand as needed)
    st.subheader("Rides & Revenue Over Time")
    time_df = (
        filtered_df.groupby('date')
        .agg(revenue=('fare', 'sum'), rides=('ride_status', 'count'))
        .reset_index()
    )
    fig = px.line(time_df, x='date', y=['revenue', 'rides'], title="Trends")
    st.plotly_chart(fig)

    # More sections: Payment split (pie chart), Ratings hist, City bar, etc.

    # Driver leaderboard example
    st.subheader("Driver Leaderboard")
    driver_stats = filtered_df.groupby('driver_id').agg(
        trips=('ride_status', 'count'),
        earnings=('fare', 'sum'),
        avg_rating=('ratings', 'mean')
    ).sort_values('earnings', ascending=False).head(10)
    st.dataframe(driver_stats)

    # Add more: peak hours (bar chart on 'hour'), cancellations by city, etc.
This skeleton uses Plotly for the trend chart, so install it alongside the other dependencies (pip install plotly). Seaborn or Altair are alternatives for fancier visuals.
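The peak-hours and cancellations-by-city sections mentioned in the skeleton's last comment could be computed along these lines. This is a minimal sketch on a tiny made-up sample; the column names 'timestamp', 'city', and 'ride_status' and the label "cancelled" are assumptions to adjust to the real CSV.

```python
import pandas as pd

# Tiny made-up sample for illustration only
sample = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-01 08:15", "2025-01-01 08:40", "2025-01-01 18:05",
        "2025-01-02 08:30", "2025-01-02 23:10",
    ]),
    "city": ["Austin", "Austin", "Boston", "Boston", "Austin"],
    "ride_status": ["completed", "cancelled", "completed", "cancelled", "completed"],
})


def rides_per_hour(df):
    """Count rides for each hour of day (0-23), including hours with no rides."""
    counts = df["timestamp"].dt.hour.value_counts()
    return counts.reindex(range(24), fill_value=0)


def cancellation_rate_by_city(df, cancelled_label="cancelled"):
    """Share of rides per city whose status equals cancelled_label."""
    is_cancelled = df["ride_status"].eq(cancelled_label)
    return is_cancelled.groupby(df["city"]).mean()


hourly = rides_per_hour(sample)
cancel_rates = cancellation_rate_by_city(sample)
print(hourly[hourly > 0])
print(cancel_rates)
```

In the Streamlit app, either result can be passed straight to st.bar_chart, for example st.bar_chart(rides_per_hour(filtered_df)).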
Enhancements You Mentioned
- Geo Features + Maps: Yes, possible if your dataset includes pickup/dropoff lat/long (common in NYC Uber datasets). Use st.map() or folium/pydeck in Streamlit.
- Convert to Plotly Dash: Dash is great for more complex callbacks. I can outline a Dash version if preferred (multi-page, advanced interactivity).
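For the map idea above, st.map() looks for latitude/longitude columns with names such as 'lat' and 'lon', so the coordinates usually need a little preparation first. The sketch below assumes hypothetical column names pickup_lat and pickup_lng, which the dataset may or may not have.

```python
import pandas as pd


def to_map_points(df, lat_col="pickup_lat", lon_col="pickup_lng"):
    """Return a frame with 'lat'/'lon' columns holding only valid coordinates."""
    points = df[[lat_col, lon_col]].rename(columns={lat_col: "lat", lon_col: "lon"})
    # Turn strings into numbers; anything unparseable becomes NaN and is dropped
    points = points.apply(pd.to_numeric, errors="coerce").dropna()
    in_range = points["lat"].between(-90, 90) & points["lon"].between(-180, 180)
    return points[in_range].reset_index(drop=True)


# Made-up coordinates for illustration only
rides = pd.DataFrame({
    "pickup_lat": [40.7128, None, 95.0, "41.8781"],
    "pickup_lng": [-74.0060, -73.9, -74.0, "-87.6298"],
})
points = to_map_points(rides)
print(points)
# In a Streamlit script: st.map(points)
```

Dropping missing and out-of-range rows before plotting avoids errors from bad coordinates; folium or pydeck can consume the same cleaned frame.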
Files
- ▶️ Streamlit app: uber_dashboard_streamlit.py
- 📂 Dataset (5,000 rides): uber_data.csv
How to run
- Put both files in the same folder.
- In a terminal:
    pip install streamlit pandas numpy matplotlib
    streamlit run uber_dashboard_streamlit.py
- If your CSV isn’t in the same folder, update the CSV path in the app’s sidebar.
What you get
- Sidebar filters: date range, city, status, payment type, distance & fare sliders.
- KPI cards: rides, revenue, avg fare, avg distance, completion rate, avg rating.
- Charts: rides & revenue over time, payment split, ratings distribution, rides by city, revenue by city, peak hours, cancellations by city.
- Driver leaderboard with trips, earnings, and avg rating.
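The KPI cards listed above, including completion rate and average distance, could be computed roughly as follows. This is a sketch under assumptions: the column names ('ride_status', 'fare', 'distance', 'ratings') and the "completed" label are guesses, and averaging over completed rides only is a design choice that the actual app may not share.

```python
import pandas as pd


def compute_kpis(df, completed_label="completed"):
    """Summarise a rides frame into the headline numbers shown on KPI cards."""
    done = df[df["ride_status"] == completed_label]
    total = len(df)
    return {
        "total_rides": total,
        "revenue": float(done["fare"].sum()),
        "avg_fare": float(done["fare"].mean()) if len(done) else 0.0,
        "avg_distance": float(done["distance"].mean()) if len(done) else 0.0,
        "completion_rate": len(done) / total if total else 0.0,
        "avg_rating": float(done["ratings"].mean()) if len(done) else 0.0,
    }


# Made-up rows for illustration only
sample = pd.DataFrame({
    "ride_status": ["completed", "completed", "cancelled", "completed"],
    "fare": [10.0, 20.0, 0.0, 30.0],
    "distance": [2.0, 4.0, 0.0, 6.0],
    "ratings": [5.0, 4.0, None, 3.0],
})
kpis = compute_kpis(sample)
print(kpis)
```

Each value can then be shown with st.metric, for example st.metric("Completion Rate", f"{kpis['completion_rate']:.0%}").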



