This project analyzes Brazilian e-commerce sales and delivery performance using the Olist dataset, with an end-to-end workflow built using Python, Google BigQuery, SQL, and Power BI. The goal is to provide stakeholders with actionable insights on sales trends, category performance, delivery reliability, and state-level revenue distribution.
- Introduction
- Dataset
- Data Pipeline and Modeling
- Power BI Dashboard
- Dashboard Screenshots
- How to Run
- Results and Insights
- Future Work
- Author
This project was built as a portfolio-quality Data Analyst project to demonstrate:
- End-to-end data ingestion and transformation
- SQL-based KPI modeling in BigQuery
- Power BI data modeling (relationships, dimensions, DAX measures)
- Interactive dashboard design with slicers, Top-N analysis, trends, and geospatial visuals
The dashboard focuses on:
- Monthly sales performance (Orders, Revenue, AOV)
- Delivery performance (On-time Rate, Avg Delivery Days)
- Category-level revenue and trends
- State-level ranking and geographic performance mapping
The project uses the Olist Brazilian E-Commerce Public Dataset (Kaggle), which includes order, payment, review, product, customer, seller, and geolocation data.
olist_orders_dataset.csvolist_order_items_dataset.csvolist_order_payments_dataset.csvolist_order_reviews_dataset.csvolist_products_dataset.csvolist_customers_dataset.csvolist_sellers_dataset.csvolist_geolocation_dataset.csvproduct_category_name_translation.csv
This project follows a layered data architecture in BigQuery:
CSV files are loaded into BigQuery raw tables using a Python batch ingestion script.
SQL transformations clean and standardize raw tables into staging tables (stg_*).
Business-ready modeled tables are created (including a sales fact table used for KPI derivation).
Aggregated reporting views used directly in Power BI:
kpi_monthly_saleskpi_category_monthlykpi_state_monthly
DimDate/DimMonthdimensions for time analysis- Relationships across KPI tables
- Dedicated
_Measurestable for DAX measures - Weighted KPI calculations to avoid incorrect aggregations for averages/rates
- Synced slicers across dashboard pages
A 3-page interactive Power BI dashboard was built to support exploratory and executive-style analysis.
-
Overview
- Monthly KPI cards (Orders, Revenue, AOV, On-time Rate, Avg Delivery Days)
- Orders and Revenue trends by month
- Monthly KPI detail table
-
Category Performance
- Top categories by revenue (Top-N bar chart)
- Category KPI table (Orders, Revenue, AOV)
- Revenue trend (Top 5 categories) for readability
- Category and month slicers
-
State Performance
- Top states by revenue
- State revenue filled map (Brazil geocoding fixed using helper column)
- Revenue trend for selected state
- State KPI detail table
- State and month slicers
- Google Cloud project with BigQuery API enabled
- Service account key JSON
- Python 3.x
- Power BI Desktop
- Olist dataset CSV files downloaded locally
git clone <your-repo-url>
cd sales-inventory-dashboardpython -m venv .venv
.venv\Scripts\Activate.ps1If script execution is blocked:
Set-ExecutionPolicy -Scope Process -ExecutionPolicy BypassThen activate again:
.venv\Scripts\Activate.ps1python3 -m venv .venv
source .venv/bin/activatepip install -r requirements.txtCreate a .env file in the project root:
GOOGLE_APPLICATION_CREDENTIALS=C:\Users\yourname\Downloads\your-service-account-key.json
GCP_PROJECT_ID=sales-inventory-dashboard
BQ_RAW_DATASET=retail_analytics
BQ_STAGING_DATASET=retail_staging
BQ_MART_DATASET=retail_mart
BQ_KPI_DATASET=retail_kpi
python src/00_bq_smoke_test.pyExample: data/raw/
python src/load_olist_to_bigquery.pyExecute saved queries / SQL scripts for:
- staging layer
- mart layer
- KPI layer
Note: In this project, the SQL transformations were initially developed and saved as BigQuery saved queries. They can be exported and organized into local .sql files under sql/staging, sql/mart, and sql/kpi for version control.
Load:
- kpi_monthly_sales
- kpi_category_monthly
- kpi_state_monthly
Then refresh the model and open the .pbix dashboard file.
- Built an interactive KPI monitoring workflow for sales + delivery performance
- Enabled Top-N category and state analysis for faster revenue concentration insights
- Created trend-based views for month-over-month performance tracking
- Added a state revenue map with corrected Brazil geocoding using a helper field (StateCountry = [state] & ", Brazil")
- Correct handling of averages and rates in Power BI using DAX measures (instead of summing monthly aggregates)
- BigQuery → Power BI integration and modeling across multiple KPI views
- Visual interaction tuning (e.g., keeping Top-N charts stable while allowing detailed slicer-driven analysis)
- Add scheduled refresh / scheduled queries in BigQuery
- Add data quality checks (nulls, duplicates, schema validation)
- Add inventory proxy metrics (since Olist is not a true inventory dataset)
- Add executive one-page summary with recommendations
- Export and version all BigQuery SQL queries into local .sql files for reproducibility
sales-inventory-dashboard/
├── README.md
├── .gitignore
├── requirements.txt
├── src/
│ ├── 00_bq_smoke_test.py
│ └── load_olist_to_bigquery.py
├── sql/
│ ├── staging/
│ ├── mart/
│ ├── kpi/
│ └── validation/
├── powerbi/
│ └── Retail_Sales_Delivery_Dashboard_PowerBI.pbix
├── screenshots/
│ ├── overview_page.png
│ ├── category_performance_page.png
│ └── state_performance_page.png
├── data/
│ └── README.md # optional (document where to place source CSVs)
└── docs/ # optional


