In this guide, you’ll learn how to create pivot tables in Python using Pandas.
A pivot table is a powerful data summarization tool. It allows you to:
- Group data by one or more columns
- Aggregate values (sum, mean, max, min, median, etc.)
- Create structured summary reports
- Analyze large datasets efficiently
In Pandas, pivot tables are created using the pivot_table() function.
Basic Syntax of pivot_table()
The general structure looks like this:
df.pivot_table(
index='column_to_group_by',
values='column_to_aggregate',
aggfunc='aggregation_function'
)
index→ Column(s) used for groupingvalues→ Column to summarizeaggfunc→ Aggregation function (sum, mean, max, min, etc.)
Now let’s apply this step-by-step using practical examples.
Step 1: Create a Sample Sales Dataset
We’ll use the following DataFrame that contains sales data across four quarters:
import pandas as pd
data = {
'person': ['A','B','C','D','E','A','B','C','D','E','A','B','C','D','E','A','B','C','D','E'],
'sales': [1000,300,400,500,800,1000,500,700,50,60,1000,900,750,200,300,1000,900,250,750,50],
'quarter': [1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4],
'country': ['US','Japan','Brazil','UK','US',
'Brazil','Japan','Brazil','US','US',
'US','Japan','Brazil','UK','Brazil',
'Japan','Japan','Brazil','UK','US']
}
df = pd.DataFrame(data)
print(df)
This dataset includes:
- Salesperson
- Sales amount
- Quarter
- Country
We will now create different pivot tables based on this dataset.
Example 1: Create a Pivot Table for Total Sales per Person
Goal: Calculate total sales across all quarters for each salesperson.
import pandas as pd
data = {
'person': ['A','B','C','D','E','A','B','C','D','E','A','B','C','D','E','A','B','C','D','E'],
'sales': [1000,300,400,500,800,1000,500,700,50,60,1000,900,750,200,300,1000,900,250,750,50],
'quarter': [1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4],
'country': ['US','Japan','Brazil','UK','US',
'Brazil','Japan','Brazil','US','US',
'US','Japan','Brazil','UK','Brazil',
'Japan','Japan','Brazil','UK','US']
}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='person', values='sales', aggfunc='sum')
print(pivot)
Output:
sales
person
A 4000
B 2600
C 2100
D 1500
E 1210
This groups the data by person and calculates total sales using sum.
Example 2: Create a Pivot Table for Total Sales by Country
Goal: Calculate total sales for each country.
import pandas as pd
data = {
'person': ['A','B','C','D','E','A','B','C','D','E','A','B','C','D','E','A','B','C','D','E'],
'sales': [1000,300,400,500,800,1000,500,700,50,60,1000,900,750,200,300,1000,900,250,750,50],
'quarter': [1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4],
'country': ['US','Japan','Brazil','UK','US',
'Brazil','Japan','Brazil','US','US',
'US','Japan','Brazil','UK','Brazil',
'Japan','Japan','Brazil','UK','US']
}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='country', values='sales', aggfunc='sum')
print(pivot)
Output:
sales
country
Brazil 3400
Japan 3600
UK 1450
US 2960
This groups the data by country and calculates total sales.
Example 3: Create a Multi-Level Pivot Table (Person and Country)
Goal: Calculate total sales per person within each country.
import pandas as pd
data = {
'person': ['A','B','C','D','E','A','B','C','D','E','A','B','C','D','E','A','B','C','D','E'],
'sales': [1000,300,400,500,800,1000,500,700,50,60,1000,900,750,200,300,1000,900,250,750,50],
'quarter': [1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4],
'country': ['US','Japan','Brazil','UK','US',
'Brazil','Japan','Brazil','US','US',
'US','Japan','Brazil','UK','Brazil',
'Japan','Japan','Brazil','UK','US']
}
df = pd.DataFrame(data)
pivot = df.pivot_table(index=['person','country'], values='sales', aggfunc='sum')
print(pivot)
This creates a hierarchical summary table.
Example 4: Find the Maximum Sales by Country
Instead of summing, you can use a different aggregation function.
pivot = df.pivot_table(index='country', values='sales', aggfunc='max')
print(pivot)
This shows the highest single sale recorded in each country.
Example 5: Apply Multiple Aggregation Functions
You can calculate median, mean, and minimum values at the same time:
pivot = df.pivot_table(
index='country',
values='sales',
aggfunc=['median','mean','min']
)
print(pivot)
This provides deeper statistical insight grouped by country.
Final Thoughts
Pivot tables are one of the most powerful features in Pandas for data analysis. They allow you to quickly summarize large datasets using:
- Single-level grouping
- Multi-level grouping
- Different aggregation functions
- Multiple aggregations simultaneously
Once you understand how pivot_table() works, you can apply it to financial analysis, sales reporting, performance tracking, and many other real-world use cases.
