You can use the nunique() function to count the number of unique values in a pandas DataFrame.
This function uses the following basic syntax:
#count unique values in each column
df.nunique()

#count unique values in each row
df.nunique(axis=1)
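One detail the basic syntax does not show: by default nunique() leaves missing values out of the count. The sketch below uses a small hypothetical DataFrame (named toy, not part of this tutorial's data) to illustrate the dropna parameter, which controls that behavior.

```python
import numpy as np
import pandas as pd

# hypothetical toy DataFrame with one missing value in column 'x'
toy = pd.DataFrame({'x': [1.0, 1.0, np.nan],
                    'y': ['a', 'b', 'b']})

# default (dropna=True): NaN is not counted as a value
default_counts = toy.nunique()

# dropna=False: NaN is counted as its own distinct value
counts_with_nan = toy.nunique(dropna=False)

print(default_counts)
print(counts_with_nan)
```

Here column 'x' has 1 unique value by default and 2 when dropna=False is passed, while column 'y' has 2 either way.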
The examples below show how to use this function in practice with the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})
#view DataFrame
df
  team  points  assists  rebounds
0    A       8        5        11
1    A       8        8         8
2    A      13        7        11
3    A      13        9         6
4    B      22       12         6
5    B      22        9         5
6    B      25        9         9
7    B      29        4        12
Example 1: Count Unique Values in Each Column
The following code shows how to count the number of unique values in each column of a DataFrame:
#count unique values in each column
df.nunique()
team 2
points 5
assists 6
rebounds 6
dtype: int64
From the output we can see:
- The ‘team’ column has 2 unique values
- The ‘points’ column has 5 unique values
- The ‘assists’ column has 6 unique values
- The ‘rebounds’ column has 6 unique values
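If you only need the count for one column, nunique() can also be called on a single Series, and unique() returns the distinct values themselves. The following is a minimal sketch using the same DataFrame; the variable names n_points and distinct_points are only illustrative.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})

# number of unique values in one column (returns a single integer)
n_points = df['points'].nunique()

# the unique values themselves, in order of first appearance
distinct_points = df['points'].unique()

print(n_points)
print(distinct_points)
```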
Example 2: Count Unique Values in Each Row
The following code shows how to count the number of unique values in each row of a DataFrame:
#count unique values in each row
df.nunique(axis=1)
0 4
1 2
2 4
3 4
4 4
5 4
6 3
7 4
dtype: int64
From the output we can see:
- The first row has 4 unique values
- The second row has 2 unique values
- The third row has 4 unique values
And so on.
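Note that a row-wise count compares values across every column, so the ‘team’ letter always counts as one of the unique values in its row. As a sketch, assuming you only want to compare the numeric columns, you can select them before calling nunique(axis=1); the name numeric_row_counts is illustrative.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})

# count unique values in each row, using only the numeric columns
numeric_row_counts = df[['points', 'assists', 'rebounds']].nunique(axis=1)

print(numeric_row_counts)
```

With the ‘team’ column excluded, the second row has only 1 unique value, since points, assists, and rebounds are all 8.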
Example 3: Count Unique Values by Group
The following code shows how to count the number of unique values by group in a DataFrame:
#count unique 'points' values, grouped by team
df.groupby('team')['points'].nunique()
team
A 2
B 3
Name: points, dtype: int64
From the output we can see:
- Team ‘A’ has 2 unique ‘points’ values
- Team ‘B’ has 3 unique ‘points’ values
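The same grouped approach can likely be extended to several columns at once. The sketch below selects three columns after groupby() and calls nunique() on the result, which returns one row per team; the variable name grouped_counts is illustrative.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})

# count unique values of several columns, grouped by team
grouped_counts = df.groupby('team')[['points', 'assists', 'rebounds']].nunique()

print(grouped_counts)
```

For this data, team ‘A’ has 2, 4, and 3 unique values for points, assists, and rebounds, and team ‘B’ has 3, 3, and 4.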
Example 4: Count Unique Values with a Condition
You can also combine nunique() with filtering to count unique values among only the rows that meet a condition. For instance, let’s count the number of unique values in each column for the rows where points is greater than 10:
#count unique values in each column for rows where points > 10
df[df['points'] > 10].nunique()
team 2
points 4
assists 4
rebounds 5
dtype: int64
From the output we can see:
- There are 2 unique teams with points > 10
- There are 4 unique point values > 10
- There are 4 unique assist values for rows with points > 10
- There are 5 unique rebound values for rows with points > 10
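When only one column is of interest, the filter and the column selection can be combined in a single .loc call. This is a minimal sketch of that variation; the name n_assists is illustrative.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})

# count unique 'assists' values among rows where points > 10
n_assists = df.loc[df['points'] > 10, 'assists'].nunique()

print(n_assists)
```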
Example 5: Count Unique Values Using value_counts()
While nunique() gives you the count of unique values, value_counts() provides both the unique values and their frequencies. Let’s see how this works with our dataset:
#count occurrences of each unique value in the 'points' column
df['points'].value_counts()
8 2
13 2
22 2
25 1
29 1
Name: points, dtype: int64
From the output we can see:
- The value 8 appears 2 times
- The value 13 appears 2 times
- The value 22 appears 2 times
- The value 25 appears 1 time
- The value 29 appears 1 time
We can confirm that there are 5 unique values in the ‘points’ column, which matches our earlier nunique() result.
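That check can also be done in code rather than by eye. The sketch below compares the length of the value_counts() result with nunique() for the same column; the variable name counts is illustrative.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})

# frequency of each unique value in the 'points' column
counts = df['points'].value_counts()

# one entry per unique value, so the lengths should agree
same_count = len(counts) == df['points'].nunique()

print(len(counts), same_count)
```

Depending on your pandas version, the footer printed by value_counts() may read Name: count instead of Name: points; the naming of the result changed in pandas 2.0.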
Wrapping Up
The nunique() function in pandas provides a straightforward way to count unique values in your data. This can be useful for:
- Understanding data variety in each column
- Checking for duplicate values
- Analyzing data distribution by groups
- Determining the cardinality of categorical variables
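As a rough illustration of the duplicate and cardinality checks listed above, the sketch below compares each column's unique count with the number of rows. The summary table and its column names (n_unique, has_duplicates, unique_ratio) are hypothetical, not a pandas feature.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 8, 13, 13, 22, 22, 25, 29],
                   'assists': [5, 8, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 11, 6, 6, 5, 9, 12]})

n_rows = len(df)
n_unique = df.nunique()

# a column contains repeated values when it has fewer unique values than rows
summary = pd.DataFrame({'n_unique': n_unique,
                        'has_duplicates': n_unique < n_rows,
                        'unique_ratio': n_unique / n_rows})

print(summary)
```

A unique_ratio close to 0 suggests a low-cardinality column such as ‘team’, while a ratio of 1 would mean every value in the column is distinct.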
By using nunique() along with other pandas functions like groupby() and value_counts(), you can gain deeper insights into your data’s composition and structure.