Python - Removing Duplicate Dicts in List
When working with lists of dictionaries in Python, you may encounter duplicate entries that need to be removed. Dictionaries can be compared for equality with ==, but because they are mutable they are unhashable, so they cannot be stored in a set to filter duplicates directly. This article explores four effective methods to remove duplicate dictionaries from a list.
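To see why the naive approach fails, the short sketch below (using a two-element example list) tries to pass a list of dictionaries straight to set():

```python
cities = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Kochi", "State": "Kerala"},  # duplicate
]

# Dictionaries are unhashable, so set() raises TypeError
try:
    unique = set(cities)
except TypeError as err:
    print(err)  # unhashable type: 'dict'
```

The methods below all work around this by deriving a hashable stand-in for each dictionary, or by delegating the problem to pandas.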
Method 1: Using Tuple Conversion with a Set
This approach converts each dictionary to a sorted tuple of its items, which is hashable and can be tracked in a set:
def remove_duplicates(dict_list):
    seen = set()
    result = []
    for d in dict_list:
        # Sorting the items makes the tuple independent of key insertion order
        tuple_form = tuple(sorted(d.items()))
        if tuple_form not in seen:
            seen.add(tuple_form)
            result.append(d)
    return result

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates(cities)
print(unique_cities)

Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
Method 2: Using Pandas DataFrame
Pandas provides the built-in drop_duplicates method, which is convenient for large datasets (note that it requires the column values to be hashable):
import pandas as pd

def remove_duplicates_pandas(dict_list):
    df = pd.DataFrame(dict_list)
    # drop_duplicates keeps the first occurrence of each row by default
    df.drop_duplicates(inplace=True)
    return df.to_dict(orient='records')

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates_pandas(cities)
print(unique_cities)

Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
Method 3: Using Hash with Frozenset
This method creates a hash from the dictionary's items using frozenset for efficient comparison. Keep in mind that all values must be hashable, and that storing only the hash means two distinct dictionaries could, in rare cases, collide and be wrongly treated as duplicates:
def make_hashable(d):
    # A frozenset of (key, value) pairs is hashable if every value is hashable
    return hash(frozenset(d.items()))

def remove_duplicates_hash(dict_list):
    seen = set()
    result = []
    for d in dict_list:
        hash_value = make_hashable(d)
        if hash_value not in seen:
            seen.add(hash_value)
            result.append(d)
    return result

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates_hash(cities)
print(unique_cities)

Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
Method 4: Using Helper Function with Sorted Tuples
This approach is Method 1 factored into a reusable helper function that converts each dictionary to a sorted tuple for comparison:
def dict_to_sorted_tuple(d):
    # Same conversion as Method 1, extracted for reuse
    return tuple(sorted(d.items()))

def remove_duplicates_helper(dict_list):
    seen = set()
    result = []
    for d in dict_list:
        tuple_form = dict_to_sorted_tuple(d)
        if tuple_form not in seen:
            seen.add(tuple_form)
            result.append(d)
    return result

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates_helper(cities)
print(unique_cities)

Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
Comparison of Methods
| Method | Performance | Memory Usage | Best For |
|---|---|---|---|
| Tuple conversion | Good | Low | Small to medium datasets |
| Pandas | Excellent | High | Large datasets with complex data |
| Hash with frozenset | Very good | Medium | Fast comparison needed |
| Helper function | Good | Low | Clean, readable code |
Conclusion
Choose the method based on your specific needs: use pandas for large datasets, frozenset hashing for performance, or tuple conversion for simplicity. All methods effectively remove duplicate dictionaries while preserving the original order of unique entries.
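As a closing variant (not one of the four methods above, just a compact sketch of the same tuple-conversion idea), Method 1 can be condensed into a single dict comprehension. It relies on dicts preserving insertion order (Python 3.7+):

```python
def remove_duplicates_oneliner(dict_list):
    # Key each dict by its sorted-tuple form; a later duplicate overwrites
    # the stored value, but since duplicates are equal the result is the
    # same, and the key's original insertion position is preserved.
    return list({tuple(sorted(d.items())): d for d in dict_list}.values())

cities = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # duplicate
]
print(remove_duplicates_oneliner(cities))
```

This trades a little readability for brevity; for anything beyond a quick script, the explicit loop of Method 1 is easier to maintain.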
