Python - Removing Duplicate Dicts in List

When working with lists of dictionaries in Python, you may encounter duplicate entries that need to be removed. Since dictionaries are mutable and therefore unhashable, they cannot be stored in a set or used as dictionary keys, so the usual set()-based deduplication does not work. This article explores four effective methods to remove duplicate dictionaries from a list while preserving order.
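To see the problem concretely, here is a minimal sketch of what happens when you try the usual set-based deduplication directly on a list of dictionaries:

```python
# Dictionaries are unhashable, so passing them to set() raises TypeError.
cities = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
]

try:
    unique = set(cities)
except TypeError as e:
    print(e)  # unhashable type: 'dict'
```

Every method below works around this by converting each dictionary into something hashable (a tuple, a frozenset, or a DataFrame row) before checking for duplicates.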

Method 1: Using a Loop with Sorted Tuples

This approach converts each dictionary to a sorted tuple of its items and uses a set to track the tuples already seen:

def remove_duplicates(dict_list):  
    seen = set()
    result = []
    for d in dict_list:
        tuple_form = tuple(sorted(d.items()))
        if tuple_form not in seen:
            seen.add(tuple_form)
            result.append(d)
    return result

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates(cities)
print(unique_cities)
Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
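A more compact variant of the same idea uses a dictionary comprehension keyed by the sorted item-tuples. Note that in this sketch the value stored for each key is the last occurrence of a duplicate rather than the first (the two are identical dictionaries, so the result is the same):

```python
def remove_duplicates_compact(dict_list):
    # Keying by the sorted item-tuples collapses duplicates; later
    # occurrences overwrite earlier ones, while key order (and hence
    # result order) follows the first occurrence of each key.
    return list({tuple(sorted(d.items())): d for d in dict_list}.values())

cities = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
]
print(remove_duplicates_compact(cities))
# [{'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Hisar', 'State': 'Haryana'}]
```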

Method 2: Using Pandas DataFrame

Pandas provides a built-in drop_duplicates() method that is convenient for large datasets:

import pandas as pd

def remove_duplicates_pandas(dict_list):
    df = pd.DataFrame(dict_list)
    df.drop_duplicates(inplace=True)
    return df.to_dict(orient='records')

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates_pandas(cities)
print(unique_cities)
Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
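drop_duplicates() also accepts a subset parameter for when rows should count as duplicates based on only some keys. The sketch below illustrates this with a hypothetical requirement of keeping one city per state; keep="first" (the default) retains the first row seen for each State:

```python
import pandas as pd

cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Haridwar", "State": "Uttarakhand"},  # same State, different Place
    {"Place": "Kochi", "State": "Kerala"},
]

# Deduplicate on the "State" column only.
df = pd.DataFrame(cities).drop_duplicates(subset=["State"], keep="first")
print(df.to_dict(orient="records"))
# [{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Kochi', 'State': 'Kerala'}]
```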

Method 3: Using Hash with Frozenset

This method hashes a frozenset of each dictionary's items for fast comparison. Note that it compares only the integer hash values, so in the (rare) event of a hash collision two distinct dictionaries would be treated as duplicates:

def make_hashable(d):
    return hash(frozenset(d.items()))

def remove_duplicates_hash(dict_list):
    seen = set()
    result = []
    for d in dict_list:
        hash_value = make_hashable(d)
        if hash_value not in seen:
            seen.add(hash_value)
            result.append(d)
    return result

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates_hash(cities)
print(unique_cities)
Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
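Because integer hash values can collide, a slightly safer variant stores the frozensets themselves in the seen set instead of their hashes. Set membership then falls back to a full equality check, eliminating the collision risk; this is a sketch of the same method with that change:

```python
def remove_duplicates_frozenset(dict_list):
    seen = set()
    result = []
    for d in dict_list:
        key = frozenset(d.items())  # hashable view of the dict's items
        if key not in seen:         # full equality check, not just hash
            seen.add(key)
            result.append(d)
    return result

cities = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
]
print(remove_duplicates_frozenset(cities))
# [{'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}]
```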

Method 4: Using Helper Function with Sorted Tuples

This approach factors the sorted-tuple conversion from Method 1 into a reusable helper function, which keeps the deduplication loop short and readable:

def dict_to_sorted_tuple(d):
    return tuple(sorted(d.items()))

def remove_duplicates_helper(dict_list):
    seen = set()
    result = []
    for d in dict_list:
        tuple_form = dict_to_sorted_tuple(d)
        if tuple_form not in seen:
            seen.add(tuple_form)
            result.append(d)
    return result

# Example data
cities = [
    {"Place": "Haldwani", "State": "Uttarakhand"},
    {"Place": "Hisar", "State": "Haryana"},
    {"Place": "Shillong", "State": "Meghalaya"},
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Bhopal", "State": "Madhya Pradesh"},
    {"Place": "Kochi", "State": "Kerala"},  # Duplicate
    {"Place": "Haridwar", "State": "Uttarakhand"}
]

unique_cities = remove_duplicates_helper(cities)
print(unique_cities)
Output:
[{'Place': 'Haldwani', 'State': 'Uttarakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]

Comparison of Methods

Method                       Performance   Memory usage   Best for
Method 1: Sorted-tuple loop  Good          Low            Small to medium datasets
Method 2: Pandas             Excellent     High           Large datasets with complex data
Method 3: Frozenset hash     Very good     Medium         Fast comparison needed
Method 4: Helper function    Good          Low            Clean, readable code

Conclusion

Choose the method based on your specific needs: use pandas for large datasets, frozenset hashing for performance, or tuple conversion for simplicity. All methods effectively remove duplicate dictionaries while preserving the original order of unique entries.

Updated on: 2026-03-27T10:42:44+05:30
