How to remove duplicate tuples from a list in Python?

Question

I have a list of pairs (tuple-like sub-lists), as follows.

mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

I want to remove the duplicate tuples from mylist and get an output as follows.

mylist = [['xxx', 879], ['yyy', 315], ['zzz', 171]]

It seems that Python's set does not work for this.

mylist = list(set(mylist))

Is there a fast and easy way of doing this in Python (perhaps using libraries)?

Possible duplicate of "How do you remove duplicates from a list whilst preserving order?"
– javidcf, Commented Jan 17, 2018 at 11:54
Or, if you don't need to preserve order, check out "Removing duplicates in lists".
– javidcf, Commented Jan 17, 2018 at 11:54
I don't believe the question is a duplicate of that specific Q&A, though I'd guess there is a better one out there...
– coldspeed95, Commented Jan 17, 2018 at 11:58
The reason it's not working for you is that you have a list of lists, and a list cannot be added to a set because lists are not hashable.
– John Joseph Fernandes, Commented Jan 17, 2018 at 11:59

RoadRunner · Accepted Answer · 2018-01-17 12:08:44Z

6

It seems like you want to preserve order. In that case you can keep a set that keeps track of what lists have been added.

Here is an example:

mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

# set that keeps track of what elements have been added
seen = set()

no_dups = []
for lst in mylist:

    # convert to hashable type
    current = tuple(lst)

    # If element not in seen, add it to both
    if current not in seen:
        no_dups.append(lst)
        seen.add(current)

print(no_dups)

Which outputs:

[['xxx', 879], ['yyy', 315], ['zzz', 171]]

Note: Since lists are not hashable, you can add tuples instead to the seen set.
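To make the note above concrete, here is a minimal sketch of what happens when a list is added to a set directly, compared with adding its tuple copy. The variable name error_message is only illustrative, and the exact wording of the exception text may differ between Python versions.

```python
mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879]]

seen = set()
try:
    # A list is mutable, so it has no hash and cannot be a set element
    seen.add(mylist[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The tuple copy of the same data is hashable, so this succeeds
seen.add(tuple(mylist[0]))

print(error_message)              # the message mentions an unhashable type
print(tuple(mylist[2]) in seen)   # the duplicate sub-list is recognised
```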

edited Jan 17, 2018 at 12:08

answered Jan 17, 2018 at 11:56


1 Comment

Jonathon McMurray Over a year ago

@cᴏʟᴅsᴘᴇᴇᴅ it appears to retain the order the elements are encountered in the original list?

John Joseph Fernandes · Accepted Answer · 2018-01-17 12:32:44Z

6

You can't do this directly because you have a list of lists, not a list of tuples, and lists are not hashable.

What you could do is:

mytuplelist = [tuple(item) for item in mylist]
mylist = list(set(mytuplelist))

or

mylist = list(set(map(tuple, mylist)))
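Both one-liners return tuples rather than lists, and a set does not keep the original order. If either matters, one possible variant (a sketch, not part of the original answer) is to use dict keys instead of a set. It assumes Python 3.7 or later, where dicts keep insertion order; the names unique_tuples and unique_lists are illustrative.

```python
mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

# dict keys are unique and (from Python 3.7) keep first-insertion order
unique_tuples = dict.fromkeys(map(tuple, mylist))

# Convert the tuple keys back into lists to match the input shape
unique_lists = [list(t) for t in unique_tuples]

print(unique_lists)  # [['xxx', 879], ['yyy', 315], ['zzz', 171]]
```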

edited Jan 17, 2018 at 12:32

answered Jan 17, 2018 at 12:02


5 Comments

coldspeed95 Over a year ago

@NickA No, but I do have reason to believe it may be the case based on their output, unless OP clarifies. :)

Ma0 Over a year ago

@cᴏʟᴅsᴘᴇᴇᴅ Respectfully, you can do so in the question's comment section.

coldspeed95 Over a year ago

@NickA I apologise for jumping the gun! I've edited my answer to be a little more appropriate, and will stop plugging this idea since it apparently doesn't matter much to OP anymore.

Nick is tired Over a year ago

@cᴏʟᴅsᴘᴇᴇᴅ looks great, have a rare +1

John Joseph Fernandes Over a year ago

You could also do mylist = list(set(map(tuple, mylist)))

coldspeed95 · Accepted Answer · 2018-01-17 12:31:41Z

5

You need to write code that keeps the first of the sub-lists, dropping the rest. The simplest way to do this is to load mylist into a dict object and retrieve its key-value pairs as lists again.

>>> list(map(list, dict(mylist).items()))

Or, using a list comprehension -

>>> [list(v) for v in dict(mylist).items()]

[['zzz', 171], ['yyy', 315], ['xxx', 879]]

Note that this answer does not maintain order (dicts only guarantee insertion order from Python 3.7 onward)! Also, if your sub-lists can have more than 2 elements, an approach that hashes the tuple versions of your data, as John Joseph Fernandes' answer shows, would be the best thing to do.
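The sketch below illustrates two limits of the dict approach: it treats the first item of each pair as the key, so pairs that share a key but differ in value collapse into one, and it rejects sub-lists that are not exactly two items long. The names pairs, via_dict, via_set, and triple_error are illustrative only.

```python
pairs = [['xxx', 879], ['xxx', 200]]

# dict() keys on the first item, so the later value overwrites the earlier one
via_dict = [list(item) for item in dict(pairs).items()]

# Hashing the whole sub-list as a tuple keeps both distinct pairs
via_set = sorted(set(map(tuple, pairs)))

print(via_dict)  # [['xxx', 200]]
print(via_set)   # [('xxx', 200), ('xxx', 879)]

# dict() also requires every element to have exactly two items
try:
    dict([['xxx', 879, 'extra']])
    triple_error = False
except ValueError:
    triple_error = True

print(triple_error)  # True
```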

edited Jan 17, 2018 at 12:31

answered Jan 17, 2018 at 11:54


5 Comments

Chris_Rands Over a year ago

Can you explain the logic behind the reversals? Also I think this fails for something like mylist = [['xxx', 879], ['xxx', 200]]

coldspeed95 Over a year ago

@Chris_Rands Sorry, they're part of an older solution I should have removed. They do nothing there.

coldspeed95 Over a year ago

@Chris_Rands I have to confess that I did misread the question at first, thinking that the key (first sublist item) was the same, and OP wanted the first, dropping all the other duplicates. Because of that, I reversed the list and sent the entries into a dict, so that, when retrieving back, the last key-value pairs that were inserted, overwriting the previous, were the first pairs in the original list. I hope I made sense!

Chris_Rands Over a year ago

Right well I arrived late so haven't followed the evolution of the question, but fact remains that [list(v) for v in dict([['xxx', 879], ['xxx', 200]]).items()] is not list(set(tuple(item) for item in [['xxx', 879], ['xxx', 200]])) and I think the latter (like John Joseph wrote) is what is wanted. But the OP accepted your answer so I may be wrong! Perhaps this situation never arises in their data anyway

coldspeed95 Over a year ago

@Chris_Rands Yup, I was rather surprised myself, I'm not afraid to admit I made a meal of answering! Well, OPs are fickle beasts, I've made the necessary edits and disclaimers, I hope that does for now. ;)

Jonathon McMurray · Accepted Answer · 2018-01-17 11:58:14Z

2

Another option:

>>> mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]
>>> y = []
>>> for x in mylist:
...     if x not in y:
...         y.append(x)
...
>>> y
[['xxx', 879], ['yyy', 315], ['zzz', 171]]
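This loop compares each item against everything kept so far, so it does more work per item than a set lookup as the list grows. In exchange, it only needs equality rather than hashing. The sketch below uses hypothetical data with nested lists to show a case where the loop works but the tuple-and-set conversion does not.

```python
# Hypothetical data: the second item of each sub-list is itself a list
nested = [['xxx', [1, 2]], ['yyy', [3]], ['xxx', [1, 2]]]

# Membership test on a list uses ==, so unhashable items are fine
unique = []
for item in nested:
    if item not in unique:
        unique.append(item)

# tuple(item) still contains a list, so hashing it fails
try:
    set(map(tuple, nested))
    hashable = True
except TypeError:
    hashable = False

print(unique)    # [['xxx', [1, 2]], ['yyy', [3]]]
print(hashable)  # False
```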

answered Jan 17, 2018 at 11:58
