3

I have a list of tuples, as follows.

mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

I want to remove the duplicate tuples from mylist and get an output as follows.

mylist = [['xxx', 879], ['yyy', 315], ['zzz', 171]]

It seems that Python's set does not work for this; the following raises a TypeError:

mylist = list(set(mylist))

Is there a fast and easy way of doing this in Python (perhaps using libraries)?

4
  • 4
    Possible duplicate of How do you remove duplicates from a list in whilst preserving order? Commented Jan 17, 2018 at 11:54
  • Or, if you don't need to preserve order, check out Removing duplicates in lists. Commented Jan 17, 2018 at 11:54
  • 1
    I don't believe the question is a duplicate of that specific Q&A, though I'd guess there is a better one out there... Commented Jan 17, 2018 at 11:58
  • 1
    The reason it's not working for you is that you have a list of lists, and a list cannot be added to a set because lists are not hashable. Commented Jan 17, 2018 at 11:59
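
The hashability point in the last comment can be checked directly. This is a minimal sketch using only built-ins; the variable names are illustrative and not from the question, and the exact wording of the error message depends on the Python version.

```python
pair_as_list = ['xxx', 879]
pair_as_tuple = tuple(pair_as_list)

# A tuple of hashable items is itself hashable, so it can be a set member.
# Two equal tuples collapse into a single element.
unique = {pair_as_tuple, ('xxx', 879)}
print(len(unique))  # 1

# A list is mutable and defines no hash, so building a set of lists fails.
try:
    set([pair_as_list])
    list_is_hashable = True
except TypeError as exc:
    list_is_hashable = False
    print(exc)
```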

4 Answers

6

It seems like you want to preserve order. In that case, you can keep a set that tracks which sublists have already been added.

Here is an example:

mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

# set that keeps track of what elements have been added
seen = set()

no_dups = []
for lst in mylist:

    # convert to hashable type
    current = tuple(lst)

    # If element not in seen, add it to both
    if current not in seen:
        no_dups.append(lst)
        seen.add(current)

print(no_dups)

Which outputs:

[['xxx', 879], ['yyy', 315], ['zzz', 171]]

Note: since lists are not hashable, each sublist is converted to a tuple before it is added to the seen set.
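
If this is needed in more than one place, the same loop can be wrapped in a function. The sketch below is one possible packaging, not part of the original answer; the name dedupe_preserving_order is made up for illustration.

```python
def dedupe_preserving_order(rows):
    """Return rows with later duplicates dropped, keeping first occurrences."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row)  # hashable stand-in for the sublist
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]
print(dedupe_preserving_order(mylist))
# [['xxx', 879], ['yyy', 315], ['zzz', 171]]
```

Set membership tests take constant time on average, so the whole pass is roughly linear in the length of the input.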


1 Comment

@cᴏʟᴅsᴘᴇᴇᴅ it appears to retain the order the elements are encountered in the original list?
6

The reason you're not able to do this is that you have a list of lists, not a list of tuples. Lists are mutable and therefore unhashable, so they cannot be members of a set.

What you could do is:

mytuplelist = [tuple(item) for item in mylist]
mylist = list(set(mytuplelist))

or

mylist = list(set(map(tuple, mylist)))
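
Both variants leave mylist holding tuples, in whatever order the set happens to iterate. If lists and a repeatable order are wanted, one option is to convert back and sort, as in this sketch. Sorting is an assumption about what order is acceptable; it gives sorted order, which is not the original order in general.

```python
mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

# Deduplicate via hashable tuples; set iteration order is arbitrary.
unique_tuples = set(map(tuple, mylist))

# Convert each tuple back to a list and sort for a deterministic result.
unique_lists = sorted(list(t) for t in unique_tuples)

print(unique_lists)
# [['xxx', 879], ['yyy', 315], ['zzz', 171]]
```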

5 Comments

@NickA No, but I do have reason to believe it may be the case based on their output, unless OP clarifies. :)
@cᴏʟᴅsᴘᴇᴇᴅ Respectfully, you can do so in the Questions comment section.
@NickA I apologise for jumping the gun! I've edited my answer to be a little more appropriate, and will stop plugging this idea since it apparently doesn't matter much to OP anymore.
@cᴏʟᴅsᴘᴇᴇᴅ looks great, have a rare +1
You could also do mylist = list(set(map(tuple, mylist)))
5

You need to write code that keeps the first of the sub-lists, dropping the rest. The simplest way to do this is to load mylist into a dict object and retrieve its key-value pairs as lists again.

>>> list(map(list, dict(mylist).items()))

Or, using a list comprehension -

>>> [list(v) for v in dict(mylist).items()]

[['zzz', 171], ['yyy', 315], ['xxx', 879]]

Note that this answer does not maintain order (dicts are only guaranteed to keep insertion order from Python 3.7 onward)! Also, if your sub-lists can have more than 2 elements, an approach that hashes tuple versions of your data, as @JohnJosephFernandez' answer shows, would be the best thing to do.
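
A related variant, sketched here as an alternative rather than as what this answer does, uses the whole sub-list (as a tuple) as the dict key via dict.fromkeys. It assumes Python 3.7 or later for insertion-ordered dicts. The second half shows the case where dict(mylist) differs: pairs that share a first item but have different second items.

```python
mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]

# Whole sub-lists (as tuples) are the keys, so any sub-list length works.
# Insertion order is kept on Python 3.7+.
deduped = [list(key) for key in dict.fromkeys(map(tuple, mylist))]
print(deduped)
# [['xxx', 879], ['yyy', 315], ['zzz', 171]]

# By contrast, dict(pairs) keys on the first item only, so distinct pairs
# that share a first item collapse into one (the last value wins).
pairs = [['xxx', 879], ['xxx', 200]]
collapsed = [list(item) for item in dict(pairs).items()]
print(collapsed)
# [['xxx', 200]]
```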

5 Comments

Can you explain the logic behind the reversals? Also I think this fails for something like mylist = [['xxx', 879], ['xxx', 200]]
@Chris_Rands Sorry, they're part of an older solution I should have removed. They do nothing there.
@Chris_Rands I have to confess that I did misread the question at first, thinking that the key (first sublist item) was the same, and OP wanted the first, dropping all the other duplicates. Because of that, I reversed the list and sent the entries into a dict, so that, when retrieving back, the last key-value pairs that were inserted, overwriting the previous, were the first pairs in the original list. I hope I made sense!
Right well I arrived late so haven't followed the evolution of the question, but fact remains that [list(v) for v in dict([['xxx', 879], ['xxx', 200]]).items()] is not list(set(tuple(item) for item in [['xxx', 879], ['xxx', 200]])) and I think the latter (like John Joseph wrote) is what is wanted. But the OP accepted your answer so I may be wrong! Perhaps this situation never arises in their data anyway
@Chris_Rands Yup, I was rather surprised myself, I'm not afraid to admit I made a meal of answering! Well, OPs are fickle beasts, I've made the necessary edits and disclaimers, I hope that does for now. ;)
2

Another option:

>>> mylist = [['xxx', 879], ['yyy', 315], ['xxx', 879], ['zzz', 171], ['yyy', 315]]
>>> y = []
>>> for x in mylist:
...     if x not in y:
...         y.append(x)
...
>>> y
[['xxx', 879], ['yyy', 315], ['zzz', 171]]
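
This version compares each sub-list against everything kept so far, so it takes quadratic time in the worst case, but it only needs equality, not hashing. The sketch below uses made-up data (sub-lists that themselves contain a list) to show a case where the tuple-and-set approaches fail and the scan still works.

```python
nested = [['xxx', [1, 2]], ['yyy', [3]], ['xxx', [1, 2]]]

# tuple(['xxx', [1, 2]]) still contains a list, so it cannot be hashed.
try:
    {tuple(row) for row in nested}
    hashing_worked = True
except TypeError:
    hashing_worked = False

# The equality-based scan only needs ==, at the cost of O(n^2) comparisons.
unique = []
for row in nested:
    if row not in unique:
        unique.append(row)

print(hashing_worked)  # False
print(unique)          # [['xxx', [1, 2]], ['yyy', [3]]]
```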
