Are Sets Mutable in Python? Practical Guidance for 2026

I still remember the first time a production bug came down to a set that “mysteriously” changed after I passed it into a helper function. The code looked clean, the tests were green, and yet a user’s notification preferences vanished. The root cause wasn’t exotic—just a misunderstanding about mutability. If you work in Python long enough, this question will pop up: are sets mutable? The short answer is yes, but the nuance is where real-world reliability lives. When you know which operations mutate a set, which ones return a new object, and which element types are even allowed inside, you stop guessing and start predicting behavior. That’s where I want to take you: from “I think sets are mutable” to “I can reason about every edge case.” You’ll see how to add and remove elements safely, how to avoid the classic “unhashable type” trap, how set operations affect performance, and when a frozen alternative is the safer call. I’ll also show patterns I use in modern Python work—testing, type hints, and debugging workflows that save hours.

Mutability, in plain terms

When I say a structure is mutable, I mean I can change its contents without creating a brand-new object. A list is mutable: you can append, remove, and update elements in place. A tuple is not: you must create a new tuple to represent a change. A set sits with lists on this question—sets are mutable. You can add elements, remove elements, and update them via set operations. The set object stays the same identity, but its members change.

I explain it to juniors using a whiteboard analogy. A list is a whiteboard where the order of notes matters. A set is a whiteboard where only the presence of a note matters, not where it is. You can erase notes and add new ones on both boards. That’s mutability. The important twist is that each note must be written in ink that won’t smear. In Python terms, set elements must be hashable (immutable). That’s why you can put integers, strings, tuples, and frozensets inside a set, but not lists or dictionaries.

Mutability isn’t just academic. It affects function behavior, caching, thread safety, and even how you test code. If you pass a set into a function and the function mutates it, the caller sees the change. If you expect a pure function and you get side effects, your results can shift in hard-to-reproduce ways. In my experience, this is one of the top two causes of “works on my machine” bugs with data pipelines.

The operations that mutate sets

When you work with sets, it helps to group operations into two buckets: in-place mutations and new-object results. If you keep that mental map, you’ll never be surprised.

Common in-place mutations

  • add(elem) adds a single element to the set.
  • update(iterable) adds multiple elements.
  • remove(elem) removes a specific element and raises an error if missing.
  • discard(elem) removes a specific element but does not error if missing.
  • pop() removes and returns an arbitrary element.
  • clear() empties the set.
  • differenceupdate, intersectionupdate, symmetricdifferenceupdate update in place.

Here is a runnable example that shows the set identity staying constant while its contents change:

# A simple set that we will mutate

active_users = {"lena", "omar", "vi", "chris"}

beforeid = id(activeusers)

# Add a new user

active_users.add("sanjay")

# Remove a user if present

active_users.discard("vi")

afterid = id(activeusers)

print(active_users)

print("same object:", beforeid == afterid)

If you run this, you’ll see the set contents change, but beforeid == afterid stays True. That’s mutability, concrete and undeniable.

Operations that return a new set

These operations are non-mutating; they give you a brand-new set:

  • union, intersection, difference, symmetric_difference
  • The operators |, &, -, ^

Example:

team_a = {"lena", "omar", "vi"}

team_b = {"chris", "omar", "sanjay"}

combined = teama.union(teamb)

print("teama:", teama)

print("teamb:", teamb)

print("combined:", combined)

Because union returns a new set, teama and teamb are unchanged. I prefer using these non-mutating methods inside functions meant to be pure; it helps keep side effects away from business logic.

What it really means that elements must be immutable

Sets are mutable, but their elements must be immutable. That sounds contradictory until you remember the core mechanism: hashing. A set uses a hash table to store elements, and the hash value must remain stable. If you could insert a list and then change its contents, the hash could change and the set would lose track of where it stored it.

Here’s the classic failure case:

shopping_cart = set()

item = ["coffee", 12]

shopping_cart.add(item)

This raises a TypeError: unhashable type: ‘list‘. The list can change, so Python forbids it as a set element. If you need a mutable container, wrap it in something immutable, like a tuple:

shopping_cart = set()

item = ("coffee", 12)

shopping_cart.add(item)

print(shopping_cart)

If the content is nested, your immutability has to be real all the way down. A tuple that contains a list is still unhashable. I recommend a rule of thumb: if you can change it in place, you can’t put it in a set.

Practical guidance I use

  • Use tuples for fixed-size records.
  • Use frozenset when you want a set to be an element of another set.
  • Avoid custom classes as set elements unless you fully control their hash and eq behavior.

Mutability and function design

This is where things get serious. Passing a mutable set into a function means the function can modify the caller’s data. Sometimes that is exactly what you want. Often it isn’t.

The mutating pattern (explicit)

I use this when the function’s purpose is to change the set, and the name or docstring makes that clear:

def markverified(emails, verifiedemail):

# Mutates the input set; caller expects this

emails.add(verified_email)

verified = {"[email protected]", "[email protected]"}

mark_verified(verified, "[email protected]")

print(verified)

The pure pattern (safe by default)

When I want to keep the input safe, I create a new set inside the function:

def withverified(emails, verifiedemail):

# Returns a new set; input remains unchanged

updated = set(emails)

updated.add(verified_email)

return updated

verified = {"[email protected]", "[email protected]"}

updated = with_verified(verified, "[email protected]")

print("original:", verified)

print("updated:", updated)

The second approach costs a bit of memory, but it reduces surprise. For code that runs in response to user actions or API requests, I choose safety over a tiny speed gain. The only time I prefer the mutating pattern is when performance or memory is a real constraint, or when I’m clearly working in an in-place API (like set.update).

Common mistakes and how I avoid them

I’ve seen the same mistakes repeat for years. Here are the ones that still bite experienced Python developers, and how I avoid them.

Mistake 1: Using a set as a default argument

This looks innocent but is a classic trap:

def add_tag(tags=set(), tag=None):

if tag:

tags.add(tag)

return tags

Default arguments are evaluated once, so the set is shared across calls. This can turn into a hidden global. The fix is to use None and create a new set inside:

def add_tag(tags=None, tag=None):

if tags is None:

tags = set()

if tag:

tags.add(tag)

return tags

Mistake 2: Confusing remove and discard

remove raises a KeyError when the element isn’t present. discard does not. If your data is noisy or user-driven, I use discard unless I truly need to detect missing data. This makes API behavior more stable under input variance.

Mistake 3: Expecting deterministic pop

set.pop() removes an arbitrary element, not the “first” one. Sets are unordered. If you need predictable removal, use a list or collections.deque, or convert the set to a sorted list. Treat pop as “give me any element to process now.”

Mistake 4: Mutating while iterating

Mutating a set while iterating over it raises a RuntimeError. You can work around it by iterating over a copy:

active_users = {"lena", "omar", "vi", "chris"}

for user in list(active_users):

if user.startswith("v"):

active_users.remove(user)

I use this pattern for cleanup tasks and batch maintenance jobs.

Mistake 5: Assuming set operations are all in-place

A

B creates a new set, A

= B updates in place. I always choose the operator that matches my intention; it saves me from accidental mutation later in the codebase.

When to use sets and when not to

Sets shine when you need membership checks, uniqueness, or set algebra. They are a natural fit for IDs, tags, feature flags, and de-duplication.

I use sets when:

  • I need to check membership many times (like “has this user already been processed?”).
  • I need to remove duplicates while keeping no order.
  • I want to do union, intersection, or difference quickly.

I avoid sets when:

  • I need ordering. A list or tuple is better, or a sorted list if I need consistent order.
  • I need to store unhashable items, like dictionaries.
  • I need to preserve duplicates. Use a list or collections.Counter.

If you find yourself adding and removing items while also caring about insertion order, consider dict instead. Since Python 3.7, dictionaries preserve insertion order, and you can use keys like a set. I often use a dict with dummy values when I need “ordered set” behavior.

Performance and scaling realities

Sets are fast for membership testing because they are hash-based. In practice, membership checks are typically constant time for each lookup, though there is overhead for hashing. I see performance wins when the alternative is repeatedly scanning a list.

Here’s a practical comparison I use to explain it:

  • List membership checks are like flipping through a notebook page by page.
  • Set membership checks are like going straight to a tab in a binder.

That said, sets aren’t “free.” Hashing can be non-trivial for large or complex elements, and building a set takes time. When you have tiny collections (say under 10 elements), a list can be competitive and simpler. I tend to switch to sets when membership checks happen inside loops or when the data is reused across many lookups.

Typical performance guidance I give teams

  • For a few dozen elements: either list or set is fine; choose clarity.
  • For hundreds or more: a set usually wins if you do many membership checks.
  • For heavy set algebra on large data: rely on built-in set operations, which are optimized in C.

I avoid exact timings because hardware, Python version, and input distribution matter. I frame it as “a handful of lookups” vs “repeated lookups inside loops.” That mental model stays reliable across projects.

Working patterns in 2026: testing, typing, and AI helpers

Modern Python work is more than raw syntax. I routinely combine set operations with type hints, tests, and AI-assisted tools, and it changes how I think about mutability.

Type hints that make intent clear

Type hints aren’t a runtime guarantee, but they make your intent explicit. I use set[str] to communicate that a function accepts a mutable set of strings, and frozenset[str] when I want immutability at the API boundary.

def approved_domains() -> frozenset[str]:

return frozenset({"example.com", "company.org"})

def is_allowed(email: str, domains: set[str]) -> bool:

# domains is mutable by design

return email.split("@")[-1] in domains

Type checkers and linters will flag accidental misuse. That’s one of the best low-cost defenses against mutation bugs.

Tests that lock in behavior

In production systems, I test both content and identity when mutability matters. For example:

def testmarkverifiedmutatesinput():

emails = {"[email protected]"}

before_id = id(emails)

mark_verified(emails, "[email protected]")

assert "[email protected]" in emails

assert id(emails) == before_id

This test makes it obvious to future readers that mutation is expected. If someone refactors the function to return a new set, the test fails, and you catch the behavioral shift early.

AI-assisted workflows

In 2026, I commonly use AI tools to draft tests, generate data, or spot mutation risks. The key is to prompt for specific checks: “Find places where a set is modified in place and suggest if it should be copied.” That yields actionable results instead of vague advice. I treat AI as a second set of eyes, not a source of truth. The real decision still sits with you and your codebase context.

Set operations: clear examples and real-world framing

Let’s walk through a few operations with concrete domain examples. I pick event processing because it’s a common case where sets simplify logic.

De-duplication and ingestion

Say you ingest event IDs from multiple services and need a unique set of IDs to process:

serviceaids = {"evt-100", "evt-101", "evt-102"}

servicebids = {"evt-101", "evt-103"}

uniqueids = serviceaids | serviceb_ids

print(unique_ids)

This gives you a new set of unique IDs. The original sets remain unchanged.

Filtering to a policy allowlist

Imagine you have a global allowlist and a dynamic set of candidate domains:

allowlist = {"example.com", "company.org", "partner.net"}

candidates = {"example.com", "random.io", "partner.net"}

approved = allowlist & candidates

print(approved)

Removing known-bad entries

You can subtract a blocklist:

blocklist = {"malware.net", "phish.io"}

incoming = {"example.com", "phish.io", "partner.net"}

safe = incoming – blocklist

print(safe)

Symmetric difference for change tracking

When I need to track what changed between two snapshots, I use symmetric difference:

before = {"lena", "omar", "vi"}

after = {"lena", "sanjay", "vi"}

changed = before ^ after

print(changed)

This tells me who was added or removed between two states.

Mutability and concurrency

If you’re working in multi-threaded or async code, mutability matters even more. A shared mutable set can cause race conditions if multiple threads modify it at the same time. I don’t recommend using a set as a shared mutable structure unless you wrap it with locks or use a concurrency-safe design.

In async code, a set can still be shared across tasks, and await points can interleave modifications. I prefer two patterns:

  • Pass copies into tasks when I don’t want shared mutation.
  • Use a single owner task that holds the set and exposes actions via queues or channels.

In high-concurrency systems, I sometimes switch to immutable frozenset snapshots for each step. It costs memory, but it removes a whole category of data races and makes the system easier to reason about.

Frozen sets and API design

A frozenset is an immutable set. It looks and behaves like a set for membership and algebra, but you can’t add or remove items. That makes it perfect for constants, configuration, and function outputs that you don’t want callers to mutate.

Example:

SUPPORTED_FORMATS = frozenset({"json", "csv", "parquet"})

def is_supported(fmt: str) -> bool:

return fmt in SUPPORTED_FORMATS

When I design public APIs, I often return frozenset from functions that expose internal data. That signals “read-only,” and it prevents subtle bugs where a caller mutates shared state by accident.

Traditional vs modern approach

Here is a quick comparison I give teams deciding between mutable and immutable sets:

Pattern

Traditional Approach

Modern Approach (2026) —

— Global configuration

Mutable set in module scope

frozenset constants with type hints Passing data between layers

In-place updates in helper functions

Functions return new sets; inputs stay unchanged Debugging state changes

Print statements

Tests + trace tools + mutation-focused reviews Concurrency

Shared mutable set with locks

Immutable snapshots + single-owner task

I’m not saying you should always pick immutability. I am saying you should choose consciously, document it, and test it.

Edge cases that bite in production

These are less common, but they show up when your system grows.

Custom classes as elements

Custom objects can be in a set if they define hash and eq consistently. If you override eq without hash, the object becomes unhashable. That can break code in surprising ways. If you store custom objects in sets, I recommend using @dataclass(frozen=True) to lock down their hash behavior.

Mutation through aliases

If two variables refer to the same set, mutation through one changes the other. This seems obvious, but it is a common source of test flakiness. I use set(x) to create a copy when I need isolation.

Set of sets

A set can’t contain another set because sets are mutable. Use frozenset instead:

group_a = frozenset({"lena", "omar"})

group_b = frozenset({"vi", "sanjay"})

groups = {groupa, groupb}

print(groups)

This is useful in partitioning, clustering, and permission groups.

Practical decision checklist

When you’re deciding how to use a set in your own code, I use a small checklist:

  • Will this data change after creation? If yes, use a set; if no, consider frozenset.
  • Do I need to preserve order or duplicates? If yes, use a list or dict.
  • Is this set shared across functions? If yes, decide whether mutation is expected and test for it.
  • Are elements hashable and stable? If not, convert to immutable types.
  • Will this run in concurrent code? If yes, avoid shared mutation.

This keeps my code predictable and makes maintenance easier for teammates who join later.

A deeper example: audit log filtering

Here’s a more complete example that mirrors a common backend task: filtering an audit log with allowlists and blocklists, while making mutability explicit. I chose this to show realistic data rather than toy values.

def filterauditevents(events, allowedactions, blockedusers):

# Create new sets to avoid mutating caller data

allowed = set(allowed_actions)

blocked = set(blocked_users)

filtered = []

for event in events:

# event is a dict with keys: user_id, action

if event["action"] in allowed and event["user_id"] not in blocked:

filtered.append(event)

return filtered

events = [

{"user_id": "u-101", "action": "login"},

{"user_id": "u-202", "action": "delete"},

{"user_id": "u-303", "action": "login"},

]

allowed_actions = {"login", "logout"}

blocked_users = {"u-202"}

result = filterauditevents(events, allowedactions, blockedusers)

print(result)

The function clones the incoming sets. That costs a bit of memory but guarantees the caller’s sets are intact. If you want a mutating variant for performance, you can design a separate function and name it clearly, like filterauditeventsinplace.

Where I land on the big question

So, are sets mutable in Python? Yes, absolutely. You can modify their contents after creation with methods like add, remove, and update, and those changes happen in place. At the same time, the elements inside a set must be immutable and hashable. That combination is the key: a mutable container that requires immutable members. Once you internalize that, many Python behaviors stop feeling mysterious.

If you take one practical rule from me, take this: decide up front whether you want to mutate or return a new set, and make that choice explicit in your code and tests. That one decision eliminates a large class of bugs. I also recommend that you use frozenset for constants and public API outputs, and that you treat mutation as something you opt into, not something you let happen by accident.

When you build larger systems, mutability becomes a team-wide policy rather than a personal style choice. Your future self—and the teammate who inherits your code—will thank you for making it clear when a set changes and when it does not. If you start doing that today, you’ll see fewer surprises, fewer late-night incident calls, and a cleaner mental model of how your Python programs behave.

Scroll to Top