Handle MemoryError in Python: Practical Patterns to Keep Memory Flat

I still remember the first time a Python job died with MemoryError. It wasn’t a fancy ML pipeline—just a “simple” loop that pulled records, appended a few fields, and wrote a report. It ran fine in dev, then cratered in production after a few hours. The painful part wasn’t the crash; it was the uncertainty: Where is the memory going, and why does it grow even when the loop body looks harmless?

When MemoryError shows up, you’re usually facing one of two realities: you’re genuinely trying to hold more data than the machine can support, or your program is accidentally retaining references (so nothing can be freed). Either way, you don’t fix it by sprinkling del statements and hoping for the best.

Here’s the approach I use in 2026-era Python codebases: I start by understanding what MemoryError means in CPython, then I categorize the failure mode (infinite work, unintended accumulation, or unbounded recursion), and finally I apply patterns that keep memory flat—streaming, chunking, bounded caches, and safe shutdown behavior. You’ll leave with runnable examples and a checklist you can apply to real services, scripts, and data jobs.

What MemoryError Actually Means (and What It Doesn’t)

In Python, MemoryError is an exception raised when the interpreter can’t allocate memory for an object. That sounds simple, but the surrounding details matter.

  • MemoryError is not a “bad pointer” crash. When Python raises this exception, it’s usually catching a failed allocation attempt (for example, malloc returning null, or an internal allocator refusing a request).
  • You can still see the OS kill your process with no exception. On Linux, the OOM killer may terminate a process outright. In containers, this can look like a sudden exit (often code 137) rather than a Python exception.
  • “Available memory” isn’t just RAM. Virtual memory limits, container limits, ulimit settings, fragmentation, and allocator behavior can all be part of the story.

A useful mental model: imagine you’re trying to put groceries away. MemoryError is the moment you open the cabinet and realize there’s no physical space left for the new item. But in Python, the cabinet might be “full” because you kept every previous grocery bag (references), or because you used many tiny bins that waste space (fragmentation), or because you’re only allowed one shelf (container limit).

CPython specifics I keep in mind

Most production Python runs on CPython. CPython has:

  • Reference counting as the primary mechanism for freeing objects when references drop to zero.
  • A cyclic garbage collector for reference cycles (containers referencing each other).
  • A memory allocator layer that can keep memory in arenas/pools for reuse, which means RSS (resident memory as seen by the OS) doesn’t always drop when you delete objects.

So yes, del big_list can reduce Python-level references—but your OS-level memory graph may not immediately fall. That’s why I rely on measurements over vibes.
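A tiny runnable illustration of that reference behavior. The `Buffer` subclass and the weakref probe are only there to make liveness observable (plain lists don't accept weak references); the names are illustrative:

```python
import weakref

class Buffer(list):
    """A list subclass so we can attach a weak reference (built-in lists do not allow one)."""

big_buffer = Buffer(range(1000))
alias = big_buffer                  # a second strong reference to the same object
probe = weakref.ref(big_buffer)     # observes liveness without keeping the object alive

del big_buffer                      # drops one name; the object survives via `alias`
still_alive = probe() is not None   # True: a strong reference remains

del alias                           # last strong reference gone: refcount hits zero
freed = probe() is None             # True (in CPython): deallocated immediately

print(still_alive, freed)
```

The point: `del` removes a name, not an object. Until every reference is gone, the memory stays claimed, no matter how many `del` statements you sprinkle.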

OS OOM vs Python MemoryError: Different Failures, Different Fixes

Before I touch code, I always confirm which kind of “out of memory” I’m dealing with. The fixes can be totally different.

Case A: Python raises MemoryError

This usually means CPython attempted to allocate (for a list resize, a dict growth, a string concatenation, a bytes object, a numpy array, etc.) and the allocator said “no.” You’ll see a stack trace pointing at the allocation site (or close to it).

What I do next:

  • Identify the exact object being allocated (a list growth? a giant bytes read? a sorted() materialization?).
  • Decide whether the allocation is necessary (some are) and if it can be bounded.

Case B: The OS kills the process (no Python exception)

This happens commonly under container memory limits or strict cgroup enforcement. Your logs might show nothing except a sudden stop.

What I do next:

  • Confirm memory limit and usage at runtime (Kubernetes limits, Docker --memory, systemd slice limits, etc.).
  • Look for sudden RSS growth or slow linear growth.
  • Reduce peak memory (chunking, streaming) and reduce overhead (data structures and copies).

Case C: The process “hangs” and memory climbs slowly

This is often unintended accumulation: caches, queues, lists, dicts, closures holding references, or “just in case” debug logging that collects everything.

What I do next:

  • Add a heartbeat that reports memory and key counters.
  • Take allocation snapshots at intervals.
  • Find what grows with iteration count.

That distinction matters because “add del” fixes none of these reliably. It can help release references, but it can’t:

  • change the architecture from unbounded to bounded
  • prevent accidental copies
  • shrink a container limit
  • undo fragmentation patterns

The Loop Patterns That Commonly Trigger Memory Blowups

When a MemoryError appears “in a loop,” the loop is rarely the root cause. The loop is the place where time passes—and where unbounded growth becomes visible.

I bucket loop-driven memory failures into three patterns:

  • Infinite work (the loop never stops)
  • Unintended accumulation (the loop keeps storing)
  • Unbounded recursion / missing base case (stack growth until failure)

1) Infinite loops: memory grows because the program never reaches cleanup

An infinite loop by itself doesn’t guarantee memory growth. The real danger is an infinite loop that continuously allocates.

Here’s a minimal example that will keep eating memory:

def leak_forever():
    items = []
    while True:
        # Each iteration adds a new object, and nothing ever gets released.
        items.append(bytearray(1024 * 1024))  # +1 MiB each loop

leak_forever()

If you intended a “daemon-like” loop, you need explicit backpressure and periodic cleanup. A safer pattern is:

  • bound the data you keep
  • sleep or wait on I/O
  • log progress
  • break on thresholds

import time
from collections import deque

def run_worker(max_buffer_items: int = 1000):
    recent_payloads = deque(maxlen=max_buffer_items)  # bounded memory
    while True:
        payload = b'x' * 1024  # pretend this came from a socket
        recent_payloads.append(payload)
        # do work...
        time.sleep(0.01)  # yield time back; avoids runaway CPU too

run_worker()

The key detail is deque(maxlen=...): it makes “keep the last N items” a fixed-cost decision.
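For example, a deque with maxlen=3 silently discards the oldest entries as new ones arrive:

```python
from collections import deque

last_three = deque(maxlen=3)
for i in range(10):
    last_three.append(i)   # once full, each append evicts the oldest item

print(list(last_three))    # [7, 8, 9]
```

Ten appends, three items retained: memory cost is fixed regardless of how long the loop runs.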

2) Unintended accumulation: the loop keeps references alive

This is the most common issue I see in real code. The loop appends to a list, caches results, aggregates logs, or builds a giant string.

Problem version:

def load_all_user_events(event_iterable):
    events = []
    for event in event_iterable:
        events.append(event)  # grows without bound
    return events

Sometimes you truly need all events. Often you don’t—you need a summary, a stream transform, or an on-disk output.

A streaming alternative:

def count_user_events(event_iterable):
    count = 0
    for _ in event_iterable:
        count += 1
    return count

Or write as you go:

import json

def write_events_ndjson(event_iterable, out_path: str):
    with open(out_path, 'w', encoding='utf-8') as f:
        for event in event_iterable:
            f.write(json.dumps(event))
            f.write('\n')

The difference is not stylistic—it’s whether memory stays near-constant.

3) Loops without a base case: recursion that never bottoms out

Recursion is elegant until it isn’t. Missing base cases (or base cases that are never reached) can push memory pressure via stack frames and retained objects.

Broken recursion:

def walk_forever(n: int) -> int:
    return walk_forever(n - 1)

walk_forever(5)

In practice, you’ll usually hit RecursionError first, but recursion can also retain large objects per frame (like slices or partial results), leading to severe memory growth.
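A sketch of observing that failure mode safely; the lowered limit is only there to keep the demo quick, and it is restored afterward:

```python
import sys

def walk_forever(n: int) -> int:
    return walk_forever(n - 1)   # no base case: every call adds a stack frame

old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(500)       # keep the demo short; CPython's default is usually 1000
try:
    walk_forever(5)
    outcome = 'finished'
except RecursionError:
    outcome = 'hit the recursion limit'
finally:
    sys.setrecursionlimit(old_limit)

print(outcome)
```

The stack limit catches you here; the memory danger appears when each frame also pins large objects, which the limit does nothing about.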

Safer alternatives:

  • ensure a correct base case
  • convert to an explicit stack/queue loop

def countdown_iterative(n: int) -> int:
    while n > 0:
        n -= 1
    return 0

countdown_iterative(5)

If you genuinely need recursion (tree traversals), I either keep per-frame state minimal or I switch to an iterative traversal.

Measure Memory Growth Before You Change Code

I treat memory fixes like performance fixes: measure first, then change one thing at a time.

tracemalloc: built-in and surprisingly effective

tracemalloc tracks Python allocations (not every native allocation, but enough to find the hot lines).

import tracemalloc

def build_big_list(n: int):
    data = []
    for i in range(n):
        data.append('user_id=' + str(i))
    return data

tracemalloc.start()
build_big_list(500_000)
current, peak = tracemalloc.get_traced_memory()
print(f'Current: {current/1024/1024:.1f} MiB')
print(f'Peak: {peak/1024/1024:.1f} MiB')

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:5]:
    print(stat)

What I look for:

  • lines with unexpectedly large allocation counts
  • growth that correlates with iterations
  • objects that never drop

A simple loop-level “memory heartbeat”

In long-running loops, I add a periodic heartbeat. In scripts, I sometimes use resource (Unix). Here’s a portable approach that doesn’t require third-party deps: track your own counters plus tracemalloc peak.

import tracemalloc

def process_items(items):
    tracemalloc.start()
    for idx, item in enumerate(items, start=1):
        _ = item.upper()  # placeholder
        if idx % 100_000 == 0:
            current, peak = tracemalloc.get_traced_memory()
            print(f'{idx=:,} current={current/1024/1024:.1f}MiB peak={peak/1024/1024:.1f}MiB')

process_items(str(i) for i in range(1_000_000))

This doesn’t tell you RSS, but it gives you a stable signal for “Python allocations are rising.” If current rises linearly with iterations, you have accumulation.

A Repeatable Triage Workflow (What I Do in Real Incidents)

When something crashes at 2 a.m., I don’t want “maybe it’s a leak” vibes. I want a deterministic workflow.

Step 1: Confirm the symptom

I ask:

  • Do we see a Python stack trace with MemoryError?
  • Or do we see a sudden exit (container OOM / SIGKILL)?
  • Is it reproducible locally with smaller input?

If I can reproduce with smaller input, I’m usually minutes away from finding the culprit.

Step 2: Add one tiny, low-risk visibility hook

I add one of these (depending on environment):

  • tracemalloc heartbeat
  • periodic snapshots
  • a bounded debug log of “last N items processed”

The point isn’t perfect instrumentation. It’s to answer: does memory grow with time? with number of items? with specific types of items?

Step 3: Identify the growth class

I classify what I see into one of these:

  • Linear growth with items processed → accumulation (list/dict/cache/queue)
  • Stepwise growth around certain operations → hidden copy or a large materialization (sorted, list, read, json.loads on huge strings)
  • Spiky growth with periodic drops → batch pattern; likely need smaller batches or streaming output
  • Growth without Python allocation growth → native memory (C extensions, large buffers, image libs, some ML stacks)

Step 4: Apply the smallest architectural bound

I choose the smallest change that puts a ceiling on memory:

  • convert list materialization to generator + streaming writer
  • add maxlen / maxsize
  • chunk and clear
  • spill to disk (SQLite/tempfile) when in-memory aggregation exceeds a cap

Then I re-measure.

Patterns That Keep Memory Flat: Generators, Chunking, and Bounded State

If you want fewer MemoryError incidents, the highest-value shift is architectural: stop making “read everything into RAM” your default.

Generator expressions instead of list-building

Problem:

squares = [i * i for i in range(50_000_000)]

Better when you can stream:

squares = (i * i for i in range(50_000_000))

Then consume incrementally:

total = 0
for value in squares:
    total += value
print(total)

In real services, this can be the difference between a job that runs in a few hundred MiB and one that tries to allocate many GiB.

Chunk your work explicitly

If you have to process a huge dataset, do it in chunks. Here’s a clean chunking helper:

from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar('T')

def chunked(iterable: Iterable[T], size: int) -> Iterator[List[T]]:
    if size <= 0:
        raise ValueError('size must be > 0')
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

for batch in chunked(range(10_000_000), 100_000):
    # Process a bounded batch
    _ = batch[0] + batch[-1]

Chunking also improves latency control: you can checkpoint progress and emit partial results.

Don’t grow strings in a loop

This one still bites experienced devs.

Problem:

log_text = ''
for i in range(1_000_000):
    log_text += f'line {i}\n'  # repeated reallocation and growth

Better:

lines = []
for i in range(1_000_000):
    lines.append(f'line {i}\n')
log_text = ''.join(lines)

Even better (stream to disk):

with open('app.log', 'w', encoding='utf-8') as f:
    for i in range(1_000_000):
        f.write(f'line {i}\n')

If you only need the last N lines for debugging, keep a bounded deque.

Prefer iterating over files instead of reading them whole

Problem:

data = open('events.jsonl', 'r', encoding='utf-8').read().splitlines()

Better:

with open('events.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        line = line.rstrip('\n')
        # parse/process

Even for “medium” files, streaming makes behavior predictable. Predictable is what keeps production calm.

Data Structures and APIs That Secretly Multiply Memory Use

A lot of MemoryError incidents are not caused by “big data.” They’re caused by copying.

Watch for hidden copies

Common copy triggers:

  • slicing lists (items[10:100000]) creates a new list
  • sorted(huge_iterable) materializes everything
  • list(generator) materializes everything
  • json.loads(huge_string) duplicates data structures (string + parsed objects)
  • DataFrame operations that create full copies (depends on operation)

If I’m reviewing a memory-heavy loop, I search for:

  • list(
  • .copy()
  • sorted(
  • += on containers
  • .extend( with huge iterables
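One quick way to check whether a slice copies: compare bytes slicing with memoryview slicing. The sizes below are illustrative:

```python
data = bytes(1_000_000)             # ~1 MB buffer of zeros

sliced = data[:500_000]             # bytes slicing allocates a brand-new copy
view = memoryview(data)[:500_000]   # memoryview slicing is a zero-copy window

print(sliced is data)    # False: independent object holding its own 500 KB
print(view.obj is data)  # True: the view borrows data's buffer, no copy made
print(len(view))         # 500000
```

In a hot loop over large buffers, swapping slices for memoryviews can remove an entire class of hidden duplication.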

Caches: the “helpful” memory leak

Caching can quietly eat memory because it’s doing its job: keeping objects alive.

If you use functools.lru_cache, set a maxsize and choose a realistic bound.

from functools import lru_cache

@lru_cache(maxsize=10_000)
def expensive_lookup(user_id: int) -> str:
    return f'profile:{user_id}'

If your key space is effectively unbounded (timestamps, request IDs), unlimited caching is a leak.

One thing I also watch: caching large values. Even with maxsize=10_000, if each value is 200 KB you’ve just bought yourself ~2 GB of cache.
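When values vary widely in size, I sometimes bound a cache by total bytes instead of entry count. A minimal sketch, assuming a single-threaded caller: the ByteBoundedCache name is mine, and sys.getsizeof measures only shallow size, so treat the accounting as approximate:

```python
import sys
from collections import OrderedDict

class ByteBoundedCache:
    """LRU cache bounded by (shallow) total value size rather than entry count."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self._data: OrderedDict = OrderedDict()
        self._bytes = 0

    def put(self, key, value) -> None:
        if key in self._data:
            self._bytes -= sys.getsizeof(self._data.pop(key))
        self._data[key] = value
        self._bytes += sys.getsizeof(value)
        # Evict oldest entries until we are back under budget.
        while self._bytes > self.max_bytes and self._data:
            _, evicted = self._data.popitem(last=False)
            self._bytes -= sys.getsizeof(evicted)

    def get(self, key, default=None):
        if key in self._data:
            self._data.move_to_end(key)   # mark as recently used
            return self._data[key]
        return default
```

The design choice: eviction is driven by the budget you actually care about (bytes), so one oversized value can't quietly blow past an entry-count limit.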

Prefer bounded queues for pipelines

If you’re building producer/consumer pipelines, use queue.Queue(maxsize=...) or an async queue with a size limit. Unlimited queues are just lists with better marketing.

A simple pattern that prevents “producer outruns consumer” memory blowups:

import queue
import threading

def producer(q: queue.Queue[bytes], count: int):
    for _ in range(count):
        q.put(b'x' * 1024)  # blocks when full
    q.put(b'')  # sentinel

def consumer(q: queue.Queue[bytes]):
    while True:
        item = q.get()
        if item == b'':
            break
        # process item

q: queue.Queue[bytes] = queue.Queue(maxsize=10_000)
t1 = threading.Thread(target=producer, args=(q, 1_000_000), daemon=True)
t2 = threading.Thread(target=consumer, args=(q,), daemon=True)
t1.start(); t2.start(); t1.join(); t2.join()

The key idea: backpressure is a memory safety feature.

Handling MemoryError Without Making Things Worse

Catching MemoryError is allowed, but I treat it like catching KeyboardInterrupt: do it for cleanup and controlled shutdown, not to “keep going like normal.”

A safe pattern: fail the current unit of work, release references, exit cleanly

Here’s a runnable pattern for batch processing:

from typing import Iterable, List

def process_batch(batch: List[int]) -> int:
    # Pretend we build something heavy.
    payload = [str(x) for x in batch]
    return len(payload)

def run_job(numbers: Iterable[int], batch_size: int = 200_000) -> int:
    processed = 0
    batch: List[int] = []
    try:
        for n in numbers:
            batch.append(n)
            if len(batch) >= batch_size:
                processed += process_batch(batch)
                batch.clear()  # release references in-place
        if batch:
            processed += process_batch(batch)
            batch.clear()
    except MemoryError:
        # Controlled failure path: drop references and report.
        batch.clear()
        raise RuntimeError('Out of memory while processing; reduce batch_size or stream output')
    return processed

print(run_job(range(1_000_000)))

Notes I care about:

  • batch.clear() keeps the same list object but releases references, which is helpful if the list is reused.
  • I re-raise a domain-level exception with guidance. In services, I also log context (batch size, input source, recent counters).

When I do NOT catch MemoryError

  • inside low-level utility functions where the caller can’t safely recover
  • when continuing would corrupt state (partial writes without a transactional boundary)
  • when the process is already unstable

In a web service, I’d rather fail a request fast than limp along and trigger cascading failures.

Add guardrails: explicit limits beat surprise crashes

If your API accepts limit=100000000, that’s not “flexibility,” it’s an incident waiting to happen.

I often add:

  • maximum page sizes
  • maximum query spans
  • maximum recursion depth (or iterative alternatives)
  • maximum in-memory aggregation size (then spill to disk)

A simple example:

def safe_range(limit: int, hard_cap: int = 5_000_000):
    if limit > hard_cap:
        raise ValueError(f'limit too large: {limit} > {hard_cap}')
    return range(limit)

for i in safe_range(1_000_000):
    pass

This is boring code—and boring code is exactly what you want around memory safety.

Traditional vs Modern (2026) Ways I Prevent Memory Errors

I’ve watched teams “fix” memory errors by adding RAM, then hit the same wall again at the next scale bump. The durable fix is picking patterns that don’t require linear memory.

| Problem | Traditional approach | Modern approach (2026) | What I recommend |
| --- | --- | --- | --- |
| Process huge dataset | Load into list, process | Stream, chunk, write incremental outputs | Stream first, chunk when needed |
| Compute derived values | List comprehensions everywhere | Generator expressions and iterators | Default to iterators |
| Data interchange | Build giant JSON in memory | NDJSON / streaming JSON writers | Prefer line-delimited output with incremental writes |

Practical Scenarios (Where MemoryError Actually Comes From)

I find it easier to solve memory problems when I map them to concrete scenarios. Here are the ones that show up repeatedly.

Scenario 1: “I’m building a report, why is it huge?”

The classic pattern is:

  • read all rows
  • enrich each row
  • keep all enriched rows
  • convert to JSON/CSV at the end

This fails because you hold:

  • the original rows (if referenced)
  • the enriched rows
  • the output string/buffer

A flat-memory approach is to stream output as you go.

For CSV, the csv module is already streaming-friendly:

import csv
from typing import Iterable, Dict, Any

def write_csv_rows(rows: Iterable[Dict[str, Any]], out_path: str) -> None:
    with open(out_path, 'w', newline='', encoding='utf-8') as f:
        writer = None
        for row in rows:
            if writer is None:
                writer = csv.DictWriter(f, fieldnames=list(row.keys()))
                writer.writeheader()
            writer.writerow(row)

The key move: don’t collect rows unless you have a specific reason.

Scenario 2: “My service is fine for hours, then dies”

This is usually a slow accumulation:

  • request logs stored in memory
  • a cache with an unbounded key space
  • a queue that grows during traffic spikes
  • a metrics label explosion (high-cardinality tags) that stores ever-growing dictionaries

The fix is almost always: bound it.

  • deque(maxlen=N) for “last N events”
  • lru_cache(maxsize=N) plus careful key choices
  • Queue(maxsize=N) for pipelines
  • reject high-cardinality metrics labels

Scenario 3: “I’m parsing JSON, why does it explode?”

Large JSON is dangerous because you often have the raw string plus the parsed object tree.

Common trap:

import json

def parse_big_json(path: str):
    raw = open(path, 'r', encoding='utf-8').read()
    return json.loads(raw)

Here you hold the entire file contents as a string and the parsed structure at the same time.

If you control the data format, switching to NDJSON (one JSON object per line) is a massive memory win because you can parse one line at a time.

import json

def iter_ndjson(path: str):
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            yield json.loads(line)

This is one of the biggest “format decisions” I make for memory stability.

Scenario 4: “I’m just using Pandas / NumPy / image libs”

Sometimes Python-level allocations aren’t the main culprit.

  • NumPy arrays allocate large contiguous native buffers.
  • DataFrame operations can materialize full copies.
  • Image processing libraries may allocate multiple intermediate buffers.

In these cases, tracemalloc can under-report the true footprint because much of it is native memory.

My practical move is to:

  • reduce peak intermediate allocations (avoid chaining operations that create copies)
  • process in tiles/chunks (images) or partitions (data)
  • ensure I don’t keep references to old arrays/DataFrames

Even without specialized tooling, architectural bounding still works.

Memory Efficiency: The Object Overhead Tax (Why “Just a List of Ints” Can Hurt)

A surprise for many teams: Python objects have overhead.

A list of a million integers isn’t “a million numbers.” It’s:

  • the list object
  • an array of a million pointers
  • a million separate int objects

That overhead can turn “reasonable” data volumes into memory disasters.
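The tax is easy to measure with sys.getsizeof. Exact numbers vary by platform and Python version, and small integers are shared in CPython, so treat the list total as an upper bound:

```python
import sys
from array import array

nums = list(range(100_000))
packed = array('l', nums)   # one contiguous C buffer of signed longs

# Shallow sizes only; counting each shared small int overstates slightly.
list_total = sys.getsizeof(nums) + sum(sys.getsizeof(n) for n in nums)
array_total = sys.getsizeof(packed)

print(f'list of ints: ~{list_total / 1024:.0f} KiB (pointer array + int objects)')
print(f"array('l'):   ~{array_total / 1024:.0f} KiB (packed machine values)")
```

On a typical 64-bit build the packed array comes out several times smaller than the list plus its boxed ints.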

Better representations when memory matters

I don’t optimize prematurely, but when I see memory pressure, I ask whether the representation is the real issue.

Options I reach for:

  • array('I') / array('L') for packed numeric data
  • bytes / bytearray for raw binary
  • memoryview to slice without copying
  • dataclasses with slots=True (or plain classes with slots) when creating many small records

Example: slots to reduce per-object overhead in large collections:

from dataclasses import dataclass

@dataclass(slots=True)
class Event:
    user_id: int
    ts: int
    kind: str

If you store millions of small objects, this can meaningfully reduce memory.

Avoid accidental duplication with “convenient” transforms

I also watch for patterns like:

  • records = [transform(x) for x in records] (keeps both old and new if old is referenced elsewhere)
  • records += more_records where more_records is huge
  • list(map(...)) where streaming would do

Sometimes the fix is as simple as reworking one line so you don’t hold both the input and output at the same time.

Garbage Collection and Reference Cycles: When Memory Doesn’t Drop

Most of the time, reference counting frees objects immediately. But reference cycles can delay collection.

Common cycle pattern:

  • objects reference each other
  • they stay alive even when you think you’re done

I don’t recommend reaching for GC knobs first, but I do keep two practical rules:

1) In long-running jobs that build lots of temporary cyclic structures, a periodic gc.collect() can be a pragmatic stopgap.

2) It’s usually better to remove the cycle than to tune GC.

Example: avoid keeping objects mutually referencing each other when a simple ID reference would do.
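A minimal demonstration of such a cycle; gc.collect() returns the number of unreachable objects it found, which lets you see the cycle being reclaimed:

```python
import gc

class Node:
    def __init__(self):
        self.peer = None

gc.collect()             # clean slate so the next count is meaningful

a, b = Node(), Node()
a.peer, b.peer = b, a    # reference cycle: neither refcount can ever reach zero

del a, b                 # refcounting alone cannot free these now
found = gc.collect()
print(found >= 2)        # True: the two Nodes (plus their __dict__s) were cyclic garbage
```

Replacing one of the object references with a plain ID (a.peer_id = some_int) breaks the cycle and lets refcounting free both objects immediately.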

I also watch for cycles created by:

  • closures capturing large objects
  • logging handlers storing request objects
  • callbacks with references to huge context objects

If memory grows slowly and never returns, a cycle is on my suspect list.

Advanced: Fragmentation and “Why RSS Won’t Go Down”

This is the part where people get frustrated: “I cleared the list, why does memory stay high?”

A few realities:

  • CPython may keep memory arenas around for reuse.
  • The OS may not reclaim pages immediately.
  • Native extensions can keep their own pools.

My practical stance:

  • For short-lived batch jobs, I care about peak memory and completion, not whether RSS returns to baseline mid-run.
  • For long-running services, I care whether memory trends upward over time.

If memory goes up and stabilizes, that’s often acceptable.

If it climbs without bound, that’s a leak or unbounded accumulation.

Tooling I Actually Use (Beyond tracemalloc)

Built-in tools get you surprisingly far, but sometimes you need better visibility.

RSS reporting (process-level)

If I’m allowed third-party dependencies, I’ll use psutil to log RSS periodically. If not, I still try to capture OS-level usage via platform tools.

With psutil, the heartbeat becomes more honest:

# Requires: psutil
import os
import psutil

def rss_mib() -> float:
    proc = psutil.Process(os.getpid())
    return proc.memory_info().rss / 1024 / 1024

# In your loop:
# print(f'rss={rss_mib():.1f}MiB')

The reason I like RSS: it matches what the OS cares about.

Object growth visualization

When I suspect a leak, I want to answer: what object types are increasing?

Tools I’ve used successfully:

  • objgraph (great for seeing growth by type)
  • pympler (high-level memory summaries)

Even if you don’t keep them permanently, they’re fantastic for one-off investigations.

Snapshot diffing: what changed between “healthy” and “bad”

A pattern that works well:

  • take a tracemalloc snapshot at iteration 10k
  • take another at iteration 200k
  • compare the top differences

That tells you which lines are responsible for net growth, not just peak allocation.
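With tracemalloc, that diff is only a few lines; Snapshot.compare_to returns the net change per source line, sorted largest first:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

grown = ['payload-%d' % i for i in range(50_000)]   # simulate per-iteration accumulation

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# StatisticDiff entries come back sorted by largest absolute size change first.
for diff in after.compare_to(before, 'lineno')[:3]:
    print(diff)
```

Entries with a large positive size_diff at the top of that list are your accumulation suspects.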

Practical Recipes (Copy/Paste Patterns I Use)

These are small, boring utilities that prevent memory incidents.

Recipe 1: Bounded “last N errors” for debug context

Instead of storing every failure detail forever:

from collections import deque
from typing import Deque, Any

last_errors: Deque[Any] = deque(maxlen=200)

def record_error(err: Exception) -> None:
    last_errors.append(repr(err))

Now you have context without a memory leak.

Recipe 2: Spill-to-disk when aggregation grows too large

Sometimes you really do need to aggregate (grouping, dedupe, join). The move is to cap in-memory state and spill.

A minimal approach is SQLite (built-in):

import sqlite3

def make_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute('create table if not exists seen (k text primary key)')
    return conn

def seen_before(conn: sqlite3.Connection, key: str) -> bool:
    try:
        conn.execute('insert into seen(k) values (?)', (key,))
        conn.commit()
        return False
    except sqlite3.IntegrityError:
        return True

This trades RAM for disk, which is often exactly what you want when RAM is the bottleneck.

Recipe 3: External sort mindset (avoid sorted() on huge inputs)

sorted() materializes everything. If you need sorted output for huge datasets, consider:

  • sorting chunks
  • writing sorted chunks to disk
  • merging streams

Even a simple “sort chunks, write temp files, merge” approach can keep memory bounded.

The conceptual takeaway: whenever you see sorted(huge_iterable), ask if you’re about to pay an unbounded memory bill.
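A compact stdlib-only sketch of that mindset; external_sort is a hypothetical helper name, and heapq.merge does the lazy k-way merge of the sorted spill files:

```python
import heapq
import tempfile
from typing import Iterable, Iterator

def external_sort(numbers: Iterable[int], chunk_size: int = 100_000) -> Iterator[int]:
    """Sort a large stream of ints in bounded memory: sort chunks, spill, merge."""
    chunk_files = []
    chunk: list = []

    def spill() -> None:
        f = tempfile.TemporaryFile(mode='w+')          # deleted automatically on close
        f.writelines(f'{n}\n' for n in sorted(chunk))  # only one chunk is in RAM
        f.seek(0)
        chunk_files.append(f)
        chunk.clear()

    for n in numbers:
        chunk.append(n)
        if len(chunk) >= chunk_size:
            spill()
    if chunk:
        spill()

    # heapq.merge consumes the sorted streams lazily, one line at a time.
    streams = [(int(line) for line in f) for f in chunk_files]
    yield from heapq.merge(*streams)
```

Peak memory is roughly one chunk plus one line per spill file, no matter how large the input stream is.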

Edge Cases That Bite Even Experienced Developers

A few patterns I watch for specifically because they look harmless.

“I used a generator, why is memory still growing?”

Because something downstream is collecting it.

Common culprits:

  • list(generator)
  • passing a generator into an API that materializes it
  • logging/printing results that implicitly build large strings

I always follow the data: it’s not enough to produce an iterator; you need to consume it in a streaming way too.

“I clear the list, but memory still climbs”

Often it’s not the list:

  • a second structure (dict/set) is growing
  • a cache key space is exploding
  • a queue backs up under load
  • a closure retains old references

This is where snapshot diffing and object-growth tools shine.

“It only happens on some inputs”

That’s a hint that one input triggers a worst-case path:

  • unusually large payload
  • path that enables caching
  • path that creates a huge intermediate object

I handle this by adding input-size guardrails and failing fast with clear errors.

Production Considerations: Make Memory a First-Class Constraint

Most memory incidents aren’t “bugs,” they’re missing constraints.

Add memory limits intentionally

Whether you run in containers or on VMs, decide:

  • max memory per process
  • max concurrency based on per-request memory usage

If each request can temporarily allocate 200 MB, then 20 concurrent requests can kill a 4 GB container. That’s not a Python problem—it’s a sizing and backpressure problem.

Prefer bounded concurrency over unlimited parallelism

A common source of MemoryError is over-parallelizing:

  • ThreadPoolExecutor with too many workers
  • ProcessPoolExecutor that multiplies memory per process
  • async tasks that buffer huge responses concurrently

I usually set concurrency based on memory, not CPU.
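One way to enforce that: gate submissions with a semaphore so in-flight work, and therefore peak memory, stays bounded. This is a sketch; run_bounded and max_in_flight are my names for the pattern:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_bounded(tasks, max_in_flight: int = 4):
    """Submit callables so at most max_in_flight are queued or running at once."""
    gate = threading.Semaphore(max_in_flight)
    futures = []
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        for task in tasks:
            gate.acquire()                              # blocks the producer: backpressure
            fut = pool.submit(task)
            fut.add_done_callback(lambda _: gate.release())
            futures.append(fut)
        return [f.result() for f in futures]            # results in submission order
```

If each task can hold ~200 MB at peak, max_in_flight is your memory budget divided by that figure, not your CPU count.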

Emit “memory health” signals

I like having at least one of these in long-running services:

  • RSS gauge
  • Python allocation trend (optional)
  • queue depth
  • cache size

Memory is a resource like CPU. Treating it as “invisible” is how you get surprised.


Checklist: My “No Surprises” MemoryError Playbook

When I’m done fixing a memory issue, I like having a checklist that makes the fix durable.

  • Confirm failure mode: Python MemoryError vs OS/container OOM
  • Add a heartbeat: item count + memory + queue depth
  • Prove growth: does memory rise with iterations?
  • Find materializations: list(...), sorted(...), read(), giant json.loads(...)
  • Replace unbounded state with bounds: maxlen, maxsize, hard caps
  • Stream output: write incrementally rather than building mega-strings
  • Chunk processing: bounded batches + clear() after each batch
  • Validate worst-case inputs: oversized payloads should fail fast
  • Re-run with measurement: ensure memory plateaus, not climbs

If you adopt just two habits—(1) stream by default and (2) bound anything that can grow—you’ll prevent the majority of MemoryError incidents I see in real-world Python systems.
