
How can I use functools.lru_cache inside classes without leaking memory?

In the following minimal example, the foo instance won't be released even though it goes out of scope and has no referrer (other than the lru_cache):

from functools import lru_cache
class BigClass:
    pass
class Foo:
    def __init__(self):
        self.big = BigClass()
    @lru_cache(maxsize=16)
    def cached_method(self, x):
        return x + 5

def fun():
    foo = Foo()
    print(foo.cached_method(10))
    print(foo.cached_method(10)) # use cache
    return 'something'

fun()

But foo, and hence foo.big (a BigClass instance), is still alive:

import gc; gc.collect()  # collect garbage
len([obj for obj in gc.get_objects() if isinstance(obj, Foo)]) # is 1

That means that Foo/BigClass instances are still residing in memory. Even deleting Foo (del Foo) will not release them.

Why is lru_cache holding on to the instance at all? Doesn't the cache use some hash and not the actual object?

What is the recommended way to use lru_cache inside classes?

I know of two workarounds: use per-instance caches, or make the cache ignore the object (which might lead to wrong results, though).
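To confirm that the lru_cache really is the only referrer, one can clear the method's cache and watch the instance disappear (a standalone sketch repeating the class above):

```python
import gc
from functools import lru_cache

class Foo:
    @lru_cache(maxsize=16)
    def cached_method(self, x):
        return x + 5

def fun():
    Foo().cached_method(10)

fun()
gc.collect()
# The cache's key tuple holds a strong reference to the instance:
print(len([o for o in gc.get_objects() if isinstance(o, Foo)]))  # 1

Foo.cached_method.cache_clear()
gc.collect()
# Dropping the cache entries releases it:
print(len([o for o in gc.get_objects() if isinstance(o, Foo)]))  # 0
```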

4 Comments
  • To help others find the explanation: this seems to be the issue flake8-bugbear refers to in the warning B019 Use of 'functools.lru_cache' or 'functools.cache' on class methods can lead to memory leaks. The cache may retain instance references, preventing garbage collection. Commented Mar 21, 2022 at 7:21
  • I'm still curious about @televator's question: Why is lru_cache holding on to the instance at all? Doesn't the cache use some hash and not the actual object? Commented Mar 21, 2022 at 7:49
  • @akaihola Caching the hashes does not suffice. Hashing is for detecting difference (different hashes mean different objects), while when memoizing, it is detecting sameness (same argument list gives same result). Commented Jul 15, 2024 at 13:04
  • See docs.python.org/3/faq/programming.html#faq-cache-method-calls. Commented Mar 20, 2025 at 21:08

11 Answers

53

This is not the cleanest solution, but it's entirely transparent to the programmer:

import functools
import weakref

def memoized_method(*lru_args, **lru_kwargs):
    def decorator(func):
        @functools.wraps(func)
        def wrapped_func(self, *args, **kwargs):
            # We're storing the wrapped method inside the instance. If we had
            # a strong reference to self the instance would never die.
            self_weak = weakref.ref(self)
            @functools.wraps(func)
            @functools.lru_cache(*lru_args, **lru_kwargs)
            def cached_method(*args, **kwargs):
                return func(self_weak(), *args, **kwargs)
            setattr(self, func.__name__, cached_method)
            return cached_method(*args, **kwargs)
        return wrapped_func
    return decorator

It takes the exact same parameters as lru_cache and works exactly the same. However, it never passes self to lru_cache, and instead uses a per-instance lru_cache.


5 Comments

This has the slight strangeness to it that the function on the instance is only replaced by the caching wrapper on the first invocation. Also, the caching wrapper function is not anointed with lru_cache's cache_clear/cache_info functions (implementing which was where I bumped into this in the first place).
This doesn't seem to work for __getitem__. Any ideas why ? It does work if you call instance.__getitem__(key) but not instance[key].
This will not work for any special method because those are looked up on the class slots and not in instance dictionaries. Same reason why setting obj.__getitem__ = lambda item: item will not cause obj[key] to work.
Any idea how to get this to work on 3.x? I get TypeError: wrapped_func() missing 1 required positional argument: 'self'
For frozen dataclasses, instead of setattr you need to use object.__setattr__ as mentioned in stackoverflow.com/a/58336722/9360161.
42

Simple wrapper solution

Here's a wrapper that will keep a weak reference to the instance:

import functools
import weakref

def weak_lru(maxsize=128, typed=False):
    'LRU Cache decorator that keeps a weak reference to "self"'
    def wrapper(func):

        @functools.lru_cache(maxsize, typed)
        def _func(_self, *args, **kwargs):
            return func(_self(), *args, **kwargs)

        @functools.wraps(func)
        def inner(self, *args, **kwargs):
            return _func(weakref.ref(self), *args, **kwargs)

        return inner

    return wrapper

Example

Use it like this:

class Weather:
    "Lookup weather information on a government website"

    def __init__(self, station_id):
        self.station_id = station_id

    @weak_lru(maxsize=10)
    def climate(self, category='average_temperature'):
        print('Simulating a slow method call!')
        return self.station_id + category

When to use it

Since the weakrefs add some overhead, you would only want to use this when the instances are large and the application can't wait for the older unused calls to age out of the cache.

Why this is better

Unlike the other answer, we only have one cache for the class and not one per instance. This is important if you want to get some benefit from the least recently used algorithm. With a single cache per method, you can set the maxsize so that the total memory use is bounded regardless of the number of instances that are alive.
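A standalone check (repeating the definitions above; 'KDCA_' is just a made-up station id) that the decorated class no longer leaks:

```python
import functools
import gc
import weakref

def weak_lru(maxsize=128, typed=False):
    'LRU Cache decorator that keeps a weak reference to "self"'
    def wrapper(func):
        @functools.lru_cache(maxsize, typed)
        def _func(_self, *args, **kwargs):
            return func(_self(), *args, **kwargs)
        @functools.wraps(func)
        def inner(self, *args, **kwargs):
            return _func(weakref.ref(self), *args, **kwargs)
        return inner
    return wrapper

class Weather:
    def __init__(self, station_id):
        self.station_id = station_id
    @weak_lru(maxsize=10)
    def climate(self, category='average_temperature'):
        return self.station_id + category

w = Weather('KDCA_')
w.climate()
w.climate()            # served from the cache
ref = weakref.ref(w)
del w                  # last strong reference gone; the cache key is only a weakref
gc.collect()
print(ref() is None)   # True
```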

Dealing with mutable attributes

If any of the attributes used in the method are mutable, be sure to add __eq__() and __hash__() methods:

class Weather:
    "Lookup weather information on a government website"

    def __init__(self, station_id):
        self.station_id = station_id

    def update_station(self, station_id):
        self.station_id = station_id

    def __eq__(self, other):
        return self.station_id == other.station_id

    def __hash__(self):
        return hash(self.station_id)
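With those methods in place, instances that compare equal share cache entries, because weak references hash and compare via their referents. A standalone sketch (repeating weak_lru from above; the calls list just counts real invocations):

```python
import functools
import weakref

def weak_lru(maxsize=128, typed=False):
    'LRU Cache decorator that keeps a weak reference to "self"'
    def wrapper(func):
        @functools.lru_cache(maxsize, typed)
        def _func(_self, *args, **kwargs):
            return func(_self(), *args, **kwargs)
        @functools.wraps(func)
        def inner(self, *args, **kwargs):
            return _func(weakref.ref(self), *args, **kwargs)
        return inner
    return wrapper

calls = []

class Weather:
    def __init__(self, station_id):
        self.station_id = station_id
    def __eq__(self, other):
        return self.station_id == other.station_id
    def __hash__(self):
        return hash(self.station_id)
    @weak_lru(maxsize=10)
    def climate(self, category='average_temperature'):
        calls.append(category)
        return self.station_id + category

a = Weather('KDCA_')
b = Weather('KDCA_')
a.climate()
b.climate()        # weakrefs compare via the referents' __eq__, so this is a hit
print(len(calls))  # 1
```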

3 Comments

Great answer @Raymond! Wish I could give you more upvotes :-)
Warning: if you pass maxsize=None, it doesn't fix the memory leak (python 3.11)
If you pass maxsize=None, you're explicitly asking for an unbounded cache with no evictions. So one could equally write: Warning: the tool is going to do exactly what you said you wanted.
36

An even simpler solution to this problem is to declare the cache in the constructor and not in the class definition:

from functools import lru_cache
import gc

class BigClass:
    pass
class Foo:
    def __init__(self):
        self.big = BigClass()
        self.cached_method = lru_cache(maxsize=16)(self.cached_method)
    def cached_method(self, x):
        return x + 5

def fun():
    foo = Foo()
    print(foo.cached_method(10))
    print(foo.cached_method(10)) # use cache
    return 'something'
    
if __name__ == '__main__':
    fun()
    gc.collect()  # collect garbage
    print(len([obj for obj in gc.get_objects() if isinstance(obj, Foo)]))  # is 0

4 Comments

Any explanation why this case works while the one in the question doesn't?
In this version the cache is local to the class instance, so when the instance is deleted, so is the cache. If you want a global cache that is resilient in memory, you need a different approach.
This approach is further discussed here: rednafi.github.io/reflections/…
The page pointed to by phispi's comment is now at rednafi.com/python/lru_cache_on_methods
25

I will introduce methodtools for this use case.

pip install methodtools to install https://pypi.org/project/methodtools/

Then your code will work just by replacing functools with methodtools.

from methodtools import lru_cache
class Foo:
    @lru_cache(maxsize=16)
    def cached_method(self, x):
        return x + 5

Of course, the gc test from the question now returns 0 as well.

6 Comments

You can use either one. methodtools.lru_cache behaves exactly like functools.lru_cache (it reuses functools.lru_cache internally), while ring.lru offers more features by reimplementing LRU storage in Python.
methodtools.lru_cache on a method uses a separate storage for each instance of the class, while the storage of ring.lru is shared by all the instances of the class.
Caveat emptor: As of mid-2024, methodtools, ring, and rope are all basically unmaintained. The last commit for methodtools was over 4 months ago. The last commit for rope was over a year ago. That's... not great. It's unclear whether either have been tested under Python 3.12, for example. Personally, I'd just copy-paste one of the trivial answers given above and call it a day. ¯\_(ツ)_/¯
@CecilCurry Hi, I am the maintainer. wirerope is a very small project which is not expected to need updates unless Python's function fundamentals change. methodtools.lru_cache works just like functools.lru_cache. Please report a bug if you find any problem. ring is out of maintenance, you are right.
You should mention in your answer that you are the author of this library.
12

The issue with this method is that self is an unused variable.

The simple solution is to make the method into a static method. That way, the instance isn't part of the cache.

class Foo:
    def __init__(self):
        self.big = BigClass()

    @staticmethod                   # <-- Add this line
    @lru_cache(maxsize=16)
    def cached_method(x):
        print('miss')
        return x + 5

Comments

7

Solution

Below a small drop-in replacement for (and wrapper around) lru_cache which puts the LRU cache on the instance (object) and not on the class.

Summary

The replacement combines lru_cache with cached_property. It uses cached_property to store the cached method on the instance on first access; this way the lru_cache follows the object and as a bonus it can be used on unhashable objects like a non-frozen dataclass.

How to use it

Use @instance_lru_cache instead of @lru_cache to decorate a method and you're all set. Decorator arguments are supported, e.g. @instance_lru_cache(maxsize=None).

Comparison with other answers

The result is comparable to the answers provided by pabloi and akaihola, but with a simple decorator syntax. Compared to the answer provided by youknowone, this decorator is type hinted and does not require third-party libraries (result is comparable).

This answer differs from the answer provided by Raymond Hettinger as the cache is now stored on the instance (which means the maxsize is defined per instance and not per class) and it works on methods of unhashable objects.

from functools import cached_property, lru_cache, partial, update_wrapper
from typing import Callable, Optional, TypeVar, Union

T = TypeVar("T") 

def instance_lru_cache(
    method: Optional[Callable[..., T]] = None,
    *,
    maxsize: Optional[int] = 128,
    typed: bool = False
) -> Union[Callable[..., T], Callable[[Callable[..., T]], Callable[..., T]]]:
    """Least-recently-used cache decorator for instance methods.

    The cache follows the lifetime of an object (it is stored on the object,
    not on the class) and can be used on unhashable objects. Wrapper around
    functools.lru_cache.

    If *maxsize* is set to None, the LRU features are disabled and the cache
    can grow without bound.

    If *typed* is True, arguments of different types will be cached separately.
    For example, f(3.0) and f(3) will be treated as distinct calls with
    distinct results.

    Arguments to the cached method (other than 'self') must be hashable.

    View the cache statistics named tuple (hits, misses, maxsize, currsize)
    with f.cache_info().  Clear the cache and statistics with f.cache_clear().
    Access the underlying function with f.__wrapped__.

    """

    def decorator(wrapped: Callable[..., T]) -> Callable[..., T]:
        def wrapper(self: object) -> Callable[..., T]:
            return lru_cache(maxsize=maxsize, typed=typed)(
                update_wrapper(partial(wrapped, self), wrapped)
            )

        return cached_property(wrapper)  # type: ignore

    return decorator if method is None else decorator(method)
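A standalone usage sketch (type hints trimmed for brevity; Foo and the sizes are illustrative) showing that each instance gets its own cache and that cache_info() passes through:

```python
from functools import cached_property, lru_cache, partial, update_wrapper

def instance_lru_cache(method=None, *, maxsize=128, typed=False):
    # Same mechanics as above, without the type hints.
    def decorator(wrapped):
        def wrapper(self):
            return lru_cache(maxsize=maxsize, typed=typed)(
                update_wrapper(partial(wrapped, self), wrapped)
            )
        return cached_property(wrapper)
    return decorator if method is None else decorator(method)

class Foo:
    @instance_lru_cache(maxsize=16)
    def cached_method(self, x):
        return x + 5

a, b = Foo(), Foo()
a.cached_method(10)
a.cached_method(10)
b.cached_method(10)
print(a.cached_method.cache_info())  # hits=1, misses=1 -- a separate cache per instance
print(b.cached_method.cache_info())  # hits=0, misses=1
```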

3 Comments

This will lead to cache hits even if instance properties affecting the return value of the decorated method changed.
That’s not related to this solution but rather a consequence of using a cache. Don’t use a cache on a function or method that’s not deterministic.
This may be semantic but I'd argue that a proper implementation of a cache for class methods should treat self like any other argument and check its hash against the one stored alongside the cache value
3

Python 3.8 introduced the cached_property decorator in the functools module. When tested, it seems not to retain the instances.

If you don't want to update to Python 3.8 you can use the source code. All you need is to import RLock and create the _NOT_FOUND sentinel object, meaning:

from threading import RLock

_NOT_FOUND = object()

class cached_property:
    # https://github.com/python/cpython/blob/v3.8.0/Lib/functools.py#L930
    ...

2 Comments

cached_property is useless in this case - you can't use arguments (as with any property).
This is not a good answer. @cached_property can only be used if you want a single static value. This is not what the OP is looking for with @lru_cache, which behaves more like a dict that caches the mapping from inputs to outputs.
3

You can move the implementation of the method to a module global function, pass only relevant data from self when calling it from the method, and use @lru_cache on the function.

An added benefit from this approach is that even if your classes are mutable, the cache will be correct. And the cache key is more explicit as just the relevant data is in the signature of the cached function.

To make the example slightly more realistic, let's assume cached_method() needs information from self.big:

from dataclasses import dataclass
from functools import lru_cache

@dataclass
class BigClass:
    base: int

class Foo:
    def __init__(self):
        self.big = BigClass(base=100)

    @lru_cache(maxsize=16)  # the leak is here
    def cached_method(self, x: int) -> int:
        return self.big.base + x

def fun():
    foo = Foo()
    print(foo.cached_method(10))
    print(foo.cached_method(10)) # use cache
    return 'something'

fun()

Now move the implementation outside the class:

from dataclasses import dataclass
from functools import lru_cache

@dataclass
class BigClass:
    base: int

@lru_cache(maxsize=16)  # no leak from here
def _cached_method(base: int, x: int) -> int:
    return base + x

class Foo:
    def __init__(self):
        self.big = BigClass(base=100)

    def cached_method(self, x: int) -> int:
        return _cached_method(self.big.base, x)

def fun():
    foo = Foo()
    print(foo.cached_method(10))
    print(foo.cached_method(10)) # use cache
    return 'something'

fun()

Comments

2

The problem with using @lru_cache or @cache on an instance method is that self is passed to the method for caching despite not really being needed. I can't tell you why caching self causes the issue but I can give you what I think is a very elegant solution to the problem.

My preferred way of dealing with this is to define a dunder method that is a static method and takes all the same arguments as the instance method except for self. The reason this is my preferred way is that it's very clear, minimalistic, and doesn't rely on external libraries.

from functools import lru_cache
class BigClass:
    pass


class Foo:
    def __init__(self):
        self.big = BigClass()
    
    @staticmethod
    @lru_cache(maxsize=16)
    def __cached_method__(x: int) -> int:
        return x + 5

    def cached_method(self, x: int) -> int:
        return self.__cached_method__(x)


def fun():
    foo = Foo()
    print(foo.cached_method(10))
    print(foo.cached_method(10)) # use cache
    return 'something'

fun()

I have verified that the item is garbage collected correctly:

import gc; gc.collect()  # collect garbage
len([obj for obj in gc.get_objects() if isinstance(obj, Foo)]) # is 0

2 Comments

In cases where self isn't used, there is a much simpler solution — write the method without self and decorate it with @cache and @staticmethod. The other answers address the more interesting problem where the method output does depend in some way on the instance.
@RaymondHettinger This answer is proposing the same thing I am, which is to pass only the relevant information from self to be cached. If you want to use properties from self you can pass them in cached_method when you call __cached_method__().
1

Building on top of https://stackoverflow.com/a/33672499, we can avoid overriding the class method with an instance method on first execution by exploiting WeakKeyDictionary:

from weakref import ref, WeakKeyDictionary
from functools import lru_cache, wraps


def lru_cache_method(*lru_args, **lru_kwargs):

    def decorator(method):

        cache = WeakKeyDictionary()

        @wraps(method)
        def cached_method(self, *args, **kwargs):
            bound_cached_method = cache.get(self)
            if bound_cached_method is None:
                self_weak = ref(self)

                @wraps(method)
                @lru_cache(*lru_args, **lru_kwargs)
                def bound_cached_method(*args, **kwargs):
                    return method(self_weak(), *args, **kwargs)
                
                cache[self] = bound_cached_method

            return bound_cached_method(*args, **kwargs)
        
        return cached_method
    
    return decorator

Doing so addresses the first limitation noted in the comments on the original implementation: the method on the instance is no longer replaced by the caching wrapper on first invocation.

Unfortunately, it still only handles hashable instances and cannot be applied to the __hash__ method itself. This can be circumvented by keying on id(self) instead (see also Python WeakKeyDictionary for unhashable types):

from weakref import WeakKeyDictionary, WeakValueDictionary, ref
from functools import lru_cache, wraps

class IdKey:
    def __init__(self, value):
        self._id = id(value)
    def __hash__(self):
        return self._id
    def __eq__(self, other):
        return self._id == other._id
    def __repr__(self):
        return f"<IdKey(_id={self._id})>"

def lru_cache_method(*lru_args, **lru_kwargs):

    def decorator(method):

        instances = WeakValueDictionary()
        methods = WeakKeyDictionary()

        @wraps(method)
        def cached_method(self, *args, **kwargs):
            key = IdKey(self)
            weakly_bound_cached_method = methods.get(key)
            if weakly_bound_cached_method is None:
                # NOTE This prevents `key` from being GCed until `self` is GCed.
                instances[key] = self
                
                # NOTE This makes sure self can be GCed before `bound_cached_method` is GCed,
                # by avoiding a mutual dependency.
                _self = ref(self)

                @wraps(method)
                @lru_cache(*lru_args, **lru_kwargs)
                def weakly_bound_cached_method(*args, **kwargs):
                    return method(_self(), *args, **kwargs)
                
                # NOTE This entry can be GCed as soon as `self` is GCed.
                methods[key] = weakly_bound_cached_method

            return weakly_bound_cached_method(*args, **kwargs)
        
        return cached_method
    
    return decorator

which you can test as follows:

from time import sleep

class X:
    def __init__(self, *args, **kwargs):
        self.args = tuple(args)
        self.kwargs = tuple(kwargs.items())

    @lru_cache_method(maxsize = 1)
    def __hash__(self):
        return hash((self.args, self.kwargs))
    

class Y(dict):
    def __init__(self):
        pass

    @lru_cache_method(maxsize = 1)
    def sleep(self):
        sleep(1)

    

if __name__ == '__main__':
    x1 = X(*[1, 2, 3], **dict(a=1, b=2))
    x2 = X(*[1, 2], **dict(a=2, b=1))
    print(hash(x1))
    print(hash(x2))
    x1.args = (1, 2)
    print(hash(x1))
    print(hash(x2))
    x2.kwargs = dict(a=1, b=2)
    print(hash(x1))
    print(hash(x2))

    import gc
    gc.collect()

    print([obj for obj in gc.get_objects() if isinstance(obj, IdKey)])
    print([obj for obj in gc.get_objects() if isinstance(obj, X)])

    del x1
    del x2

    print([obj for obj in gc.get_objects() if isinstance(obj, IdKey)])
    print([obj for obj in gc.get_objects() if isinstance(obj, X)])

    from timeit import timeit

    print(timeit('y1.sleep()', 'from __main__ import Y; y1 = Y()', number = 100))

    hash(Y())

Comments

0

This can be done using cachetools and its cachetools.keys.methodkey, which ignores the first positional argument when building the cache key:

from cachetools import LRUCache, cached
from cachetools.keys import methodkey

class Foo:
    @cached(LRUCache(maxsize=128), key=methodkey)
    def cached_method1(self, x):
        return x + 5

The above will instantiate a cache during the class definition, so a cache per class. If you want a cache per class instance, use the @cachedmethod decorator, which takes a function as first argument to retrieve the cache for the method’s respective instance, and which already has key=methodkey by default:

from cachetools import LRUCache, cachedmethod

class Foo:
    def __init__(self):
        self._cache1 = LRUCache(maxsize=128)

    @cachedmethod(lambda self: self._cache1)
    def cached_method1(self, x):
        return x + 5

If you want to cache more methods, either use a separate cache per method or, if you decide on a shared cache, set a custom key function in each method decorator.
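For example, a shared-cache variant might look like this (the 'method1'/'method2' key prefixes are made-up disambiguators; cachetools.keys.hashkey builds a plain tuple key, and cachedmethod passes self to the key function in current cachetools releases):

```python
from cachetools import LRUCache, cachedmethod
from cachetools.keys import hashkey

class Foo:
    def __init__(self):
        # One cache shared by both methods; the key prefix keeps them apart.
        self._cache = LRUCache(maxsize=128)

    @cachedmethod(lambda self: self._cache,
                  key=lambda self, x: hashkey('method1', x))
    def cached_method1(self, x):
        return x + 5

    @cachedmethod(lambda self: self._cache,
                  key=lambda self, x: hashkey('method2', x))
    def cached_method2(self, x):
        return x * 2

foo = Foo()
foo.cached_method1(10)  # 15
foo.cached_method2(10)  # 20 -- no collision despite identical arguments
```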

Comments
