Part of the twat collection of Python utilities.
twat-cache is a versatile and high-performance Python library designed to simplify caching for your functions. It provides a unified and easy-to-use interface to various caching backends, allowing you to speed up your applications by storing and retrieving the results of expensive computations without significant code changes.
At its core, twat-cache offers a set of decorators that you can apply to your Python functions. When a decorated function is called, twat-cache checks if the result for the given arguments already exists in its cache. If so, it returns the cached result immediately, bypassing the actual function execution. If not, the function executes, and its result is stored in the cache for future calls.
This mechanism can drastically reduce execution time for functions that are called repeatedly with the same inputs, especially if those functions perform I/O operations (like web requests or database queries) or complex calculations.
twat-cache is for Python developers who want to:
- Improve the performance and responsiveness of their applications.
- Reduce redundant computations or data fetching.
- Simplify the implementation of caching logic.
- Have the flexibility to choose from different caching strategies (in-memory, disk-based, file-based) without rewriting their code.
- Work with both synchronous and asynchronous Python code.
Whether you're building web applications, data processing pipelines, scientific computing tools, or any Python project where performance matters, twat-cache can be a valuable addition.
- Unified Interface: Learn one way to cache, and apply it across multiple cache engines.
- Automatic Engine Selection: The `ucache` (universal cache) decorator can intelligently pick a suitable cache engine, or you can explicitly choose one.
- Multiple Cache Engines & Backends: Supports several caching strategies by integrating with robust backend libraries:
  - In-memory (LRU, LFU, FIFO policies via the `cachetools` library, or the high-performance `cachebox` library)
  - Disk-based (persistent caching via the `diskcache` library)
  - File-based (efficient for large objects like NumPy arrays via the `joblib` library)
  - Asynchronous caching (for `async`/`await` functions via the `aiocache` library)
- Fallback Mechanism: If a preferred cache engine or its underlying backend library isn't available, twat-cache gracefully falls back to a functional alternative.
- Configurable: Control cache size (`maxsize`), time-to-live (`ttl`) for entries, eviction policies, and more.
- Context Management: Provides `CacheContext` for fine-grained control over cache engine lifecycles and configurations within specific code blocks.
- Modern Python: Built with type hints and supports modern Python features.
- Part of a Larger Ecosystem: As a component of the `twat` project, it aims for consistency and quality within that suite of tools.
You can install twat-cache using pip.
Basic Installation (includes in-memory and basic disk caching capabilities):
```bash
pip install twat-cache
```

This installs twat-cache along with pydantic, loguru, diskcache, joblib, and cachetools.
To include all optional backend libraries for extended capabilities:
```bash
pip install twat-cache[all]
```

This adds support for cachebox, aiocache, klepto, and platformdirs.
To install support for specific optional backend libraries:
Choose the backend libraries that enable the cache engines you need:
- cachebox (high-performance Rust-based in-memory cache): `pip install twat-cache[cachebox]`
- aiocache (async-capable caching): `pip install twat-cache[aiocache]`
- klepto (scientific computing caching, various storage options): `pip install twat-cache[klepto]`
Note on Redis: Earlier versions or documentation may have mentioned a Redis cache engine. Currently, the direct implementation for a Redis engine (`redis.py`) is not present in the core `twat_cache.engines` directory. While `pyproject.toml` may list a `redis` extra, if Redis is a requirement, ensure the `redis-py` (or equivalent) package is installed and that the version of twat-cache you are using explicitly supports it. The `CacheEngineManager` has a provision to load a Redis engine, but the engine code itself must be available.
For local development:
```bash
# Clone the repository
git clone https://github.com/twardoch/twat.git
cd twat

# Set up development environment
./scripts/setup-dev.sh

# Or manually install dependencies
pip install -e ".[dev,test,all]"
```

The project uses a comprehensive build system with multiplatform support:
```bash
# Quick commands using Make
make help          # Show all available commands
make test          # Run tests
make lint          # Run linting
make format        # Format code
make build         # Build package
make release-check # Run all pre-release checks

# Or use scripts directly
python3 scripts/build.py test
python3 scripts/build.py lint
python3 scripts/build.py format
python3 scripts/build.py build
python3 scripts/build.py check
```

The project uses git-tag-based semantic versioning with automated CI/CD:
```bash
# Check current version
make version

# Create releases
make release       # Patch release (1.2.3 -> 1.2.4)
make release-minor # Minor release (1.2.3 -> 1.3.0)
make release-major # Major release (1.2.3 -> 2.0.0)

# Dry run to see what would happen
make release-dry-run
```

The project is tested and built on:
- Operating Systems: Ubuntu, Windows, macOS
- Python Versions: 3.10, 3.11, 3.12
- Architectures: x64 (primary), arm64 (via GitHub's runners)
Automated workflows provide:
- Continuous Integration: Multi-platform testing on every push/PR
- Automated Releases: Build and publish to PyPI on git tags
- Security Scanning: Dependency and code security checks
- Quality Assurance: Linting, type checking, test coverage
See docs/BUILD_AND_RELEASE.md for detailed information.
twat-cache is primarily used as a library by importing its decorators and context managers into your Python code. It does not currently offer a standalone command-line interface for direct cache manipulation outside of a Python script.
Here are some common ways to use twat-cache:
1. In-Memory Caching (mcache)
Ideal for fast, temporary caching of frequently accessed, small-to-medium sized data within a single process.
```python
from twat_cache import mcache
import time

@mcache(maxsize=128, ttl=60)  # Cache up to 128 items, expire after 60 seconds
def get_user_details(user_id: int) -> dict:
    print(f"Fetching details for user {user_id}...")
    time.sleep(1)  # Simulate expensive operation
    return {"id": user_id, "name": f"User {user_id}", "email": f"user{user_id}@example.com"}

# First call: fetches and caches
user1 = get_user_details(1)
print(user1)

# Second call: result comes from cache
user1_cached = get_user_details(1)
print(user1_cached)

# After 60 seconds, or if maxsize is exceeded, the cache entry may be evicted.
```

2. Disk-Based Caching (bcache)
Useful for persistent caching across application restarts or for larger datasets that don't fit comfortably in memory. Uses the diskcache library.
```python
from twat_cache import bcache
import time

@bcache(folder_name="user_data_cache", ttl=3600)  # Cache in '.cache/twat_cache/user_data_cache', expire after 1 hour
def get_report_data(report_id: str) -> list:
    print(f"Generating report {report_id}...")
    time.sleep(2)  # Simulate expensive report generation
    return [{"report": report_id, "data_point": i} for i in range(5)]

# First call: generates and caches to disk
report_A = get_report_data("A")
print(report_A)

# Second call (even after script restart): result comes from disk cache
report_A_cached = get_report_data("A")
print(report_A_cached)
```

3. File-Based Caching for Large Objects (fcache)
Optimized for caching large objects like NumPy arrays or machine learning models, typically using the joblib library.
```python
from twat_cache import fcache
import numpy as np
import time

@fcache(folder_name="array_processing_cache", compress=True)  # Compress cached files
def process_large_array(size: int) -> np.ndarray:
    print(f"Processing large array of size {size}x{size}...")
    time.sleep(3)  # Simulate heavy computation
    return np.random.rand(size, size)

# First call: computes and caches the array to a file
array_data = process_large_array(1000)
print(array_data.shape)

# Second call: loads the array from the cached file
array_data_cached = process_large_array(1000)
print(array_data_cached.shape)
```

4. Asynchronous Caching (acache)
For caching results of async functions, typically using the aiocache library.
```python
from twat_cache import acache
import asyncio

@acache(ttl=300)  # Cache async results for 5 minutes
async def fetch_external_data(url: str) -> dict:
    print(f"Fetching data from {url}...")
    await asyncio.sleep(1)  # Simulate async HTTP request
    return {"url": url, "content": f"Data from {url}"}

async def main():
    # First call: fetches and caches
    data1 = await fetch_external_data("https://api.example.com/data")
    print(data1)

    # Second call: result comes from cache
    data1_cached = await fetch_external_data("https://api.example.com/data")
    print(data1_cached)

if __name__ == "__main__":
    asyncio.run(main())
```

5. Universal Caching (ucache)
Let twat-cache attempt to pick the best cache engine based on availability and function characteristics.
```python
from twat_cache import ucache
import time

@ucache(ttl=1800, compress=True)  # Universal cache, expire after 30 mins, try compression
def complex_calculation(a: int, b: float, c: str) -> str:
    print(f"Performing complex calculation with {a}, {b}, {c}...")
    time.sleep(1.5)
    return f"Result: {a * b} - {c.upper()}"

# twat-cache will choose an appropriate engine (e.g., DiskCacheEngine if available and suitable)
res1 = complex_calculation(10, 2.5, "hello")
print(res1)

res1_cached = complex_calculation(10, 2.5, "hello")
print(res1_cached)
```

6. Using Cache Context (CacheContext)
For more explicit control over caching within a specific block of code, or for using cache engines directly.
```python
from twat_cache import CacheContext
import time

def process_user_session(user_id: str, data: dict):
    # Use a specific disk cache for this operation
    with CacheContext(engine_name="diskcache", folder_name="session_cache", ttl=600) as cache:
        # 'cache' is the cache engine instance, allowing direct get/set.
        # For DiskCacheEngine, this would be a diskcache.Cache object.
        cache_key = f"session_data_{user_id}"
        cached_data = cache.get(cache_key)
        if cached_data:
            print(f"Using cached session data for {user_id}")
            return cached_data

        print(f"Processing and caching session data for {user_id}")
        # Simulate processing
        processed_data = {**data, "processed_at": time.time()}
        cache.set(cache_key, processed_data)
        return processed_data

user_data = {"preferences": "dark_mode"}
session_info = process_user_session("user123", user_data)
print(session_info)

session_info_cached = process_user_session("user123", user_data)  # Should hit cache
print(session_info_cached)
```

This section delves into the internal workings of twat-cache, its architecture, and guidelines for coding and contributing.
twat-cache is structured around a few key components: decorators, configuration objects, cache engines, and management utilities.
1. Core Decorators (src/twat_cache/decorators.py)

The primary interface for users. These decorators wrap user functions to inject caching logic.

- `@mcache` (Memory Cache):
  - Prioritizes fast in-memory cache engines.
  - Order of preference: `CacheBoxEngine` (if the `cachebox` library is available), then `CacheToolsEngine` (if `cachetools` is available), finally falling back to Python's built-in `functools.lru_cache` (via `FunctoolsEngine`).
  - Configuration: `maxsize`, `ttl`, `policy` (LRU, LFU, FIFO, etc.), `secure` (key generation).
- `@bcache` (Basic Disk Cache):
  - Primarily uses `DiskCacheEngine` (which leverages the `diskcache` library) for persistent, SQLite-backed disk storage.
  - If `diskcache` is unavailable but `klepto` is (and `use_sql=True` was passed, though this option is being de-emphasized), it can use `KleptoEngine` for SQL-based storage.
  - Falls back to `@mcache` if no suitable disk-caching engine is found.
  - Configuration: `folder_name`, `maxsize` (bytes on disk), `ttl`, `policy`, `use_sql` (for Klepto), `secure`.
- `@fcache` (File Cache):
  - Designed for large objects, often using serialization strategies suited for items like NumPy arrays.
  - Prefers `JoblibEngine` (using `joblib.Memory`) for efficient file-based caching.
  - Can fall back to `KleptoEngine` if `joblib` is not available.
  - If neither is available, it falls back to `@mcache`.
  - Configuration: `folder_name`, `maxsize`, `ttl`, `compress`, `secure`.
- `@acache` (Asynchronous Cache):
  - Specifically for `async def` functions.
  - Uses `AioCacheEngine` (leveraging the `aiocache` library) if available.
  - If `aiocache` is not available, it uses a synchronous memory cache (`mcache`) to wrap the async function. The caching mechanism then stores the result after the initial await, so subsequent calls with the same arguments return the cached result without re-executing the async operation.
  - Configuration: `maxsize`, `ttl`, `policy`.
- `@ucache` (Universal Cache):
  - The most flexible decorator. It attempts to select the "best" available cache engine.
  - Selection logic (`_select_best_backend`): considers whether the function is async (prefers `AioCacheEngine`), whether disk persistence seems beneficial (e.g., if `folder_name` is provided, hinting at `DiskCacheEngine`, `JoblibEngine`, or `KleptoEngine`), or defaults to fast in-memory options (`CacheBoxEngine`, `CacheToolsEngine`).
  - Takes a wide range of configuration options applicable to various engines.
  - The actual engine is instantiated via `_create_engine` based on the resolved configuration and selected backend name.
2. Configuration (src/twat_cache/config.py)

- `CacheConfig` (Pydantic model): A Pydantic model defines the structure for all cache configurations. This ensures type safety and validation for parameters like `maxsize`, `ttl`, `policy`, `folder_name`, `compress`, `secure`, and `preferred_engine`.
- `create_cache_config(...)`: A factory function to easily construct `CacheConfig` objects.
- Environment variables are not directly used for configuration by default, but `CacheConfig` could be extended or instantiated with values from environment variables programmatically if needed.
- Redis-specific configurations were previously part of `CacheConfig` but have been removed, indicating a potential shift or pending refactor for Redis support.
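To make the shape of such a configuration object concrete, here is a rough stdlib sketch of the fields listed above. The real `CacheConfig` is a Pydantic model; the field names below follow the documentation, but the defaults and validation rules are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class CacheConfigSketch:
    """Illustrative stand-in for twat-cache's Pydantic CacheConfig."""
    maxsize: Optional[int] = None        # None means unbounded (assumed default)
    ttl: Optional[float] = None          # seconds; None means no expiry
    policy: str = "lru"                  # eviction policy name (assumed default)
    folder_name: Optional[str] = None    # used by disk/file-based engines
    compress: bool = False
    secure: bool = True                  # robust key serialization
    preferred_engine: Optional[str] = None
    serializer: Optional[Callable] = None

    def __post_init__(self) -> None:
        # Minimal validation, standing in for Pydantic validators.
        if self.maxsize is not None and self.maxsize <= 0:
            raise ValueError("maxsize must be positive or None")
        if self.ttl is not None and self.ttl < 0:
            raise ValueError("ttl must be non-negative or None")

cfg = CacheConfigSketch(maxsize=128, ttl=60, folder_name="demo_cache")
```

The Pydantic version additionally coerces types and produces structured validation errors, which a plain dataclass does not.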
3. Cache Engines (src/twat_cache/engines/)

This directory contains the actual caching logic implementations, abstracted via a common interface.

- `BaseCacheEngine` (Protocol): Defines the contract for all cache engines. It is expected to have an `__init__(self, config: CacheConfig)` method and a `cache(self, func)` method that returns a wrapped, cached version of the input function. It also has an `is_available()` class method to check whether the engine's dependencies (the underlying backend library) are met.
- Engine implementations:
  - `FunctoolsEngine`: Wraps Python's `functools.lru_cache` or `functools.cache`. Always available.
  - `CacheBoxEngine`: Uses the `cachebox` library for high-performance in-memory caching.
  - `CacheToolsEngine`: Uses the `cachetools` library for various in-memory eviction policies (LRU, LFU, TTL, FIFO, RR).
  - `DiskCacheEngine`: Uses the `diskcache` library for persistent, transactional, disk-backed caching (typically SQLite + file system).
  - `JoblibEngine`: Uses `joblib.Memory` for file-system-based caching, particularly efficient for Python objects like NumPy arrays.
  - `KleptoEngine`: Uses the `klepto` library, which offers a wide range of serialization and storage backends (including file, SQL, HDF5). Its usage in twat-cache appears to be de-emphasized in recent changes.
  - `AioCacheEngine`: Uses the `aiocache` library for caching results of asynchronous functions.
  - `RedisCacheEngine`: (Potentially) an engine for Redis. As noted, `redis.py` is currently missing from the source tree, so this engine is not practically available unless provided externally or by a specific package version.
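The engine contract described above can be sketched with `typing.Protocol`. This is a hypothetical reconstruction (the real signatures live in `engines/base.py`), paired with a toy dict-backed engine that satisfies it:

```python
from typing import Any, Callable, Protocol, runtime_checkable

@runtime_checkable
class BaseCacheEngineSketch(Protocol):
    """Illustrative version of the BaseCacheEngine contract."""

    def cache(self, func: Callable) -> Callable:
        """Return a wrapped, cached version of func."""
        ...

    @classmethod
    def is_available(cls) -> bool:
        """Report whether the backing library is importable."""
        ...

class DictEngine:
    """Toy engine backed by a plain dict (no eviction, no TTL)."""

    def __init__(self) -> None:
        self._store: dict[Any, Any] = {}

    def cache(self, func: Callable) -> Callable:
        def wrapper(*args: Any) -> Any:
            if args not in self._store:          # miss: run and store
                self._store[args] = func(*args)
            return self._store[args]             # hit: return stored result
        return wrapper

    @classmethod
    def is_available(cls) -> bool:
        return True  # a dict is always available

engine = DictEngine()
cached_len = engine.cache(lambda s: len(s))
print(cached_len("hello"))                       # 5
print(isinstance(engine, BaseCacheEngineSketch)) # True (structural check)
```

A `runtime_checkable` protocol only checks for the presence of the named methods, which is enough to show why unrelated engine classes can all plug into the same decorator machinery.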
- `CacheEngineManager` (src/twat_cache/engines/manager.py):
  - Responsible for registering and discovering available cache engine classes.
  - `_register_builtin_engines()`: Registers all standard engine types known to twat-cache.
  - `select_engine()`: Provides logic to pick an engine based on availability and optional preferences. This is used by context managers; decorators have their own specific fallback chains.
4. Cache Management (src/twat_cache/cache.py)

- Global cache registry (`_active_caches`): A dictionary that keeps track of active cache instances created by the decorators, allowing for global operations.
- `register_cache(name, cache_engine_instance, wrapper, stats)`: Adds a newly created cache instance to the global registry.
- `clear_cache(name=None)`: Clears a specific cache by name, or all registered caches if no name is provided. It also attempts to clean up associated disk directories.
- `get_stats(name=None)`: Retrieves statistics (hits, misses, size, maxsize) for a specific cache, or aggregated stats for all caches.
- `update_stats(...)`: Internal helper to update hit/miss counts for a cache.
5. Context Management (src/twat_cache/context.py)

- `CacheContext`: A class-based context manager.
  - Allows for explicit selection of a cache engine (`engine_name`) and configuration for a specific block of code.
  - Example: `with CacheContext(engine_name="diskcache", folder_name="my_temp_cache") as cache_instance:`
  - The `cache_instance` provided by `as` is intended to be the underlying cache object from the chosen backend library (e.g., a `diskcache.Cache` object) if the twat-cache engine wrapper exposes it directly (often via a `_cache` attribute or `cache_instance` property). If not, it will be the twat-cache engine wrapper instance itself, which should still provide basic cache operations like `get`, `set`, and `delete`.
  - Ensures resources (like database connections for `DiskCacheEngine`) are properly initialized and closed.
- `engine_context`: A function-based context manager, similar in purpose to `CacheContext`, providing a more concise way to achieve the same.
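The shape of a function-based context manager like `engine_context` is easy to picture with `contextlib`. This is a toy sketch with a dict-backed store standing in for a real engine; the actual function resolves and configures a twat-cache engine instead:

```python
from contextlib import contextmanager
from typing import Any, Iterator

class ToyCache:
    """Stand-in for a backend cache object with get/set/close."""

    def __init__(self) -> None:
        self._data: dict[Any, Any] = {}
        self.closed = False

    def get(self, key: Any) -> Any:
        return self._data.get(key)

    def set(self, key: Any, value: Any) -> None:
        self._data[key] = value

    def close(self) -> None:
        self.closed = True

@contextmanager
def toy_engine_context(**config: Any) -> Iterator[ToyCache]:
    """Yield a ready cache and guarantee cleanup on exit."""
    cache = ToyCache()  # a real version would select an engine from config
    try:
        yield cache
    finally:
        cache.close()   # runs even if the block raises

with toy_engine_context(folder_name="scratch") as cache:
    cache.set("k", 42)
    print(cache.get("k"))  # 42
```

The `try`/`finally` around the `yield` is what gives the "resources are properly initialized and closed" guarantee described above.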
6. Backend Selection Logic (_select_best_backend in decorators.py)

This internal function is used by `@ucache`, and potentially by context managers if no engine is explicitly specified. It:

- Checks availability of backend libraries (`_get_available_backends`).
- Considers `preferred_engine` from the config.
- Factors in `is_async` (favors `AioCacheEngine`).
- Factors in `needs_disk` (favors `DiskCacheEngine`, `KleptoEngine`, `JoblibEngine`).
- Has a fallback order if specific requirements aren't met (e.g., `CacheBoxEngine` -> `CacheToolsEngine` -> `FunctoolsEngine` for general in-memory needs).
7. Key Generation (make_key in decorators.py)

- Responsible for creating a consistent, hashable cache key from the function's arguments (`*args`, `**kwargs`).
- Serializes arguments to JSON by default to handle various data types and to ensure that keyword-argument order does not affect the key.
- A custom `serializer` can be provided in `CacheConfig` for handling non-JSON-serializable objects or for custom keying strategies. The `secure=True` option (the default) likely implies a more robust serialization.
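A keying scheme with these properties can be sketched as follows. This is illustrative only; the real `make_key` may differ in details such as whether and how the serialized payload is hashed:

```python
import hashlib
import json
from typing import Any

def make_key_sketch(*args: Any, **kwargs: Any) -> str:
    """Build a stable key: JSON with sorted kwargs, then hashed."""
    payload = json.dumps(
        {"args": args, "kwargs": kwargs},
        sort_keys=True,   # kwarg order must not change the key
        default=str,      # crude fallback for non-JSON-serializable values
    )
    return hashlib.sha256(payload.encode()).hexdigest()

# Keyword order does not matter:
k1 = make_key_sketch(1, b=2, a=3)
k2 = make_key_sketch(1, a=3, b=2)
print(k1 == k2)  # True
```

The `default=str` fallback is where a custom `serializer` would plug in for objects that JSON cannot represent faithfully.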
Coding Conventions & Style:

- Python Version: Requires Python 3.10 or newer (per `pyproject.toml`).
- Formatting: Code is formatted using Black and Ruff. Adherence to these formatters is expected.
  - Run `ruff format .` and `ruff check --fix .`
- Linting: Ruff is used for extensive linting, covering rules from Flake8, isort, pep8-naming, and more.
  - Run `ruff check .`
- Type Checking: Static type checking is enforced using MyPy. All new code should include type hints.
  - Run `mypy src/twat_cache tests`
- Imports: isort (via Ruff) is used to sort imports. The known first-party package is `twat_cache`. Relative imports are generally banned in `src` but allowed in `tests`.
- Line Length: 88 characters, enforced by Black/Ruff.
Dependencies:

- Core: `pydantic` (for configuration), `loguru` (for logging), `diskcache`, `joblib`, `cachetools`.
- Optional backend libraries and their engines:
  - `cachebox` (for `CacheBoxEngine`)
  - `aiocache` (for `AioCacheEngine`)
  - `klepto` (for `KleptoEngine`)

  These are installed via extras like `twat-cache[all]` or `twat-cache[cachebox]`.
- Development & Testing: `pytest`, `pytest-cov`, `pytest-benchmark`, `mypy`, `ruff`, `hatch`, `pre-commit`.
Testing:

- Tests are located in the `tests/` directory and use `pytest`.
- Run tests with `python -m pytest -n auto tests` (or via `hatch env run test:test`).
- Coverage is monitored: `python -m pytest -n auto --cov=src/twat_cache ...` (or `hatch env run test:test-cov`).
- Strive for high test coverage for new features and bug fixes.
- Benchmarks are present in `tests/test_benchmark.py` and can be run with `pytest-benchmark`.
Contribution Process:

- Fork the repository on GitHub.
- Create a new branch for your feature or bug fix: `git checkout -b feature/your-feature-name` or `bugfix/issue-description`.
- Make your changes. Ensure you add tests for any new functionality or bug fixes.
- Run linters, formatters, and type checkers locally:
  - `hatch env run lint:fmt` (or `ruff format . && ruff check --fix .`)
  - `hatch env run lint:typing` (or `mypy src/twat_cache tests`)
  - Or use `pre-commit` hooks.
- Run tests: `hatch env run test:test` (or `pytest -n auto tests`).
- Commit your changes with a clear and descriptive commit message.
- Push your branch to your fork: `git push origin feature/your-feature-name`.
- Open a Pull Request (PR) against the main branch of the original repository.
- Address any feedback or CI failures.
Pre-commit Hooks:
The project uses pre-commit (as indicated by .pre-commit-config.yaml) to automatically run linters and formatters before commits. Install and set it up in your local clone:
```bash
pip install pre-commit
pre-commit install
```

This helps ensure that contributions meet the project's coding standards.
Logging:
The loguru library is used for logging. Import the logger from twat_cache.logging for consistent logging within the library.
Project Structure (src/twat_cache):

- `__init__.py`: Exports the main public interface (decorators, context managers).
- `__main__.py`: Contains some utility functions, but is not a primary CLI entry point for the cache library itself.
- `cache.py`: Global cache registration and management utilities.
- `config.py`: `CacheConfig` Pydantic model and configuration factory.
- `context.py`: `CacheContext` and `engine_context` for explicit cache management.
- `decorators.py`: Core caching decorators (`mcache`, `bcache`, etc.) and backend selection logic.
- `engines/`: Directory for the different cache backend implementations.
  - `base.py`: `BaseCacheEngine` protocol.
  - `manager.py`: `CacheEngineManager`.
  - Individual engine files (e.g., `diskcache.py`, `joblib.py`).
- `exceptions.py`: Custom exceptions.
- `logging.py`: Logging setup.
- `paths.py`: Path utilities (though `__main__.py` also has some path logic).
- `type_defs.py`: Common type definitions and protocols.