Skip to content

MilkClouds/lazyregistry

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

65 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

lazyregistry

CI codecov pypi Python Versions License: MIT Code style: ruff

A lightweight Python library for lazy-loading registries with namespace support and type safety

Installation

# Install with pip
$ pip install lazyregistry

# Add to your project with uv
$ uv add "lazyregistry"

Quick Start

from lazyregistry import Registry

registry = Registry(name="plugins")

# Register by import string (lazy - imported on access)
registry["json"] = "json:dumps"

# Register by instance (immediate - already imported)
import pickle
registry["pickle"] = pickle.dumps

# Import happens here
serializer = registry["json"]

Features

  • Lazy imports - Defer expensive imports until first access
  • Instance registration - Register both import strings and direct objects
  • Namespaces - Organize multiple registries
  • Generic types - Type hints for IDE autocomplete on registry access
  • Eager loading - Optional immediate import for critical components
  • Pretrained models - Built-in support for save_pretrained/from_pretrained pattern

💡 Want to lazy-load your entire package with full type checking support? See Combining with lazy-loader.

Examples

Run examples: uv run python examples/<example>.py

1. Plugin System

examples/plugin_system.py - Extensible plugin architecture with decorator-based registration:

from lazyregistry import Registry

PLUGINS = Registry(name="plugins")

def plugin(name: str):
    def decorator(cls):
        PLUGINS[name] = cls
        return cls
    return decorator

@plugin("uppercase")
class UppercasePlugin:
    def process(self, text: str) -> str:
        return text.upper()

# Execute plugins
PluginManager.execute("uppercase", "hello")  # "HELLO"
PluginManager.pipeline("hello", "uppercase", "reverse")  # "OLLEH"

2. Pretrained Models

examples/pretrained.py - HuggingFace-style save/load with two patterns:

Basic (config only):

from lazyregistry import NAMESPACE
from lazyregistry.pretrained import AutoRegistry, PretrainedConfig, PretrainedMixin

# Each model has its own config with type identifier
class BertConfig(PretrainedConfig):
    model_type: str = "bert"
    hidden_size: int = 768

class GPT2Config(PretrainedConfig):
    model_type: str = "gpt2"
    hidden_size: int = 768

# Base model class
class BaseModel(PretrainedMixin):
    config_class = PretrainedConfig

class AutoModel(AutoRegistry):
    registry = NAMESPACE["models"]
    config_class = PretrainedConfig
    type_key = "model_type"

# Register with decorator - models inherit from BaseModel
@AutoModel.register_module("bert")
class BertModel(BaseModel):
    config_class = BertConfig

# Or register directly
AutoModel.registry["gpt2"] = "transformers:GPT2Model"  # Lazy import
AutoModel.registry["t5"] = T5Model                     # Direct

# Save and auto-load
config = BertConfig(hidden_size=1024)
model = BertModel(config=config)
model.save_pretrained("./model")
loaded = AutoModel.from_pretrained("./model")  # Auto-detects type

Advanced (config + custom state):

class Tokenizer(PretrainedMixin):
    def __init__(self, *args, vocab: dict[str, int] | None = None, **kwargs):
        super().__init__(*args, **kwargs)
        self.vocab = vocab or {}

    def save_pretrained(self, path):
        super().save_pretrained(path)
        # Save additional state (vocabulary)
        Path(path).joinpath("vocab.txt").write_text(...)

    @classmethod
    def from_pretrained(cls, path):
        config = cls.config_class.model_validate_json(...)
        vocab = ...  # Load vocabulary
        return cls(config=config, vocab=vocab)

API Reference

Core Classes

ImportString - String that represents an import path with lazy loading capability

from lazyregistry import ImportString

# Create an import string
import_str = ImportString("json:dumps")

# Load the object when needed
func = import_str.load()
func({"key": "value"})  # '{"key": "value"}'

Registry[K, V] - Named registry with lazy import support

registry = Registry(name="plugins")

# Dict-style assignment (auto-converts strings to ImportString)
registry["key"] = "module:object"      # Lazy
registry["key2"] = actual_object       # Immediate

# Configure behavior
registry.eager_load = True
registry["key3"] = "module:object"     # Eager load

Namespace - Container for multiple registries

from lazyregistry import NAMESPACE

# Direct access to .registry for registration
NAMESPACE["models"]["bert"] = "transformers:BertModel"
NAMESPACE["models"]["gpt2"] = GPT2Model

# Access registered items
model = NAMESPACE["models"]["bert"]

LazyImportDict[K, V] - Base class for custom implementations

Same dict-style API as Registry, but with configurable behavior:

from lazyregistry.registry import LazyImportDict

registry = LazyImportDict()

# Configure behavior via attributes
registry.auto_import_strings = True   # Auto-convert strings to ImportString (default: True)
registry.eager_load = False           # Load immediately on assignment (default: False)

# Use like a normal dict
registry["key"] = "module:object"
registry.update({"key2": "module:object2"})

Pretrained Pattern

PretrainedMixin - Save/load with Pydantic config

class MyConfig(PretrainedConfig):
    model_type: str = "my_model"

class MyModel(PretrainedMixin):
    config_class = MyConfig

model.save_pretrained("./path")
loaded = MyModel.from_pretrained("./path")

AutoRegistry - Auto-detect model type from config

The type_key parameter (defaults to "model_type") determines which config field is used for type detection.

Three ways to register:

from lazyregistry.pretrained import PretrainedConfig, PretrainedMixin

# Each model has its own config class with type identifier
class BertConfig(PretrainedConfig):
    model_type: str = "bert"
    hidden_size: int = 768

class GPT2Config(PretrainedConfig):
    model_type: str = "gpt2"
    hidden_size: int = 768

# Base model class
class BaseModel(PretrainedMixin):
    config_class = PretrainedConfig

class AutoModel(AutoRegistry):
    registry = NAMESPACE["models"]
    config_class = PretrainedConfig
    type_key = "model_type"  # Can use any field name

# 1. Decorator registration - models inherit from BaseModel
@AutoModel.register_module("bert")
class BertModel(BaseModel):
    config_class = BertConfig

# 2. Direct registration via .registry
AutoModel.registry["gpt2"] = GPT2Model                   # Direct instance
AutoModel.registry["t5"] = "transformers:T5Model"        # Lazy import string

# 3. Bulk registration via .registry.update() - useful for many models
AutoModel.registry.update({
    "roberta": RobertaModel,
    "albert": "transformers:AlbertModel",
    "electra": "transformers:ElectraModel",
})

# Auto-detect and load
loaded = AutoModel.from_pretrained("./path")  # Detects type from config

Why?

Before:

# All imports happen upfront
from heavy_module_1 import ClassA
from heavy_module_2 import ClassB
from heavy_module_3 import ClassC

REGISTRY = {"a": ClassA, "b": ClassB, "c": ClassC}

After:

# Import only what you use
from lazyregistry import Registry

registry = Registry(name="components")
registry.register("a", "heavy_module_1:ClassA")
registry.register("b", "heavy_module_2:ClassB")
registry.register("c", "heavy_module_3:ClassC")

# Only ClassA is imported
component = registry["a"]

Tips

Combining with lazy-loader for Full Lazy Package Imports

For packages with many heavy dependencies, you can combine lazyregistry with lazy-loader to achieve both lazy module imports and lazy registry lookups:

# mypackage/__init__.py
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Static analysis sees these imports (IDE autocomplete, mypy, pyright)
    from .auto import AutoModel as AutoModel
    from .bert import BertModel as BertModel
    from .gpt2 import GPT2Model as GPT2Model
else:
    # Runtime uses lazy-loader (nothing imported until accessed)
    import lazy_loader as lazy

    __getattr__, __dir__, __all__ = lazy.attach(
        __name__,
        submod_attrs={
            "auto": ["AutoModel"],
            "bert": ["BertModel"],
            "gpt2": ["GPT2Model"],
        },
    )
# mypackage/auto.py
from lazyregistry import Registry
from lazyregistry.pretrained import AutoRegistry

class AutoModel(AutoRegistry):
    registry = Registry(name="models")

# String references - actual imports deferred until registry access
AutoModel.registry.update({
    "bert": "mypackage.bert:BertModel",
    "gpt2": "mypackage.gpt2:GPT2Model",
})

Benefits:

  • Double lazy loading: lazy-loader defers module imports + lazyregistry defers registry lookups
  • Full type checking: TYPE_CHECKING block provides IDE autocomplete and static analysis
  • Zero import cost: Heavy dependencies (torch, sklearn, etc.) only load when actually used
  • Clean public API: Users import naturally with from mypackage import AutoModel

Testing

Run tests with coverage:

uv run pytest tests/ --cov=lazyregistry --cov-report=term-missing

The test suite includes:

  • Core registry tests - LazyImportDict, Registry, Namespace functionality
  • Pretrained tests - save/load patterns, AutoRegistry, custom state
  • Example tests - Verify all examples run correctly

License

MIT

About

A lightweight Python library for lazy-loading registries with namespace support and type safety

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages