Optimize startup time with lazy imports #514

@osma

Annif takes several seconds to start even when it's doing nothing but printing the version number or help text:

$ time annif --version
0.54.0.dev0

real	0m4,398s
user	0m4,322s
sys	0m0,470s

I investigated this a little bit using the -X importtime feature in Python 3.7+ and the tuna tool for visualizing profiling information. It seems that the time is mostly spent importing large libraries such as tensorflow, scikit-learn, optuna, connexion and nltk:

[image: tuna visualization of the import-time profile]
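For anyone who wants the numbers without a visualizer, here is a minimal sketch that runs a statement under `-X importtime` in a fresh interpreter and lists the slowest top-level imports. The helper names `parse_importtime` and `profile_imports` are made up for this illustration, and the parser assumes the `self | cumulative | name` row layout that CPython prints on stderr, with nested imports indented.

```python
import re
import subprocess
import sys

# One data row of `-X importtime` output: self time, cumulative time, name.
# Assumed layout: "import time: <self> | <cumulative> | <indent><name>".
_ROW = re.compile(r"^import time:\s+(\d+) \|\s+(\d+) \| ( *)(\S+)$")


def parse_importtime(stderr_text):
    """Return (name, cumulative_us) for top-level imports, slowest first."""
    rows = []
    for line in stderr_text.splitlines():
        match = _ROW.match(line)
        if match is None:
            continue  # header line or unrelated stderr output
        _self_us, cumulative_us, indent, name = match.groups()
        if indent == "":  # nested imports are indented; keep top level only
            rows.append((name, int(cumulative_us)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def profile_imports(statement):
    """Run `statement` in a fresh interpreter with -X importtime enabled."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(result.stderr)
```

Calling `profile_imports("import json")[:5]` gives the five slowest top-level imports for that statement; the same call with a heavier import statement should surface the libraries that dominate startup.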

None of these libraries is needed for simple operations such as annif --help and annif --version, so it would be better to avoid importing them altogether in those cases. There are some tutorials on lazy importing (e.g. this one) and the importlib library contains (since Python 3.5) a LazyLoader utility class that could be used here.
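The simplest form of lazy importing is to move the import statement into the function that needs it. The sketch below is only an illustration of that idea, not Annif code: `show_version`, `to_hsv`, `is_loaded` and `VERSION` are hypothetical names, and the small stdlib module `colorsys` stands in for a heavy dependency such as tensorflow or nltk.

```python
import sys

VERSION = "0.0.0"  # placeholder, not Annif's real version string


def is_loaded(name):
    """Report whether a module has already been imported in this process."""
    return name in sys.modules


def show_version():
    """Cheap command: touches no heavy dependency."""
    return VERSION


def to_hsv(red, green, blue):
    """Stand-in for an expensive command: imports its dependency on first call."""
    import colorsys  # deferred import; stands in for a heavy library

    return colorsys.rgb_to_hsv(red, green, blue)
```

With this layout, calling `show_version()` never triggers the deferred import, while the first call to `to_hsv()` pays the import cost once. The trade-off is that import errors surface at call time rather than at startup.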

I experimented a bit with this lazy_import function but couldn't get it to work for nltk submodules:

# Adapted from: https://stackoverflow.com/questions/42703908/
import importlib.util
import sys


def lazy_import(fullname):
    """lazily import a module the first time it is used"""
    try:
        return sys.modules[fullname]
    except KeyError:
        spec = importlib.util.find_spec(fullname)
        module = importlib.util.module_from_spec(spec)
        loader = importlib.util.LazyLoader(spec.loader)
        # Make module with proper locking and get it inserted into sys.modules.
        loader.exec_module(module)
        return module
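For comparison, here is a sketch that follows the lazy-import recipe in the importlib documentation, which sets the LazyLoader on the spec and registers the module in sys.modules before calling exec_module. The early return and the explicit error for a missing spec are additions for this illustration. I have not checked whether this variant behaves any differently for nltk submodules.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose real import runs on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before exec_module, as the docs do
    loader.exec_module(module)  # sets up deferral; module code has not run yet
    return module
```

One caveat that may matter for submodules: `importlib.util.find_spec` imports the parent package when given a dotted name, so a call such as `lazy_import("nltk.tokenize")` would still import `nltk` itself eagerly. Whether that explains the problem described above is untested.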

This needs more experimentation but for now I'm just opening the issue...
