session: don't load all built-in plugins at once #4741

@bastimeyer

I've brought up the problem of how Streamlink loads its plugins several times already, e.g. last year in the proposal for the new plugin matchers, but I never opened a dedicated thread for it, so let's do that now. Don't expect anything to come of this for now though; I just want to write it down.


Issue

Every time a new Streamlink session gets initialized, it loads all built-in plugins at once, so that it can read the URL regexes of the plugin classes for its URL-to-plugin resolver and so that plugin arguments become available in streamlink_cli.

The big issue is that, even though plugin files are byte-compiled by the Python interpreter for faster loading, the plugin modules contain a lot of runtime logic. That logic unnecessarily delays session initialization and keeps a ton of unneeded junk in memory, since the session's plugins dict holds references to every plugin.

Loading only the plugin that's needed to find streams from the input URL would mean no wasted time during session initialization and no wasted memory, and lots of module imports, for example certain stream implementations, could be avoided as well. On fast systems this is all negligible, but why be wasteful and have a plugin loading system like this?

I would even say that resolving this plugin loading issue could lead to some of the plugin request/merging rules being loosened, as adding more built-in plugins wouldn't be too bad. But that's a different topic.

Solution

The idea behind the plugin matchers API added last year wasn't just to allow defining URLs declaratively, but also to make it easier for a parser to analyze the AST of plugin files. A pre-build step when installing or building wheels could then generate a JSON file which the session loads as an alternative to the plugin modules themselves.

I've recently had another go at it and got the bare logic working, so I have a pretty good idea of what needs to be done to implement this properly.
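To illustrate the AST idea, here is a minimal sketch using Python's `ast` module. It is not the actual parser: the example plugin source, the `extract_matchers` helper, the output layout, and the default priority value are all assumptions made for illustration, and regex flags passed to `re.compile` are ignored.

```python
import ast
import json

# Assumed default priority for matchers that don't set one explicitly.
DEFAULT_PRIORITY = 20

# Hypothetical plugin source; a real build step would read plugin files from disk.
PLUGIN_SOURCE = r'''
import re
from streamlink.plugin import Plugin, pluginmatcher

@pluginmatcher(re.compile(r"https?://example\.com/live/(?P<channel>\w+)"))
@pluginmatcher(re.compile(r"https?://example\.com/vod/\d+"), priority=10)
class Example(Plugin):
    pass
'''


def extract_matchers(source: str) -> list:
    """Collect pattern and priority of each @pluginmatcher without importing the module."""
    matchers = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for deco in node.decorator_list:
            # Only look at calls of the form pluginmatcher(...)
            if not (isinstance(deco, ast.Call)
                    and isinstance(deco.func, ast.Name)
                    and deco.func.id == "pluginmatcher"
                    and deco.args):
                continue
            compiled = deco.args[0]
            # Expect re.compile("<string literal>", ...) as the first argument
            if not (isinstance(compiled, ast.Call)
                    and compiled.args
                    and isinstance(compiled.args[0], ast.Constant)):
                continue
            priority = DEFAULT_PRIORITY
            for kw in deco.keywords:
                if kw.arg == "priority" and isinstance(kw.value, ast.Constant):
                    priority = kw.value.value
            matchers.append({
                "plugin": node.name,
                "pattern": compiled.args[0].value,
                "priority": priority,
            })
    return matchers


matchers = extract_matchers(PLUGIN_SOURCE)
print(json.dumps(matchers, indent=2))
```

Because the source is only parsed and never executed, none of the plugin's imports or runtime logic run during this step.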

In addition to the plugin matchers, I would prefer having plugin arguments defined as class decorators as well, so that the parser can be simplified. I've already written a parser for the documentation's plugin matrix which is based on the current plugin arguments definitions, but that can be changed easily. Moving the plugin args is simple and won't be a breaking change for custom plugins.
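As a rough sketch of what argument definitions as class decorators might look like: the `pluginargument` name, its parameters, and the `arguments` class attribute below are hypothetical and only show the declarative style, not an existing API.

```python
def pluginargument(name: str, **options):
    """Hypothetical class decorator that records one argument definition on the class."""
    def decorator(cls):
        # Copy the list so a subclass doesn't mutate its parent's list.
        arguments = list(getattr(cls, "arguments", []))
        # Decorators are applied bottom-up, so prepend to keep source order.
        arguments.insert(0, {"name": name, **options})
        cls.arguments = arguments
        return cls
    return decorator


@pluginargument("email", help="Account e-mail address")
@pluginargument("password", sensitive=True, help="Account password")
class ExamplePlugin:
    pass


print([arg["name"] for arg in ExamplePlugin.arguments])
```

With the definitions sitting in the decorator list, a parser only has to inspect the decorators of each plugin class, the same way it would for the matchers.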

Plan

  1. Turn plugin arguments into class decorators, so that both URL regexes and arguments can be defined in the same declarative style
  2. Add a pre-build step when installing / building wheels
    • Implement plugin AST visitor which outputs data containing a list of URL regexes+priorities and plugin arguments
    • Add a build_py setuptools hook (via versioningit's cmdclass wrapper) which writes the data as JSON
    • This should not affect sdists and editable installs
  3. Rewrite Session plugin loading logic
    • Load JSON file if it exists, fall back to load-everything logic
    • Resolve from JSON if available, only load resolved plugin
    • Always load all custom plugins and resolve accordingly
  4. Rewrite streamlink_cli.main.setup_plugin_args()
    • All available plugin arguments need to be loaded from JSON because of config files containing those args
    • The parser_helper function used when building the docs should always load all built-in plugins
  5. Optionally add CLI arg for loading all plugins
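
Step 3 of the plan could look roughly like the sketch below. The JSON layout (a list of entries with `module`, `pattern` and `priority` keys), the function names, and the use of `re.match` with "highest priority wins" are assumptions for illustration, not the session's actual implementation.

```python
import importlib
import json
import re
from pathlib import Path
from typing import Optional


def load_matcher_data(path: Path) -> Optional[list]:
    """Return the pre-built matcher list, or None if the JSON file doesn't exist."""
    if not path.is_file():
        return None  # caller falls back to the load-everything logic
    return json.loads(path.read_text(encoding="utf-8"))


def resolve_module_name(url: str, matchers: list) -> Optional[str]:
    """Pick the module whose pattern matches the URL with the highest priority."""
    best = None
    for entry in matchers:
        if not re.match(entry["pattern"], url):
            continue
        if best is None or entry["priority"] > best["priority"]:
            best = entry
    return best["module"] if best else None


def import_resolved(module_name: str, package: str = "streamlink.plugins"):
    """Import only the single plugin module that was resolved from the JSON data."""
    return importlib.import_module(f"{package}.{module_name}")


# Hypothetical pre-built data, as it might be read from the JSON file
example_matchers = [
    {"module": "generic", "pattern": r"https?://example\.com/", "priority": 20},
    {"module": "live", "pattern": r"https?://example\.com/live/", "priority": 30},
]
print(resolve_module_name("https://example.com/live/foo", example_matchers))
```

Custom plugins are not covered by the pre-built data, so they would still need to be loaded and matched the old way, as noted in the plan.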
