Context
Using mkdocstrings to document Python packages in a monorepo where sub-libraries have their own virtual environments alongside their source code.
Bug description
The _list_sources function in config.py recursively scans every directory under each entry of mkdocstrings.handlers.python.paths to enumerate Python files for cache invalidation. There is no way to exclude directories from this scan.
The paths option must point to the parent of the top-level package so that griffe can resolve the full module path. For example, to document mypkg, paths must include the directory containing the mypkg/ package — you cannot point deeper (e.g. to mypkg/ directly) because griffe would fail to resolve mypkg.hello (it looks for mypkg/ as a child of the search path).
This means paths necessarily points to a broad directory that may contain non-source content. If that directory includes a virtual environment (.venv, venv, or any other name), _list_sources picks up tens of thousands of third-party .py files. This causes extreme memory usage (25+ GB observed) and the build hangs indefinitely.
There is currently no way for the user to work around this at the configuration level. A possible solution would be an exclude option for _list_sources, for example:
[project.plugins.mkdocstrings.handlers.python]
paths = ["lib"]
exclude = [".venv", "venv"]
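One way such an exclude option could behave is sketched below: a directory walk that prunes excluded names before descending into them, so their contents are never enumerated. This is an illustration under assumptions, not the actual _list_sources code. The function name list_python_sources and the skip_venvs parameter are hypothetical. The skip_venvs check relies on the fact that environments created by python -m venv contain a pyvenv.cfg file at their root, which would also cover environments with arbitrary names.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Iterable, Iterator


def list_python_sources(
    root: str | os.PathLike,
    exclude: Iterable[str] = (),
    skip_venvs: bool = True,
) -> Iterator[Path]:
    """Yield .py files under root, never descending into excluded directories.

    Hypothetical sketch: `exclude` holds directory names (not globs), and
    `skip_venvs` skips any directory that contains a pyvenv.cfg file.
    """
    excluded = set(exclude)
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for name in dirnames:
            if name in excluded:
                continue
            if skip_venvs and os.path.isfile(os.path.join(dirpath, name, "pyvenv.cfg")):
                continue
            kept.append(name)
        # Assigning in place tells os.walk (top-down mode) which directories to visit.
        dirnames[:] = kept
        for name in filenames:
            if name.endswith(".py"):
                yield Path(dirpath) / name
```

Pruning during the walk, rather than filtering the result afterwards, keeps the cost proportional to the source tree instead of the size of the environment.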
As a workaround, _list_sources can be monkeypatched before calling build() to filter out unwanted paths.
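A minimal sketch of that kind of monkeypatch follows. It assumes that _list_sources returns an iterable of path-like values (strings or Path objects); its real signature and return type are not shown in this report, so the wrapper passes all arguments through unchanged. The name without_excluded is hypothetical.

```python
from __future__ import annotations

import functools
from pathlib import Path
from typing import Callable, Iterable


def without_excluded(
    list_sources: Callable[..., Iterable],
    exclude: Iterable[str],
) -> Callable[..., list]:
    """Wrap a source-listing function and drop results inside excluded directories.

    Assumption: the wrapped function returns an iterable of str or Path values.
    """
    excluded = set(exclude)

    @functools.wraps(list_sources)
    def wrapper(*args, **kwargs):
        return [
            path
            for path in list_sources(*args, **kwargs)
            # Drop a path if any of its components is an excluded name.
            if excluded.isdisjoint(Path(path).parts)
        ]

    return wrapper


# Usage (module path is an assumption; apply before calling build()):
#   config._list_sources = without_excluded(config._list_sources, [".venv", "venv"])
```

Note that this only filters the result. The original function still walks the excluded directories, so the wrapper helps only if the cost lies in what happens to the returned paths rather than in the scan itself.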
Reproduction
I have attached a .zip file with a minimal reproduction. The zip contains a zensical project with a lib/mypkg source package and mkdocstrings configured with paths = ["lib"]. The virtual environment is not included in the zip due to its size — it must be created as part of the steps below.
zensical-repro.zip
Steps to reproduce
- Unzip the attached file
- Install zensical and mkdocstrings:
pip install "zensical>=0.0.23" "mkdocstrings[python]"
- Create a virtual environment inside lib/ to simulate a sub-library with its own environment:
python -m venv lib/myenv
lib/myenv/bin/pip install requests pandas numpy django flask sqlalchemy boto3 scipy matplotlib scikit-learn
- Run the build:
zensical build -f zensical.toml --clean
- Observe that the build hangs and memory usage grows continuously
- For comparison, remove the virtual environment and rebuild:
rm -rf lib/myenv
zensical build -f zensical.toml --clean
The build completes in under a second.
Note: narrowing paths to lib/mypkg is not an option when the documented module is mypkg — griffe requires mypkg/ to be a direct child of the search path to resolve mypkg.hello.
Browser
No response