https://github.com/crate-ci/typos is an awesome hook and runs super fast. However, since it processes every file regardless of language, the number of files this hook has to process at once can be very large on big codebases. When running on a small CI machine, this can kill the machine with an out-of-memory error.
I already did some investigation and compared the hook's behavior when running `prek run -a typos`:
- without `require_serial`:
  - `DEBUG: Running priority group with priority XX with concurrency N`
  - `TRACE run{hook_id=typos language=python}: Running typos total_files=10000 concurrency=N`
- with `require_serial: true`:
  - `DEBUG: Running priority group with priority XX with concurrency N`
  - `TRACE run{hook_id=typos language=python}: Running typos total_files=10000 concurrency=1`
where `N` is the number of available cores (I guess). On a small machine with 4 GB or 8 GB of memory, this can easily cause an OOM kill.
I am not sure whether this is a typos problem or a prek problem, so I wanted to start here and maybe create an issue in https://github.com/crate-ci/typos as well.
Would it make sense to specify a file limit or something in prek so that the hook is called multiple times in sequence when the number of files exceeds the limit?
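
To make the idea concrete, here is a minimal Python sketch of what such a limit could do. It is not prek's implementation, and the names `batched`, `run_in_batches`, and the `limit` parameter are hypothetical. It splits the file list into batches of at most `limit` files and invokes the hook command once per batch, strictly one after another:

```python
import subprocess
from itertools import islice


def batched(files, limit):
    """Yield consecutive lists of at most `limit` files (hypothetical helper)."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    it = iter(files)
    while True:
        batch = list(islice(it, limit))
        if not batch:
            return
        yield batch


def run_in_batches(cmd, files, limit, runner=subprocess.call):
    """Run `cmd` once per batch, sequentially.

    Returns 0 only if every invocation exited with 0, otherwise 1.
    `runner` is injectable so the logic can be exercised without
    spawning a real process.
    """
    status = 0
    for batch in batched(files, limit):
        # Each call only receives `limit` file paths at most, so a
        # single invocation never sees the whole file list.
        if runner([*cmd, *batch]) != 0:
            status = 1
    return status
```

With `total_files=10000` and a limit of 1000, this would result in ten sequential calls, for example `run_in_batches(["typos"], all_files, 1000)`. Whether that actually reduces peak memory depends on where the memory is being used, which is the open question above.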