Conversation
andrewpollock
left a comment
LGTM, a few minor things.
Curious as to why 7?
docker/exporter/export_runner.py
parser.add_argument(
    '--processes',
    help='Maximum number of parallel exports',
    default=DEFAULT_EXPORT_PROCESSES)
It might be tidier to default to os.cpu_count() here, that way this just does the right/intended thing if the number of CPUs is increased in the Kubernetes job spec?
Good point, updated to use os.cpu_count() by default, falling back to DEFAULT_EXPORT_PROCESSES if the CPU count can't be determined.
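A minimal sketch of what that default could look like (the constant's value here is illustrative; the real one lives in export_runner.py):

```python
import argparse
import os

# Illustrative stand-in for the constant defined in export_runner.py.
DEFAULT_EXPORT_PROCESSES = 7

parser = argparse.ArgumentParser()
parser.add_argument(
    '--processes',
    type=int,
    # os.cpu_count() can return None on some platforms, so fall back
    # to the static default in that case.
    default=os.cpu_count() or DEFAULT_EXPORT_PROCESSES,
    help='Maximum number of parallel exports')

args = parser.parse_args([])
```

With this, bumping the CPU request in the Kubernetes job spec automatically raises the export parallelism without a code change.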
docker/exporter/exporter.py
    '--bucket',
    help='Bucket name to export to',
    default=DEFAULT_EXPORT_BUCKET)
parser.add_argument('--ecosystem', required=True, help='Ecosystem to upload')
If I'm understanding this correctly, the intended use of this flag is either an ecosystem name, or the special string "list", which retains the old behaviour? Please call out this "list" value in the flag help text.
The "list" value uploads the ecosystem.txt ecosystem list. Called this out in the help message now!
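A sketch of the updated flag declaration with the special "list" value called out in the help text (exact wording is assumed, not copied from the PR):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    '--ecosystem',
    required=True,
    # "list" is a special value retaining the old behaviour: instead of
    # exporting one ecosystem, upload the ecosystem.txt list of all
    # ecosystems.
    help='Ecosystem to upload, or "list" to upload the ecosystem.txt '
         'ecosystem list')

args = parser.parse_args(['--ecosystem', 'list'])
```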
Ended up at 7 because that's the maximum number of cores a pod on an 8-core node can support (some CPU is consumed by the metrics and logging drivers). It doesn't matter too much now; I'll switch this to 6 for a rounder number.
andrewpollock
left a comment
Awesome performance improvement
The exporter currently takes too long to complete (around 1 hour and 10 minutes), and will only take longer as the OSV database expands.
This change parallelizes the exporter across ecosystems, separating the per-ecosystem export logic from the selection of each ecosystem. This roughly reduces the total runtime to that of the slowest ecosystem export, around 13 minutes.
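The fan-out pattern described can be sketched with a process pool; the function names below (export_ecosystem, export_all) are hypothetical, not taken from the PR:

```python
import multiprocessing


def export_ecosystem(ecosystem):
  # Stand-in for the real per-ecosystem export; the actual exporter
  # would query the OSV database and upload results to the bucket.
  return f'exported {ecosystem}'


def export_all(ecosystems, processes):
  # Run up to `processes` exports in parallel, one ecosystem per task.
  # Total wall time approaches that of the slowest single ecosystem.
  with multiprocessing.Pool(processes) as pool:
    return pool.map(export_ecosystem, ecosystems)
```

Since each ecosystem export is independent, there is no shared state to coordinate and the work divides cleanly across worker processes.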