quay-pruner: completely overhaul prune-quay.py #2496
Pruning had stopped working (up to >26000 image tags), and the reasons were many: one, pruning's always been less deterministic than I'd hope; two, when I switched us to ceph.git/container for building images, I mistakenly changed the format of the 'fulltag' (no longer has a short sha1 in it) and that was sort of driving the pruning process; three, I suspect some of the newer flavors etc. were slipping through the cracks.

So here's an attempt to fix all that by changing the algorithm fundamentally; now, tags of a certain manifest digest are considered at the same time, and their sha1 checked in shaman as usual (but only their sha1); if it's found, the tags are all left, and if not, they're all removed. This should be cleaner, faster, and more reliable.

Also refactored a lot of the worker routines to util.py so I could add some helper/debug/info scripts:

get-tagdates.py generates JSON showing tag-to-age for examining the state of things

delete-tags.py takes tags on the CLI to delete, or can be invoked with '--stragglers <age>' to remove anything older than age (as long as it's not in shaman or seems like it might be a 'distinguished' build (with recent release names in its name)).

prune-quay.py also now reports summary statistics of its operation.
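For readers following along, here is a minimal sketch of the per-digest decision described above. It is not the code from prune-quay.py or util.py: the record shape (`name`, `manifest_digest`, `sha1`), the function name `plan_prune`, and the `sha1_in_shaman` callback are all assumptions made for illustration, and the real scripts may derive the sha1 and query shaman quite differently.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def plan_prune(
    tags: Iterable[Dict[str, str]],
    sha1_in_shaman: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Return (keep, remove) lists of tag names, decided one manifest digest at a time.

    Hypothetical record shape: {"name": ..., "manifest_digest": ..., "sha1": ...}.
    """
    # Group every tag under the manifest digest it points at.
    by_digest: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for tag in tags:
        by_digest[tag["manifest_digest"]].append(tag)

    keep: List[str] = []
    remove: List[str] = []
    for group in by_digest.values():
        # Only the sha1 is checked; all tags of the digest share the outcome.
        # Assumption: a group is kept if any of its sha1s is known to shaman.
        sha1s = {t["sha1"] for t in group if t.get("sha1")}
        found = any(sha1_in_shaman(s) for s in sha1s)
        names = [t["name"] for t in group]
        (keep if found else remove).extend(names)
    return keep, remove
```

Passing the shaman lookup in as a callback keeps the decision logic testable without network access; a set membership test such as `lambda s: s in known_sha1s` is enough to exercise it.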
3e337b2 to 9da0f83
|
retest |
zmc left a comment
Looks like a good functional change, but I had some questions and some small changes I'd like to see.
I'm unsure if this is overkill, but I feel like it might be helpful to use some type hints to guard against things like accidentally changing the shape of objects returned by helper functions.
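As a rough illustration of the kind of type hints meant here, a `TypedDict` can pin down the shape a helper returns. The names `TagInfo` and `parse_tags` and the three fields are hypothetical, not taken from util.py.

```python
from typing import Any, Dict, List, TypedDict


class TagInfo(TypedDict):
    """Hypothetical shape of one tag record returned by a helper."""

    name: str
    manifest_digest: str
    last_modified: str


def parse_tags(raw: List[Dict[str, Any]]) -> List[TagInfo]:
    """Reduce raw registry records to the fields callers rely on."""
    return [
        TagInfo(
            name=r["name"],
            manifest_digest=r["manifest_digest"],
            last_modified=r["last_modified"],
        )
        for r in raw
    ]
```

With annotations like these, a static checker such as mypy can flag a caller that reads a key the helper no longer returns. At runtime a `TypedDict` is still a plain dict, so nothing is enforced without running a checker.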
|
Thanks for the review. I don't really disagree with anything. It's inflexible on domain mostly because this is very specific to the tagging strategy we use for ceph-ci/ceph, and would take some more generalization to be more generally useful, I think. I'll step through and make the changes requested in separate commits and let you know when I'm through. |
|
re: the hardcoding, I know these scripts wouldn't be able to just be pointed at some other registry and magically work, but I did see that there had been some effort to avoid hardcoding via |
|
rewhacked everything but the type hints; I agree and almost did some, but then remembered I'd also have to add a checker and the infra to run it, and got cold feet and decided to save it for later. |
- explain --stragglers a bit in help text
- add re match group names and use them
- use variables for URLs in messages
- dryrun -> dry-run
- remove dead code

will squash before merge if approved.

Signed-off-by: Dan Mick <dan.mick@redhat.com>
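A small sketch of what three of the items in that list can look like in practice: named regex groups, a `--dry-run` flag, and help text for `--stragglers`. Everything specific here is assumed for illustration: the tag layout in `TAG_RE`, the unit of the age argument (days), the release names, and the function names. None of it is copied from the scripts in this PR.

```python
import argparse
import re
from typing import Dict, Optional, Sequence

# Hypothetical tag layout: <ref>-<short sha1>-<distro>-<arch>
TAG_RE = re.compile(
    r"^(?P<ref>.+)-(?P<sha1>[0-9a-f]{7})-(?P<distro>[a-z0-9]+)-(?P<arch>x86_64|arm64)$"
)


def parse_tag(tag: str) -> Optional[Dict[str, str]]:
    """Return the named groups of a matching tag, or None if it does not match."""
    m = TAG_RE.match(tag)
    return m.groupdict() if m else None


def is_straggler(
    name: str,
    age_days: int,
    in_shaman: bool,
    min_age_days: int,
    release_names: Sequence[str] = ("squid", "tentacle"),  # illustrative list
) -> bool:
    """True if the tag is old, unknown to shaman, and not a 'distinguished' build."""
    distinguished = any(r in name for r in release_names)
    return age_days > min_age_days and not in_shaman and not distinguished


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Delete tags (illustrative sketch).")
    p.add_argument("-n", "--dry-run", action="store_true",
                   help="report what would be deleted without deleting anything")
    p.add_argument("--stragglers", metavar="AGE_DAYS", type=int,
                   help="delete tags older than AGE_DAYS that are not in shaman "
                        "and do not carry a recent release name")
    p.add_argument("tags", nargs="*", help="explicit tags to delete")
    return p
```

Named groups let callers write `parse_tag(t)["sha1"]` instead of relying on positional indexes, so the pattern can be reordered without breaking them. argparse exposes `--dry-run` as `args.dry_run`.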
|
@zmc anything else? |
|
@zmc anything else you'd like? |
Considering this is already broken now, merging