quay-pruner: completely overhaul prune-quay.py #2496

Merged
djgalloway merged 2 commits into ceph:main from dmick:fix-quay-pruner
Dec 5, 2025
@dmick
Member

@dmick dmick commented Nov 11, 2025

Pruning had stopped working (we were up to more than 26000 image tags), and the reasons were many. One, pruning's always been less deterministic than I'd hope. Two, when I switched us to ceph.git/container for building images, I mistakenly changed the format of the 'fulltag' (it no longer has a short sha1 in it), and that format was more or less driving the pruning process. Three, I suspect some of the newer flavors etc. were slipping through the cracks.

So here's an attempt to fix all that by changing the algorithm fundamentally: now, all tags that share a manifest digest are considered together, and their sha1 is checked in shaman as usual (but only their sha1). If it's found, the tags are all kept; if not, they're all removed. This should be cleaner, faster, and more reliable.
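
A minimal sketch of the per-digest decision described above, not the actual prune-quay.py code. The tag records, the field names `name`, `manifest_digest`, and `sha1`, and the `sha1_in_shaman` callback are all assumptions made for illustration. Keeping a group that has no recognizable sha1 is also an assumption, chosen here to be conservative.

```python
from collections import defaultdict
from typing import Callable, Iterable


def plan_pruning(
    tags: Iterable[dict],
    sha1_in_shaman: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Return (keep, remove) lists of tag names, decided per manifest digest."""
    # Group tags so that every tag pointing at one manifest shares a fate.
    by_digest: dict[str, list[dict]] = defaultdict(list)
    for tag in tags:
        by_digest[tag["manifest_digest"]].append(tag)

    keep: list[str] = []
    remove: list[str] = []
    for group in by_digest.values():
        names = [t["name"] for t in group]
        # Only the sha1 is looked up in shaman, nothing else about the tag.
        sha1s = {t["sha1"] for t in group if t.get("sha1")}
        # Assumption: a group with no sha1 at all is left alone.
        if not sha1s or any(sha1_in_shaman(s) for s in sha1s):
            keep.extend(names)
        else:
            remove.extend(names)
    return keep, remove
```

Because the decision is made once per digest, the number of shaman lookups scales with the number of distinct builds rather than the number of tags.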

Also refactored a lot of the worker routines to util.py so I could add some helper/debug/info scripts:

get-tagdates.py generates JSON showing tag-to-age for examining the state of things

delete-tags.py takes tags on the CLI to delete, or can be invoked with '--stragglers <age>' to remove anything older than <age>, as long as it's not in shaman and doesn't look like a 'distinguished' build (one with a recent release name in its name).
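
The straggler rule can be sketched as a pure filter. This is an illustration under assumptions, not delete-tags.py itself: the age is taken to be a number of days, `RECENT_RELEASES` is a made-up list, and `in_shaman` stands in for the real shaman lookup.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

# Hypothetical list of release names that mark a 'distinguished' build.
RECENT_RELEASES = ("squid", "tentacle")


def find_stragglers(
    tag_dates: dict[str, datetime],
    max_age_days: int,
    in_shaman: Callable[[str], bool],
    now: Optional[datetime] = None,
) -> list[str]:
    """Return tags older than max_age_days that nothing seems to need."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stragglers = []
    for tag, modified in sorted(tag_dates.items()):
        if modified >= cutoff:
            continue  # not old enough
        if in_shaman(tag):
            continue  # shaman still knows about this build
        if any(rel in tag for rel in RECENT_RELEASES):
            continue  # looks like a distinguished build, leave it alone
        stragglers.append(tag)
    return stragglers
```

Passing `now` explicitly keeps the filter deterministic, which makes it easy to check against a saved tag-to-date dump.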

prune-quay.py also now reports summary statistics of its operation.

@dmick
Member Author

dmick commented Nov 13, 2025

retest

zmc previously requested changes Nov 14, 2025
Member

@zmc zmc left a comment

Looks like a good functional change, but I had some questions and some small changes I'd like to see.

I'm unsure if this is overkill, but I feel like it might be helpful to use some type hints to guard against things like accidentally changing the shape of objects returned by helper functions.
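
As a rough illustration of that suggestion (a sketch only, with a hypothetical `TagInfo` shape and `parse_tag` helper that are not taken from util.py), a `TypedDict` pins down the keys a helper returns so that a checker such as mypy can flag accidental changes:

```python
from typing import TypedDict


class TagInfo(TypedDict):
    """Hypothetical shape for one tag record returned by a helper."""

    name: str
    manifest_digest: str
    last_modified: str


def parse_tag(raw: dict) -> TagInfo:
    # A type checker would flag a missing or misspelled key here,
    # and any caller that reads a key TagInfo does not declare.
    return TagInfo(
        name=raw["name"],
        manifest_digest=raw["manifest_digest"],
        last_modified=raw["last_modified"],
    )
```

At runtime a `TypedDict` is an ordinary dict, so this adds no behavior; the guard only takes effect once a checker is run over the scripts.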

@dmick
Member Author

dmick commented Nov 14, 2025

Thanks for the review. I don't really disagree with anything.

It's inflexible on domain mostly because this is very specific to the tagging strategy we use for ceph-ci/ceph, and would take some more generalization to be more generally useful, I think.

I'll step through and make the changes requested in separate commits and let you know when I'm through.

@zmc
Member

zmc commented Nov 14, 2025

re: the hardcoding, I know these scripts couldn't just be pointed at some other registry and magically work, but I did see that there had been some effort to avoid hardcoding via util.QUAYBASE. Not really a blocker, but there is at least one instance where the output uses the wrong domain.

@dmick
Member Author

dmick commented Nov 15, 2025

rewhacked everything but the type hints; I agree and almost did some, but then remembered I'd also have to add a checker and the infra to run it, and got cold feet and decided to save it for later.

- explain --stragglers a bit in help text
- add re match group names and use them
- use variables for URLs in messages
- dryrun -> dry-run
- remove dead code

will squash before merge if approved.

Signed-off-by: Dan Mick <dan.mick@redhat.com>
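
The items in that commit message can be sketched together as follows. This is an illustration, not the merged code: the value of `QUAYBASE`, the tag layout matched by `TAG_RE`, the URL layout, and the `describe` helper are all assumptions; only the constant name `util.QUAYBASE` and the `--dry-run` spelling come from the thread.

```python
import argparse
import re

# Assumed value; the thread only names the constant util.QUAYBASE.
QUAYBASE = "https://quay.io/api/v1"

# Hypothetical tag layout: <branch>-<short sha1>-<distro>-<arch>
TAG_RE = re.compile(
    r"(?P<branch>.+)-(?P<sha1>[0-9a-f]{7})"
    r"-(?P<distro>[a-z0-9]+)-(?P<arch>[a-z0-9_]+)$"
)


def describe(tag: str, repo: str, dry_run: bool) -> str:
    """Build the log message for one tag, using named groups and QUAYBASE."""
    m = TAG_RE.match(tag)
    if m is None:
        return f"skipping {tag}: unrecognized format"
    action = "would delete" if dry_run else "deleting"
    # The URL is built from the variable rather than a hardcoded domain.
    url = f"{QUAYBASE}/repository/{repo}/tag/{tag}"
    return f"{action} {url} (branch={m['branch']}, sha1={m['sha1']})"


parser = argparse.ArgumentParser()
# argparse exposes --dry-run as args.dry_run.
parser.add_argument("--dry-run", action="store_true")
```

Named groups keep the message code readable even if the pattern's group order changes later, which positional `m.group(1)` style access would not survive.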
@dmick dmick requested a review from zmc November 15, 2025 01:00
@dmick
Member Author

dmick commented Nov 20, 2025

@zmc anything else?

@dmick dmick closed this Nov 20, 2025
@dmick dmick reopened this Nov 20, 2025
@dmick
Member Author

dmick commented Nov 25, 2025

@zmc anything else you'd like?

@dmick dmick requested a review from djgalloway December 5, 2025 22:59
@djgalloway djgalloway dismissed zmc’s stale review December 5, 2025 23:53

Considering this is already broken now, merging

@djgalloway djgalloway merged commit f5497c7 into ceph:main Dec 5, 2025
1 check passed