libimage: ListImages: pre-compute dangling/parent#1372
libimage: ListImages: pre-compute dangling/parent#1372openshift-merge-robot merged 5 commits intocontainers:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vrothberg The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
LGTM |
|
With this change listing 1400 images changes from .. ... to ... The difference grows exponentially with the number of images. |
mtrmac
left a comment
There was a problem hiding this comment.
I completely support the idea of ListImages computing all of this in one go (eventually, in the future, it could be a single SQLite read transaction), and ListImages callers opting into computing the expensive parts.
I’m not at all sure about storing the data in Image objects like this. AFAIK the parent/child relationships can change by adding/removing a different image (i.e. calling reload on the image that is caching data wouldn’t help).
All the pre-existing cache fields are values that don’t change over the lifetime of the image. And looking at methods like Image.Tag and Image.Mount, it seems quite possible to me that some caller could hold an Image object for a long time.
I’d rather prefer adding Runtime.ListImagesExtra which returns []ImageWithExtra, a new struct with extra fields.
libimage/runtime.go
Outdated
| // as the layer tree will computed once for all instead of once for | ||
| // each individual image (see containers/podman/issues/17828). | ||
|
|
||
| tree, err := r.layerTree() |
There was a problem hiding this comment.
Note that filterImages can also trigger a layerTree computation.
I suppose a clean way to deal with that could be to have a separate lazyLayerTree holder, passed to the various callees of of ListImages (and to things like getChildren?). That could also avoid the need for the if !(options.Dangling || options.Parent) above.
Asymptotically, calling layerTree once vs. twice makes no difference, and it’s a much smaller performance win than the primary purpose of this PR, so I’m absolutely fine with not dealing with the two calls in this PR. Maybe just a comment to note this duplication?
Could we instead add a I am personally not that worried about potentially outdated data because, in theory, many attributes are outdated on return. But the current caching would prevent users from getting fresh data if needed, which is not good. |
If that is intended to still cache data, that would be inconvenient to use by callers (OTOH that would make it more convenient to ask for fresh data, which is, probably, good). Or just avoid the separate
That’s a very fair point, all of the data is conceptually outdated by the time the store is unlocked — the image might not exist at all when |
a127be7 to
0a3efbb
Compare
|
@mtrmac PTanotherL. I think we're converging but want your blessing before adding tests and wrapping up. |
mtrmac
left a comment
There was a problem hiding this comment.
Overall, given the very true point that all of the data is necessarily outdated by the time the caller receives it, I don’t feel that strongly about the design any more — both this, and the caching, is fair enough.
I have some weak opinions, but I’m fine with the current approach in this PR.
Checking whether an image is dangling and finding a parent image requires building a layer tree. Computing a layer tree is expensive, so add options to `ListImages` to pre-compute the dangling and parent information ahead of time; that requires 1 layer tree instead of N. Context: containers/podman/issues/17828 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
0a3efbb to
5e751d5
Compare
mtrmac
left a comment
There was a problem hiding this comment.
LGTM, with or without the naming nit.
Computing the layer tree requires listing all images. Certain code paths have all images at hand already, so let's optimize a bit to avoid listing them redundantly. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
5e751d5 to
349b951
Compare
libimage: ListImages: pre-compute dangling/parent
|
@vrothberg there are merge conflicts |
This reverts commit 349b951. PR containers/pull/1372 hasn't been fully applied by the openshift-ci bot,so let's start with a clean state again. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Checking whether an image is dangling and finding a parent image requires building a layer tree. Computing a layer tree is expensive, so add options to `ListImages` to pre-compute the dangling and parent information ahead of time; that requires 1 layer tree instead of N. Further add an images argument to computing the layer tree. Context: containers/podman/issues/17828 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Should be resolved now. That was quiet ugly. It seems like the openshift-ci bot has applied some but not all commits in the PR. That in combination with me applying some changes to the wrong commit made it ugly. So I reverted the already applied commit and squashed the two old ones together. |
|
@Luap99 PTAL, feel free to merge |
|
/lgtm |
Sounds like a bug with the bot. |
Checking whether an image is dangling and finding a parent image requires building a layer tree. Computing a layer tree is expensive, so add options to
ListImagesto pre-compute the dangling and parent information ahead of time; that requires 1 layer tree instead of N. The information will then be cached in the images such that subsequent calls toIsDangling()andParent()can return the cached data.Context: containers/podman/issues/17828