Add support for lazily-pulled blobs in cache manager. #1475

tonistiigi merged 4 commits into moby:master
Conversation
hinshun
left a comment
My first pass at this PR.
At a high level, it seems like this differs from the proposal in #870, and this PR chose an approach that minimized changes. I think this is fine, but I'm not familiar with this part of the code so I'll leave the higher level discussion to @tonistiigi.
I'll write a summary here so I can quickly understand when I look at this again.
`GetByBlob` gets called in three use cases:

- During the scheduler loop, when it executes an edge, causing it to load the cached result from a cache manager. That in turn will call out to the worker to convert a `Remote` into an `ImmutableRef`.
- During cache import, when the cache manager is called to load the cached result from the cache manager constructed from the cache import.
- During cache export, when the solver calls out to the exporter to `ExportTo`, which in turn calls to load the `solver.Result`.

- `RefOption` is an `interface{}`, so consumers of these opts cast before applying them to their options struct.
- The `content.Provider`, `progress.Writer`, and image ref are plumbed down using `perSessionState`, which is mapped to the session ID.
- The meat of this PR is in `getLazyRef` added to the `*puller` and the `EnsureContentExists` added to `ImmutableRef`.
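The `RefOption`-as-`interface{}` pattern summarized above can be sketched roughly like this; the option names and the `applyOpts` helper are illustrative stand-ins, not buildkit's actual API:

```go
package main

import "fmt"

// RefOption is untyped, mirroring the pattern described in the summary:
// consumers type-switch over opts to fill their own options struct.
type RefOption interface{}

type withImageRef string
type withDescription string

type refOpts struct {
	imageRef    string
	description string
}

func applyOpts(opts ...RefOption) refOpts {
	var o refOpts
	for _, opt := range opts {
		switch v := opt.(type) {
		case withImageRef:
			o.imageRef = string(v)
		case withDescription:
			o.description = string(v)
		}
		// unknown opt types fall through silently, which is part of why
		// this untyped API gets called out as fragile in the review below
	}
	return o
}

func main() {
	o := applyOpts(withImageRef("docker.io/library/alpine:latest"), withDescription("pulled layer"))
	fmt.Println(o.imageRef)
	fmt.Println(o.description)
}
```

The silent-ignore behavior for unrecognized options is exactly the kind of thing a typed API would surface at compile time.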
# Solve case

```
(*scheduler).dispatch
-> (*edge).unpark
-> (*edge).execIfPossible
-> (*edge).loadCache
-> (*sharedOp).LoadCache
-> (*combinedCacheManager).Load
-> (*cacheManager).LoadWithParents
-> (*cacheResultStorage).LoadWithParents
-> (Worker).FromRemote
-> (Accessor).GetByBlob
-> (*cacheResultStorage).Load
-> (Worker).FromRemote
-> (Accessor).GetByBlob
```

# Cache import case

```
(*lazyCacheManager).Load
-> (*cacheManager).Load
-> (*cacheResultStorage).LoadWithParent
-> (Worker).FromRemote
-> (Accessor).GetByBlob
```

# Cache export case

```
(*Solver).Solve
-> (*exporter).ExportTo
-> (*cacheResultStorage).Load
```
@hinshun Thanks for the comments! I will push the updates asap in the next day, just can't make them quite yet because I haven't set my desktop back up yet after moving to a new place.
Yeah the implementation is different than the original issue proposed, but I believe it's closer to what @tonistiigi suggested more recently on slack when discussing this. He can confirm though.
Your summary is good, I think it's just missing the cases where the solver has a
That's correct, I'd just add that the changes to
As discussed in slack, one of the issues that came up with this was the case where a lazy ref is looked up from the cache by another build request. This is the part currently solved by every request creating a lazy-ref in the CacheKey method.
The current approach seems risky and doesn't really follow to ops/solver logic. I think there are 2 possible solutions to avoid this that involve some tweaks to the cache logic.
The first approach would make the cache lookups not work while a ref is lazy. With this approach, when we call cache.Save(), for a lazy ref we would instead store this position as a callback, and if the ref gets un-lazied these callbacks would get executed and add the cache records. As lazy refs are only in memory, cleanup of these callbacks would be easy as well.
The problem, though, is that this doesn't solve the same problem for the mergeop and stargz cases. In these cases, we definitely would want to have cache working even if stargz has not pulled the ref or if part of the merged ref has not been pulled.
The second approach is that when a lazy ref is accessed by the generated ID (like cache does) it can fail with a typed error requiring that you pass extra information (like a provider for a descriptor and progress info) with the method. Now the question is how a cache lookup can provide this extra information. If the lazy ref already stores the image-ref(s) then it is not that complicated, as cache lookup already has access to the credentials through the session. But I'm not sure if it is a good idea to allow cache manager to just ask credentials for any previously cached image. An alternative would be to use something quite similar to the current solution where the CacheMap query would save possible "extra load info" that cache lookup could use. But ideally, this is not done through a global map indexed by session but saved back to the job so another vertex from the same job can access it. Maybe using the session as an index is fine as well if this is done outside refs. (The main issue with the current logic is that cache functions shouldn't create refs.) Not sure how the cleanup would work without storing it to the job though. LMK if this is too confusing... fyi @ktock
> `RefOption` is an `interface{}` so consumers of these opts cast before applying to their options struct.
@hinshun This is pretty bad. But let's not expand the scope of this PR with it. Feel free to create an issue for adding a more typed API for this (or just fix it :))
Okay interesting, I'm focusing on your second approach for now because it would be nice to have something that works for the stargz/mergeop cases too. Let me double-check my understanding of your suggestion here. In a previous iteration of this, instead of calling

That seems slightly closer to what you're suggesting relative to the current PR's approach, but still different in a few crucial ways:
Is that correct? Or am I misunderstanding? If so, makes total sense in general, but I think one tough part might be how to keep

What do you think about these possibilities? (I don't love any of them, but want to get your thoughts on them as a starting point):
I think smth like https://github.com/moby/buildkit/compare/master...tonistiigi:cache-helpers-example?expand=1 isn't that bad. No new imports are added to the solver package. If we want to avoid changing the Load() method signature we could hide this inside the context as well.

One thing I got wrong in that example branch is that helpers shouldn't just come directly from the current
sipsma
left a comment
> I think smth like https://github.com/moby/buildkit/compare/master...tonistiigi:cache-helpers-example?expand=1 isn't that bad. No new imports are added to the solver package. If we want to avoid changing the Load() method signature we could hide this inside the context as well.
Awesome, makes total sense, including the part about probably needing more of a callback that can traverse ancestor vertexes in the build graph. Did a rough pass of this approach; the only minor question mark left in my head is what to do in the corner case where the callback is looking for a load helper key (such as `content-provider:...`) and it's present in multiple ancestors in the build graph (i.e. the case we discussed previously where two image source vertexes resolve to the same image in different repos). For now, I'll probably just have it use whichever helper it happens to find first.
Will push the updates in the next few days.
Yes, that should be fine. Digests should provide stability here. Eg. if you today have 2 different image references in the same build that point to the same digest, pull also happens once and there is no predefined order defining which ref was used.
Force-pushed from fe195be to bd7f19f
Updated from previous comments and w/ a new integration test.

Travis is failing for this PR, but I believe it's unrelated; the only failure is

I tried an empty force-push to re-run the tests for this PR but got the same error on a different test in that suite, so I'll just leave it as is for now.
tonistiigi
left a comment
First pass. No big issues atm.
The biggest issue I have atm is with the "cache-opt" API and how it is split between packages. Should not need a cacheutil package. The generic types can be part of solver (can even use Digest as a key there), and cache package should define the strongly typed implementation side (Provider, Progress definition etc). Using the function to walk ancestors is good but I'm not a fan of using the context for passing anything to cache package. If we pass something with context then it should end where the cache load happens. I think the correct API for cache package is to return a typed error with descriptions about what load opts are needed, then cache loader invokes the ancestor walking to find the opts matching the requested keys and calls again. If context is used to connect cache loading with the ancestor walking I think it is better if context doesn't reimplement Value() but adds the typed walker logic as a single value to the context.
Things I didn't fully investigate yet:
- Progress of layer pulls does not complete
- Load metadata and resolve step seem to run twice for the same image (eg. the resolver is not reused). Don't think this is the case atm. Opentracing output of the http requests should help here.
- Unpacking happens under the progress of the next vertex.
- Ideally next vertex is not in running state while unlazy happens (you can leave this one to me as it probably means changing the solver a bit).
Done, now there's just
Done, I changed to just make the descriptor handler be a
Fixed, there was a bug in
Believe I fixed this. IIUC, basically the issue was that

I solved part of it by just having

Additionally, because the lease on the manifests pulled during

Based on the debug logs during test runs, I think this is working now. Let me know if you are okay with this solution and/or if I was misunderstanding the concern.
Fixed. A couple additional notes:
The only failure in Travis appears to just be one of the ephemeral
tonistiigi
left a comment
> Load metadata and resolve step seem to run twice for the same image (eg. the resolver is not reused).
I think this is still the case based on what I see in progressbar. I can add some debug to figure out what is going on.
Also, sorry for the merge conflicts I caused 😳
cache/opts.go (outdated):

```go
	started *time.Time
	Provider content.Provider
	ImageRef string
	ProgressWriter progress.Writer
```
I'm still a bit annoyed that we give the ref access to the generic progress.Writer and the VertexDigest, VertexName that are solver-specific concepts. Also, not sure how it works with FromRemote as I don't see it passed there. I think we should only pass the narrow interface for the actions the cache pkg needs.
Okay, so what do you think about CacheKey just passing some callback funcs used to start/complete the progress?
I think define an interface for the methods you need in the cache pkg and then pass in implementation for it when needed.
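The "define a narrow interface in the cache pkg and pass in an implementation" suggestion might look roughly like this sketch; the interface and type names are made up for illustration:

```go
package main

import "fmt"

// events records what the progress implementation was asked to do,
// purely so the example is observable.
var events []string

// ProgressController is a hypothetical narrow interface declared by the
// cache pkg: only the actions it needs, no generic progress.Writer and
// no solver concepts like VertexDigest/VertexName.
type ProgressController interface {
	Start(desc string)
	Done(err error)
}

// pullProgress is the solver-side implementation handed to the cache pkg.
type pullProgress struct{ vertexName string }

func (p pullProgress) Start(desc string) {
	events = append(events, p.vertexName+": start "+desc)
}

func (p pullProgress) Done(err error) {
	events = append(events, fmt.Sprintf("%s: done err=%v", p.vertexName, err))
}

// unlazy stands in for cache-side code that only ever sees the interface.
func unlazy(pc ProgressController) {
	pc.Start("pulling sha256:abc")
	pc.Done(nil)
}

func main() {
	unlazy(pullProgress{vertexName: "docker-image://alpine"})
	for _, e := range events {
		fmt.Println(e)
	}
}
```

The cache package stays decoupled from the solver: it compiles against `ProgressController` only, and the solver can change how progress is rendered without touching cache code.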
cache/opts.go (outdated):

```go
	VertexName string
}

type MissingDescHandler DescHandlerKey
```
cache/opts.go (outdated):

```go
	return fmt.Sprintf("missing descriptor handler for lazy blob %q", digest.Digest(m))
}

func (m MissingDescHandler) FindIn(getter func(interface{}) interface{}) *DescHandler {
```
I think this should not use interface{}. Worker is an ok level at which to do the type cast.
```go
	}
	ctx = opentracing.ContextWithSpan(progress.WithProgress(ctx, s.st.mpw), s.st.mspan)
	ctx = session.NewContext(ctx, s.st.getSessionID())
	ctx = withAncestorCacheOpts(ctx, s.st)
```
I recently took sessionID out of the context in #1551, so it makes it weird to add it back in here. Maybe you can even combine it with the Group interface I added for sessions. This can be a follow-up as well.
Actually, I think this PR reintroduces one of the things fixed in #1551 . Eg. the fetcher created in pull is already connected to a session but that session might change later. Would be better if the real session.Group is added to the request in here so when Provider is accessed it already is connected to updated session. Let me know if this gets too messy for you. We can merge without it and I can carry/fix this later.
Okay, I'll take a closer look at #1551 and give integrating with the Group a shot
Alright, took a look, my understanding is the issue comes down to the fact that we construct a fetcher here from the resolver, at which time credentials will be called with whatever groups are present at the time. That's bad because it could be much later and under a different vtx that the actual de-lazy happens but the provider will still use that possibly stale fetcher.
If that's correct then I guess the fix is, in part, to recreate the fetcher from the resolver when the actual pull occurs rather than use the one created earlier, which is easy enough. My only question is if you think it's okay at that time (i.e. this line) to use the same resolver as before (which should be updated w/ sessions in calls to CacheMap and Snapshot such as here), or do you think we need to make a new resolver using the session.Group from the vtx where the actual de-lazy is occurring (i.e. here or in the exporter)?
Basically, should the de-lazy use the session.Groups from the original vtx where the lazy ref got created, or the session.Group from the vtx where the de-lazy is occurring? Or is it irrelevant (I haven't traced through #1551 carefully enough yet to tell)?
My only question is if you think it's okay at that time (i.e. this line) to use the same resolver as before
Yes, ideally we should reuse as much as possible. New resolver means new tcp connections and new authentication roundtrips that should be avoided.
Basically, should the de-lazy use the session.Groups from the original vtx where the lazy ref got created, or the session.Group from the vtx where the de-lazy is occurring?
From where the de-lazy is occurring. Old may not be valid anymore.
I added this SessionAuthenticator https://github.com/moby/buildkit/blob/master/source/containerimage/pull.go#L127 concept that allows adding new sessions while reusing the resolver https://github.com/moby/buildkit/blob/master/source/containerimage/pull.go#L167 . Was actually thinking to take this further and write a custom containerd authenticator and resolver pool (limited resolver pool already exists for pull) that can remember tokens and reuse connections across builds (after checking auth). Something like resolverpool.Get(ref, scope ("pull/push"), g session.Group). (Maybe also resolver.WithSession(session.Group) resolver if that helps). That could help simplify this as well. Maybe I have time to work on this tomorrow.
Okay, for now all I did was make it so the fetcher gets reconstructed at the time of de-lazy.
But it is still using the resolver that is only ever updated w/ session.Groups from calls to CacheKey/Snapshot. So it will not be updated with the session.Groups from the vertex where actual de-lazy is occurring. I'm not sure what the best way to implement that would be as it seems like it would require plumbing the session.Groups for the current vtx all the way through to the provider implementation, where it can be used to update the resolver. I guess we could put the session group in the ctx, but it seems like that was something you were trying to avoid in the previous PR? Let me know what you think or if I'm misunderstanding.
solver/types.go (outdated):

```go
	"time"

	"github.com/containerd/containerd/content"
	"github.com/moby/buildkit/cache"
```
Lets try to avoid this dependency in the solver package.
Okay, I didn't like this either but felt a little better when I realized solver already had a transitive dependency on cache :)
The import is only needed for the Remote struct. I originally wanted to leave Remote in its original place in solver, but Remote is now needed in cache due to the GetRemote method moving there and trying to import solver into cache created an import loop. Moving Remote to cache worked, but what else would you suggest? I guess cache/remotecache seems like an obvious choice, but are you okay with solver having a direct import from that?
> I realized solver already had a transitive dependency on cache :)

`go list -json ./solver | grep cache` on master doesn't show a dependency on cache
I... must have lost my mind at some point in all of this or am misremembering the issue but yeah you're right. I'll try to switch Remote back to solver
cache/refs.go (outdated):

```go
}

func (sr *immutableRef) getProvider(ctx context.Context, dh *DescHandler) (content.Provider, error) {
	if isLazy, err := sr.isLazy(ctx); !isLazy {
```
doesn't this function need synchronization against parallel calls?
I previously looked through the underlying implementation of the content store methods it's calling and concluded the synchronization wasn't necessary here, but I'll go through it all again to double-check and get back
Okay yeah, I still believe this basically works as is. There's a few relevant pieces of code:
- We always use `contentutil.Copy` when de-lazying, which in turn uses a `FetchHandler`.
- The `FetchHandler` internally opens a content writer with a writer ref consistent for a given OCI descriptor. It also uses `content.OpenWriter`, which internally will check to see if there's already a writer open for that ref (and thus that descriptor) and, if so, wait until that writer is closed before opening it again.
- `fetch` will also always check to see if opening the writer or committing an already completed writer returns an `AlreadyExists` err, in which case it knows the content was already successfully committed by something else and returns a nil err. Same goes for the `content.Copy` func it calls internally.
So, given the above, if we check `isLazy` and it returns true, but by the time we actually try to de-lazy it's already been de-lazied by something else, then we should just get a nil error returned by `contentutil.Copy` and can move on successfully. If something else is in the middle of de-lazying, we'll block until that thing is done de-lazying and, if it succeeded, continue on, or if it failed, try de-lazying it again.
The only slightly weird corner case I can think of is if you loaded the same immutable ref twice but with different desc handlers, and then those two were both getting de-lazied at the exact same time: it's possible for you to technically pull it from one remote but then have the other cache ref win the race to setting the Description in the metadata, so it will be marked there as having been pulled from a different remote image ref. The description is mostly just informational, and if two instances of the same cache ref are being pulled at the same time it is not really significant which one's image ref ends up in the Description, but let me know if that matters or if there are other cases I'm not thinking of.
We should avoid the case where two parallel Unlazy requests make an extra HTTP request to the registry, even if some data is ignored later. Also, the locking in content.Writer isn't very well thought out: it doesn't handle cancellation of one request gracefully, and I believe it doesn't even release the lock until a time-based trigger completes. Usually flightcontrol.Do, which is used extensively in this repo, is a much better solution.
Ah sure, it should be easy to wrap in a flightcontrol.Do too, will do.
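For reference, the deduplication being discussed can be sketched with a tiny flightcontrol-style group. This is a simplified stand-in, not buildkit's actual flightcontrol package; completed calls are kept in the map on the assumption that a de-lazied blob stays local:

```go
package main

import (
	"fmt"
	"sync"
)

type call struct {
	done chan struct{}
	err  error
}

type group struct {
	mu    sync.Mutex
	calls map[string]*call
}

// do runs fn at most once per key; concurrent and later callers wait for
// and reuse the first result instead of issuing duplicate pulls.
func (g *group) do(key string, fn func() error) error {
	g.mu.Lock()
	if c, ok := g.calls[key]; ok {
		g.mu.Unlock()
		<-c.done // another caller is (or was) already de-lazying this digest
		return c.err
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.err = fn()
	close(c.done)
	return c.err
}

func main() {
	g := &group{calls: map[string]*call{}}
	pulls := 0
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.do("sha256:abc", func() error {
				pulls++ // only the single leader ever runs fn
				return nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("pulls:", pulls) // prints "pulls: 1"
}
```

A failed pull would be cached here too; real flightcontrol-style code would drop failed calls so a retry can re-attempt the fetch.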
exporter/containerimage/export.go (outdated):

```go
	eg, egctx := errgroup.WithContext(ctx)
	for _, desc := range remote.Descriptors {
		eg.Go(func() error {
			return contentutil.Copy(egctx, contentutil.HashOnlyIngester(), remote.Provider, desc)
```
I don't really get this HashOnlyIngester logic. As your comment says, just wrap the Applier so that it calls ReaderAt on the Provider (or calls Unlazy() etc.). This double hashing seems unnecessary and this construction is very hard to understand.
Just to be clear, the point wasn't just to take the hash, it was to read the content in full from remote.Provider which had the (magic) side effect of de-lazying it into the local content store (due to the implementation in the cache package). However, now that you mention "double hashing" I do realize that when we do the magic de-lazy to the local content store, we are already doing a hash-verification, so copying to a HashOnlyIngester is actually unnecessary and I can at minimum change this to copy to an ingester that straight up discards data it reads without even taking the hash. I'll do that if nothing else.
But I also agree this all kind of sucks, and the approach I suggested in the comment of just having the Applier use remote.Provider directly would be cleaner. However, I didn't see any way of doing so that didn't involve even more large-scale refactoring on top of the already large change here (happy to be wrong though). The problem is the actual diff.Applier interface only has this method:

```go
Apply(ctx context.Context, desc ocispec.Descriptor, mount []mount.Mount, opts ...ApplyOpt) (ocispec.Descriptor, error)
```
which doesn't include a content.Provider. The Applier implementation appears to internally use the content.Store it's hooked up to much earlier (i.e. here and here) as its content.Provider.
So the options I'm seeing besides what I already have are:
- Wrap just the `Apply` method with something that takes the `desc` argument, de-lazies it (using a similar technique as now, just w/out the double-hashing you mentioned) and then calls the wrapped `Apply` method.
  - I have no opposition to this, I just didn't do it originally because it seemed like basically the same as what I'm currently doing but hidden under an `Apply` wrapper. Totally fine with updating to that if you think it's clearer though.
- Change the original `Applier` to be constructed in the worker implementations with a special content store that can somehow be updated on the fly to use providers for given descriptors as set from the `remote.Provider` we obtain in the exporter here.
  - I'm not entirely sure how this would work, but the idea is appealing on some level. My main concern is that even if it's possible it would expand this PR way too much, but I could of course be overestimating and/or not seeing a simple path to implementing it.
Let me know if this makes any sense and what you think.
> Wrap just the Apply method with something that takes the desc argument, de-lazies it (using a similar technique as now, just w/out the double-hashing you mentioned) and then calls the wrapped Apply method.
I think that is the same that I had in mind.
`wrappedApplier(applier, remote/provider) Applier` implements `Apply` that, when called, invokes a copy to be made to the content store, and then just calls the regular applier that does the extraction from content store to snapshotter.
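That wrappedApplier idea might be sketched like this, with heavily simplified stand-ins for containerd's diff.Applier and the remote provider (the real interfaces take contexts, mounts, and OCI descriptors):

```go
package main

import "fmt"

type Descriptor struct{ Digest string }

// Applier stands in for containerd's diff.Applier, minus ctx/mounts/opts.
type Applier interface {
	Apply(desc Descriptor) error
}

// localApplier stands in for the applier wired to the local content store:
// it can only apply content that is already present locally.
type localApplier struct{ store map[string]bool }

func (a localApplier) Apply(desc Descriptor) error {
	if !a.store[desc.Digest] {
		return fmt.Errorf("content %s not found locally", desc.Digest)
	}
	fmt.Println("applied", desc.Digest)
	return nil
}

// unlazyingProvider stands in for the Remote's side-effecting provider:
// Unlazy pulls the blob into the local store.
type unlazyingProvider struct{ store map[string]bool }

func (p unlazyingProvider) Unlazy(desc Descriptor) error {
	p.store[desc.Digest] = true // simulate pull into local content store
	return nil
}

// wrappedApplier ensures the blob is local before delegating to the
// regular applier that extracts from content store to snapshotter.
type wrappedApplier struct {
	inner Applier
	prov  unlazyingProvider
}

func (w wrappedApplier) Apply(desc Descriptor) error {
	if err := w.prov.Unlazy(desc); err != nil {
		return err
	}
	return w.inner.Apply(desc)
}

func main() {
	store := map[string]bool{}
	w := wrappedApplier{inner: localApplier{store: store}, prov: unlazyingProvider{store: store}}
	if err := w.Apply(Descriptor{Digest: "sha256:abc"}); err != nil {
		fmt.Println("error:", err)
	}
}
```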
If this turns out problematic for some reason(eg. the current parallelization issue) then lets add explicit method to ref/remote for this and avoid magic constructions with side-effects.
Had some more thoughts; just to be super explicit about this since the code has kind of gone back and forth between different approaches here throughout the PR, here's what I'm thinking.
There's 3 places where a de-lazy may need to occur:
- During `ImmutableRef.Extract()`
- During a local export/unpack (such as the `Applier` case we were discussing)
- Sometimes during a push; specifically when a lazy ref is going to be pushed to a registry that doesn't contain the blob yet (but not when the registry already contains the blob in one or more of its repos)
There's also 2 general approaches for how the provider in the `Remote` returned by `ImmutableRef.GetRemote()` can be implemented:
- The current approach w/ magic side-effects, where the provider is implemented such that opening a reader for it will result first in the entire descriptor blob being pulled to the local content store and then a reader for the local content store being returned.
- An alternative, less magical approach where it simply checks to see if the ref is lazy and, if so, sets the provider to the one from the remote repo or, if not lazy, sets the local content store as the provider. This means that all callers of `GetRemote()` will have to check whether the descriptors in the returned remote are available locally before trying to use them.
  - Callers can be helped out with this via an `Unlazy` (or what I called before `EnsureContentExists`) method on the `Remote` or `ImmutableRef` object.
In theory, approach 2 is preferable from the standpoint of being less magical and side-effecty, but there's a fairly significant downside when you consider the de-lazies that occur during a push (place 3 in the list above). Namely, all of the logic that determines during a push whether or not the content will need to exist locally or if pull can be skipped is fairly deep in external code (e.g. one place is here where it's determined if you can do a cross-repo push). I'm guessing we can't modify that code to do the checks/calls required in approach 2, so the only other option would be to put all the logic that determines whether we can skip the local pull of content directly in the push util itself. That seems like it would most likely be a fairly large amount of copy/paste and hardcoding of behavior that should just be handled by the library we're calling out to.
So given that, I'm still inclined for approach 1 (though, like I said, I'll fix the whole double-hashing issue). However, I don't really see any reason why we couldn't do approach 1 but also add a util method to Remote called Unlazy that at least makes the code less confusing to read. I'll use remote.Unlazy in the wrapper around the Applier like we were talking about and also in the extract() code path. Internally, Unlazy will just do contentutil.Copy from the remote provider to an ingester that discards everything (without even checking hash) in order to trigger the magic side effect of de-lazying, but at least it will be wrapped in a more explicit method.
Let me know how that all sounds.
Internally, `Unlazy` will call `ref.GetRemote().Provider.ReaderAt(ctx, ref.Desc)` that will copy from the remote provider to the local content store?
Almost but not exactly, due to the issues I mentioned before I think we have to leave GetRemote and the provider it returns the same as it currently is, where when the provider gets opened it first copies from the remote provider to the local content store before returning a reader for the local content store (de-lazies as a side-effect of opening the reader).
If it weren't for the case of de-lazies occurring during pushes that I mentioned in the previous comment, I think we could do what you suggest (ref.GetRemote().Provider.ReaderAt(ctx,ref.Desc)) where the Remote returned by GetRemote is just the actual remote provider w/out any de-lazying as a side-effect, but like I said before I just don't know how to make that work with the push handler code that exists outside this repo (and thus doesn't know how to call Unlazy).
So, given that, if Unlazy is going to be a helper function, it would have to do something like contentutil.Copy(ctx, discardIngester, ref.GetRemote.Provider), where discardIngester is just the equivalent of ioutil.Discard but for ingesters.
However, if we make Unlazy be a method on Remote or ImmutableRef, we could probably make it not have to do the separate contentutil.Copy, it could be coded specially to just copy from the remote provider from DescHandler to the local content store. I think this is what I'd prefer since the contentutil.Copy is obviously sort of wasteful even if it's just writing to a discard ingester.
> If it weren't for the case of de-lazies occurring during pushes that I mentioned in the previous comment, I think we could do what you suggest (ref.GetRemote().Provider.ReaderAt(ctx,ref.Desc)) where the Remote returned by GetRemote is just the actual remote provider w/out any de-lazying as a side-effect,
I'm a bit confused about what causes the conflict between push and de-lazy on ref.GetRemote().Provider.ReaderAt. The example you linked https://github.com/containerd/containerd/blob/0f2b15b7af41896190fcb6e7b597be789c11756b/remotes/docker/pusher.go#L129 directly works with the descriptor and I don't see it requiring local data. So I don't think it would ever call ReaderAt if the descriptor exists or can be cross-pushed. It only calls it if it needs local data, and de-lazy is required then anyway.
Right, what I'm saying is that there are times where de-lazy is required for push and there are times where it isn't required for push. However, the determination of whether it's needed happens deep in external code. So, for buildkit's code, if we don't have the "de-lazy as a side-effect" approach, then we would be forced to just always de-lazy before a push because we don't know ahead of time whether it needs to be present locally. But if we did that we'd be breaking some of the crucial features of lazy refs, namely that you don't have to de-lazy when just re-tagging or doing a cross-repo push.
Of course the other option would be to have buildkit's code do the determination of whether a de-lazy is needed. I was ruling that out for now because it seems like that functionality (such as that cross-repo logic I linked to) isn't public in containerd push handler code and is fairly far from trivial. Let me know though if there's some easier way of determining it, that would obviously change this consideration.
We discussed this more on Slack, we were actually in agreement, I was just confused about one point. We’ll go with the side effect reader and Unlazy will use that internally, but it will just open the reader to trigger the side effect rather than copy to a discard ingester.
Updated with this change. I did make one superficial tweak to it, namely, instead of having Unlazy be a util func, I made an Unlazier interface and then callers can check to see if a remote's provider implements that interface, calling it if so. The only small improvement over Unlazy as a util func is that you can avoid having to open and immediately close a reader to the local content store, otherwise it's basically the same in my mind. It's a fairly minor change so if you disagree w/ the approach just let me know and I have zero issue switching to Unlazy as a util func.
source/containerimage/pull.go
Outdated
| } | ||
|
|
||
| if len(p.manifest.Remote.Descriptors) > 0 { | ||
| topDesc := p.manifest.Remote.Descriptors[len(p.manifest.Remote.Descriptors)-1] |
afaics, this should happen for all layers. Also, a blob digest doesn't need to have the same parents, so it doesn't look correct to assume a ref's parents come from the same remote (if only the top digest is tracked).
My thinking here (which I should probably add a comment for if nothing else) is that while it's true a parent desc could be in the chain of multiple refs and thus come from multiple remotes, it actually won't matter.
Say here you have a chain of descriptors from parent->child of A->B->C (topDesc is thus C), but in a different image you end up with just the chain A->B (where topDesc is thus B) and use an entirely different remote (and progress writer, etc.). If you end up creating lazy refs for both of these desc chains, say the A->B chain gets de-lazied first.
When you later get around to de-lazying the A->B->C chain, the implementation in cache realizes that A->B was already de-lazied by something else and should skip de-lazying those descriptors. It will end up just de-lazying what it needs to, C in this case. It thus shouldn't matter that there were some descriptors that had different remotes in different refs.
In addition, for a single ref, all the descriptors in the chain for that ref are going to have the same DescHandler values, so setting the DescHandler for each in the chain seemed unnecessary.
That's the theory anyways; you asked in some other comments about synchronization issues which I need to go look deeper at as obviously that's important to get right for all the above to work. Let me know if this makes sense or if I'm missing something.
There was a problem hiding this comment.
But what if you have A->B->C and A->D->C competing?
There was a problem hiding this comment.
I forgot that the digest of a layer descriptor doesn't include the digest of its parent (this is all still settling in my brain), but I still feel like it would work. If A->B->C and A->D->C are getting de-lazied at the same time, each of the individual descriptors will still get de-lazied just once (again, assuming the locking/synchronization is done correctly). It's possible to end up with a mix where, say, A and C got de-lazied using remote 1 and B got de-lazied using remote 2, but at the end of the day they are the same bytes no matter the remote and end up in the local content store, which is what matters, right? What am I missing still?
There was a problem hiding this comment.
A and C got de-lazied using remote 1 and B got de-lazied using remote 2, but at the end of the day they are the same bytes
I'm not following on the "but at the end of the day they are the same bytes". The case I'm thinking of is when there are 2 separate lazy refs (different bytes) that have the same top layer. Afaics when they both get an unlazy request, only one deschandler is used. So for a ref that was based on A->B->C but is pulled with a deschandler for A->D->C, there could be a problem.
There was a problem hiding this comment.
Right, makes total sense, I was still somehow getting hung up on the layer digest not including its parent digest, but I get it now. I'll change this to set the DescHandler for each desc in the chain, thanks for bearing with me
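The synchronization property this thread relies on, that each descriptor gets de-lazied at most once even when two chains race, can be sketched with a per-digest `sync.Once`. This is a simplified model with invented names, not buildkit's actual implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// unlazyDeduper ensures each blob digest is pulled exactly once, using
// whichever chain's pull function reaches it first.
type unlazyDeduper struct {
	mu    sync.Mutex
	once  map[string]*sync.Once
	pulls int
}

func (d *unlazyDeduper) unlazy(dgst string, pull func()) {
	d.mu.Lock()
	if d.once == nil {
		d.once = map[string]*sync.Once{}
	}
	o, ok := d.once[dgst]
	if !ok {
		o = &sync.Once{}
		d.once[dgst] = o
	}
	d.mu.Unlock()
	// sync.Once guarantees pull runs at most once per digest,
	// and concurrent callers block until it completes.
	o.Do(func() {
		d.mu.Lock()
		d.pulls++
		d.mu.Unlock()
		pull()
	})
}

func main() {
	d := &unlazyDeduper{}
	var wg sync.WaitGroup
	// Two competing chains, A->B->C and A->D->C, de-lazied concurrently.
	for _, chain := range [][]string{{"A", "B", "C"}, {"A", "D", "C"}} {
		chain := chain
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, dgst := range chain {
				d.unlazy(dgst, func() {})
			}
		}()
	}
	wg.Wait()
	fmt.Println(d.pulls) // 4: A, B, C, D each pulled exactly once
}
```

Note this model also shows the caveat raised above: which remote "wins" for a shared digest is whichever chain got there first, which is fine only if both handlers fetch the same bytes for that digest.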
Another thing I noticed is that layers are pulled one-by-one now, not in parallel like before.
Cool, let me know what you're seeing with the progress bar so I can try to reproduce it. And no worries, this is a large enough PR that it's kinda hard to miss :)
Interesting, besides the integ tests I've been running a simple test case to try out the progress bars, and it looks like a two-layer image shows the layers at least starting in parallel. But maybe that's what you're seeing and they actually always download one-by-one? I'll try with a more normal image with more layers to see if I can reproduce. In theory the layer pulls should be in parallel due to the waitgroup here, but something could be going wrong.
I think the case I was testing was when it was pulling while exporting. Didn't realize it was a different code path. Also, cross-repo push didn't seem to work for me (where it can move layers between repositories without pulling). Do you know if annotations get lost for the descriptors? I can look more into this myself. Not a blocker for the PR.
Looked into this; I think the issue is that the push util uses the local content store to find any distribution source annotations. Obviously if the ref is lazy this will not work. What can be done is somewhere around this line we could use the desc handler's
3 above feels like the only good option, but let me know if I'm missing anything else obvious. Either way, I will concentrate first on fixing the rest of the issues in this PR and leave this one for a follow-up if there aren't any straightforward immediate fixes.
Actually, not much of

Most recent update was just a rebase on main, no actual changes.
Comparing the output against the old one it looks like:
Edit: btw, it is not a big issue if schema1 images are just not lazy. Schema1 is not a priority, so whatever is easiest works here.
Good catch. After looking at the schema1 converter code in more depth, I saw that it's basically hardcoded internally to pull the layers and relies on those being present in order to do the conversion. So I went with your suggestion of just letting schema1 not be lazy for now. I did, however, update the code to show pull progress for that case. The tty progress for schema1 looks good, but I noticed that occasionally in the plain progress for schema1 there will be final layer updates after the resolve completes. I verified that in that case the layer status is actually getting written to the progress stream before the resolve status, so it's not a synchronization issue at that level. My only guess is that the status updates are getting batched together and then written to the progress output in an arbitrary order? Let me know if you think this matters.
Updated to per-extraction; it's a very easy thing to switch. The only downside I see with per-extraction relative to per-layer is that, because the de-lazy now happens within the context of extraction, the time the extract takes includes the time de-lazying the whole image takes. I don't have strong feelings either way though.
tonistiigi
left a comment
There was a problem hiding this comment.
The only downside I see with per-extraction relative to per-layer is that now because the de-lazy happens within the context of extraction, the time the extract takes includes the time de-lazying the whole image takes, i.e
I think this might be confusing, so it's better to revert it. Or split the unlazy pull and the extract, although I guess the new one is slightly more performant. We can make a change on the client side to sum the durations of separate items with the same array, and then reintroduce the single-line unpack. Then we could start, stop, and continue the unpack progress.
LGTM
@thaJeztah Not sure how to merge this so it doesn't cause trouble for the moby release. I think porting this will take somewhat significant work and additional changes. We might need to make a special (temp) branch for 20.x if we need some fixes because I don't want to cut v0.8 without this PR as well.
source/containerimage/pull.go (outdated)

```go
})
	Digest: desc.Digest,
}
if platform != nil {
```
There was a problem hiding this comment.
This is a bit confusing to me. I don't think there is a case where the platform is allowed to be nil.
There was a problem hiding this comment.
Updated to have platform not be a pointer
ping @hinshun

I also tested this in combination with #1622; the extra requests I saw before were gone after that, so these issues were not related to this PR.
Sounds good, for now I just reverted to what I had before with per-layer extraction status.
Thanks for your guidance throughout this whole PR, appreciate it! Let me know if there are any other follow-ups I can create issues for and/or look into. Right now the ones I'm aware of are 1) improved integration w/
Is this the part that was reverted, or is this still a concern?
cache/blobs.go (outdated)

```go
var currentDescr ocispec.Descriptor
if sr.parent != nil {
	eg.Go(func() error {
		err := computeBlobChain(ctx, sr.parent, createIfNeeded, compressionType)
```
There was a problem hiding this comment.
nit, this can just be return computeBlobChain right?
cache/blobs.go (outdated)

```go
}

var descr ocispec.Descriptor
var err error
```
There was a problem hiding this comment.
nit, can you move these two closer to CompareWithParent?
cache/blobs.go (outdated)

```go
if descr.Digest == "" {
	// reference needs to be committed
	var lower []mount.Mount
	var release func() error
```
There was a problem hiding this comment.
Seems to be overwritten by line 110.
There was a problem hiding this comment.
Yeah it's a bit confusing; I moved the declaration of release as a var to inside the `if parent != nil` below to make it a bit clearer.
```go
}

// setBlob associates a blob with the cache record.
// A lease must be held for the blob when calling this function
```
There was a problem hiding this comment.
Do we need to do a leases.FromContext(ctx) check in this function?
There was a problem hiding this comment.
Sure may as well, added
```diff
 blobChainID = imagespecidentity.ChainID([]digest.Digest{pInfo.BlobChainID, blobChainID})
-p2, err := cm.Get(ctx, parent.ID(), NoUpdateLastUsed)
+p2, err := cm.Get(ctx, parent.ID(), NoUpdateLastUsed, descHandlers)
```
There was a problem hiding this comment.
Could descHandlers be nil at this point? Looking at line 113, it can be nil if desc.Digest is found in the content store. Is a nil option okay here?
There was a problem hiding this comment.
Yeah nil option will be okay, it gets handled in this loop: https://github.com/sipsma/buildkit/blob/43e079f01768896f6f39ff30b438e628be2cb744/cache/opts.go#L21
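For context on why a nil option is harmless there, here is a hedged sketch of the option-applying loop pattern (a `RefOption` that is a plain `interface{}`, with consumers type-switching to apply the values they understand). The option types below are invented for illustration; buildkit's real option types differ:

```go
package main

import "fmt"

// RefOption mirrors the pattern described above: options are plain
// interface{} values, and consumers cast before applying them.
type RefOption interface{}

// DescHandlers is a stand-in for the real digest->handler map.
type DescHandlers map[string]string

type refOptions struct {
	descHandlers     DescHandlers
	noUpdateLastUsed bool
}

// applyOpts walks the options; a nil entry matches no case and
// falls through harmlessly, which is why passing nil is okay.
func applyOpts(opts ...RefOption) refOptions {
	var out refOptions
	for _, opt := range opts {
		switch o := opt.(type) {
		case DescHandlers:
			out.descHandlers = o
		case bool: // stand-in for a sentinel like NoUpdateLastUsed
			out.noUpdateLastUsed = o
		}
	}
	return out
}

func main() {
	var dhs DescHandlers // nil map: still fine to pass and later read from
	got := applyOpts(dhs, nil, true)
	fmt.Println(got.noUpdateLastUsed, len(got.descHandlers)) // true 0
}
```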
cache/refs.go (outdated)

```go
} else if err != nil {
	return false, err
}
return false, nil
```
There was a problem hiding this comment.
nit, this can be simplified to return false, err
cache/refs.go (outdated)

```go
createdAt := GetCreatedAt(sr.md)
if !createdAt.IsZero() {
	if createdAt, err := createdAt.MarshalText(); err == nil {
		desc.Annotations["buildkit/createdat"] = string(createdAt)
```
There was a problem hiding this comment.
nit, seems more intuitive to check err != nil and return ocispec.Descriptor{}, err and then remove the else block to set the annotation.
```go
if err != nil {
	return nil, err
}
dh := dhs[desc.Digest]
```
There was a problem hiding this comment.
Yeah, but similar to before, you can read from a nil map (you just always get a zeroed value), which gives us the expected behavior in this case.
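The nil-map behavior being relied on here is a Go language guarantee, shown in a tiny standalone example (the digest key is just a placeholder):

```go
package main

import "fmt"

// Reading from a nil map in Go is safe: it returns the zero value for
// the element type, which is why an unset DescHandlers map can be
// indexed without a nil guard.
func main() {
	var dhs map[string]*string // nil map, like an absent DescHandlers
	dh := dhs["sha256:abc"]    // no panic; dh is the zero value, nil
	fmt.Println(dh == nil)     // true
}
```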
source/containerimage/pull.go (outdated)

```diff
-func (p *puller) CacheKey(ctx context.Context, g session.Group, index int) (string, bool, error) {
+func (p *puller) CacheKey(ctx context.Context, g session.Group, index int) (_ string, cacheOpts solver.CacheOpts, _ bool, err error) {
```
There was a problem hiding this comment.
I think you should name these just so we can tell what the return values mean.
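For readers unfamiliar with the suggestion: naming Go return values documents what each position in a multi-valued signature means. A toy example with invented names (the real signature lives in source/containerimage/pull.go):

```go
package main

import "fmt"

// cacheKey shows named return values: the declaration itself tells the
// reader which string and which bool is which, unlike (string, bool, error).
// The names and values here are illustrative only.
func cacheKey() (key string, cacheable bool, err error) {
	key = "sha256:abc"
	cacheable = true
	return key, cacheable, err
}

func main() {
	k, ok, err := cacheKey()
	fmt.Println(k, ok, err)
}
```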
util/pull/pull.go (outdated)

```diff
 }
-if desc.Digest != "" {
+if desc.Digest != "" && func() error {
```
There was a problem hiding this comment.
This is pretty rough to read, can we extract this into a function?
source/containerimage/pull.go (outdated)

```go
descHandler := &cache.DescHandler{
	Provider: p.manifest.Remote.Provider,
	ImageRef: p.manifest.Ref,
```
There was a problem hiding this comment.
Was looking at this for some resolver changes with session.Group and wondered if we really need ImageRef at all. Afaics this is used in 2 places: the cross-repo annotation and the unlazy description message. For both, I think we should just pass them with GetByBlob and keep them in metadata?
There was a problem hiding this comment.
My original concern with that was the case where there's multiple images identical in terms of content but using different image refs. In theory, we could store a list of ImageRefs for the blob in the metadata, which I think would work fine with the cross-repo stuff since you can add multiple distribution source annotations.
The only part I'm not sure about is the unlazy description message; if we have a list of ImageRefs we wouldn't know which one is actually associated with the provider, so if we just chose one it might not actually be the repo we're really pulling it from. Do you think that matters? AFAIK that description message isn't important for functionality, though it would be odd for it to say it pulled the ref from a place it didn't actually do the pull from.
There was a problem hiding this comment.
Do you think that matters?
I don't think it does. The same behavior already exists today. If you pull the same image from 2 refs, one of them will win and the other will be skipped as already existing. When you are pulling, you can't control which side your current pull will land on.
There was a problem hiding this comment.
Okay cool, I made the update to remove ImageRef as a separate commit because it was a little more involved than I originally thought, let me know what you think.
This allows the layers of images to only be pulled if/once they are actually required. Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
hinshun
left a comment
There was a problem hiding this comment.
Looks good to me. Well done 👏
🎉

Thanks again! Appreciate all the reviews and glad I could help out with it. Gonna start taking a look at #1431 now...
Thanks @sipsma! @tonistiigi demoed this feature yesterday, and it's really cool.

Where are the docs for this feature?
Closes #870
This update allows the cache manager's `GetByBlob` to optionally be called with a descriptor that does not yet exist in the local content store. When such a descriptor is loaded, the corresponding `cache.Manager` method needs to be called with a `cache.DescHandlers` object as a `RefOption`. `DescHandlers` specifies the provider, progress writer, and image ref to use for a given ref in the event it needs to be unlazied (i.e. during an extract or while preparing a mount).

`DescHandlers` get associated with a vertex via a new field `Opts` of type `solver.CacheOpts` in the `solver.CacheMap` object, which can optionally be set by vertex implementations. `CacheOpts` allows associating arbitrary keys (wrapped in a generic `interface{}`) with arbitrary values. These values can be accessed through `solver.CacheOptGetterOf(ctx)`, which is a function that accepts a list of keys and returns a map of those keys to their associated values. This function searches "upwards" in the solver graph (traversing parents and ancestors via a DFS) until it finds a vertex that has set the associated key.

Currently, `DescHandlers` is the only use case for these `CacheOpts` (they get set in `source/containerimage/pull.go`), but the intention is for them to be generally useful in case more are needed in the future. When a lazy ref needs to be loaded, the caller gets a typed error (`cache.NeedsRemoteProvidersError`) and can then use `solver.CacheOptGetterOf(ctx)` to find the `DescHandler` for each of the lazy blobs.

The change also includes updates to take advantage of laziness during pushes, so that remote blobs are only pulled locally when strictly necessary, and not during, for example, a solve that simply updates the tag of a remote image or a push that can take advantage of a registry's support for cross-repo pushing. This is accomplished by moving the `GetRemote` method from `worker` directly to `ImmutableRef`. If `GetRemote` is called on a lazy ref, the `Remote`'s provider is set up such that if it's opened for reading, the ref gets unlazied (that is, pulled into the local content store). This results in the blobs staying lazy unless they actually need to be read by the push handlers during export.
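The "unlazy on read" behavior described above can be sketched as a provider whose reader triggers the pull as a side effect on first open, so push handlers that never read a blob (e.g. a cross-repo mount or tag-only solve) leave the ref lazy. All names here are illustrative stand-ins, not buildkit's actual types:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// unlazyProvider pulls its blob into a local store stand-in the first
// time the blob is opened for reading; later reads hit the local copy.
type unlazyProvider struct {
	pull  func() []byte // fetches blob bytes from the remote (assumed hook)
	local []byte        // local content store stand-in
}

func (p *unlazyProvider) ReaderAt() io.Reader {
	if p.local == nil {
		p.local = p.pull() // side effect: unlazy on first read
	}
	return bytes.NewReader(p.local)
}

func main() {
	pulls := 0
	p := &unlazyProvider{pull: func() []byte { pulls++; return []byte("layer") }}
	io.Copy(io.Discard, p.ReaderAt())
	io.Copy(io.Discard, p.ReaderAt())
	fmt.Println(pulls) // 1: the blob was pulled once, on first read
}
```

A push path that only needs the descriptor (digest and size) never calls `ReaderAt`, so under this model the pull count stays at zero, which is the whole point of keeping refs lazy through export.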