Skip to content

fix dropped pull progress output due to canceled context#2254

Merged
tonistiigi merged 1 commit intomoby:masterfrom
coryb:issue-2248
Jul 15, 2021
Merged

fix dropped pull progress output due to canceled context#2254
tonistiigi merged 1 commit intomoby:masterfrom
coryb:issue-2248

Conversation

@coryb
Copy link
Copy Markdown
Collaborator

@coryb coryb commented Jul 14, 2021

fixes #2248

fixes moby#2248

Signed-off-by: coryb <cbennett@netflix.com>

// progressFactory needs the outer context, the context in `p.g.Do` will
// be canceled before the progress output is complete
progressFactory := progress.FromContext(ctx)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at FromContext implementation it only uses context for the ctx.Value(contextKey) call and later uses another context coming with the call. So why does it matter if this context gets canceled. Also, it is possible that this context also gets canceled before the blobs are actually pulled.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it was not obvious to me either.

The progress factory is not used until later when we call the controller Start

c.writer, _, _ = c.WriterFactory(ctx)

Start is eventually called here when we need to Unlazy the ref:

ctx, stopProgress = p.dh.Progress.Start(ctx)

By the time we call Start, the original context from the p.g.Do has already been canceled, the new writer immediately gets ignored via:
case <-pr.ctx.Done():
return

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in slack, it seems the reader is what is getting associated with the canceled context via the flightcontrol on g.Do:

ctx := newContext(c) // newSharedContext
pr, pctx, closeProgressWriter := progress.NewContext(context.Background())
c.progressCtx = pctx
c.ctx = ctx
c.closeProgressWriter = closeProgressWriter

So this fix is really about using the reader associated with the CacheKey ctx

@tonistiigi tonistiigi added this to the v0.9.0 milestone Jul 14, 2021
Copy link
Copy Markdown
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed in slack, there might be a case where progress gets lost in here if there are parallel calls to this group as previously the magic in flightcontrol would have cached it and sent to all callers while now progress from only one caller is used. When I tested it I couldn't repro on the default case and when pulling the same image both still showed progress. Might be different if only a layer is shared.

So I'm ok with this as a fix but might make sense to look into it further sometime to figure out if there is something more to change.

@coryb
Copy link
Copy Markdown
Collaborator Author

coryb commented Jul 14, 2021

yeah, I am not sure what else we could do here. I feel like the progress system generally needs a redesign to remove the need for progress.Controllers, but at the moment I don't know what that design would look like. I will give it some thought, but unlikely to have much time in the near future.

Copy link
Copy Markdown
Collaborator

@sipsma sipsma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

yeah, I am not sure what else we could do here. I feel like the progress system generally needs a redesign to remove the need for progress.Controllers, but at the moment I don't know what that design would look like. I will give it some thought, but unlikely to have much time in the near future.

I agree, progress.Controller was very much kludge piled on top of the existing progress code when I was a lot less familiar with the codebase and trying to get lazy pull working.

I think all of the progress code might be easier to follow if it was replaced by something like a singleton ProgressManager object that centralized the progress stream handling (similar to how there's a singleton CacheManager, Solver, Worker, etc). If it had sort of a pub-sub style API you could decouple the progress stream for a single build from the "event publishers" and thus maybe get rid of all the confusing context handling, the need for callbacks inside descriptor handlers, etc.

However, I say all this as another person who will not have time for it in the near future... @tonistiigi Do you think we should open an issue like "Refactor progress streams"? Happy to do so if you are okay with a somewhat vague issue like that.

@tonistiigi tonistiigi merged commit a181832 into moby:master Jul 15, 2021
@tonistiigi
Copy link
Copy Markdown
Member

Do you think we should open an issue like "Refactor progress streams"?

yes, but try to put some description of what is actually wrong with the current one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[0.9] blob download progress broken

3 participants