-
Notifications
You must be signed in to change notification settings - Fork 18.9k
containerd integration: imageservice: add contexts to various methods (step 1) #43939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2c7339b to
b477634
Compare
c6bb210 to
eeb19b2
Compare
|
whoop; at least this one is green; I updated #43818 based on this PR (to start with the minimum amount of changes for that PR to work) |
corhere
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plumbing contexts through a function inevitably makes it possible for an operation to fail for a completely benign reason: the context was canceled. A function could misbehave if it is not prepared for the possibility that context-accepting function calls could be canceled. Best-case scenario it returns an inaccurate error if a context is canceled. Worst-case scenario is a timing-dependent bug which occurs only if the context is canceled at a specific point in the operation's execution. Given the risk of introducing bugs which could be really troublesome to debug—or consistently reproduce—I think we have to make sure all functions are cancelation-aware when we plumb contexts through them. Since Go convention is to signal that a function call was canceled by returning the ctx.Err() value, cancelation-awareness is necessarily an aspect of error handling. So I think we have to plumb error-handling up into the caller at the same time as we plumb contexts down into a callee function.
daemon/images/image.go
Outdated
| func (i *ImageService) manifestMatchesPlatform(ctx context.Context, img *image.Image, platform specs.Platform) bool { | ||
| logger := logrus.WithField("image", img.ID).WithField("desiredPlatform", platforms.Format(platform)) | ||
|
|
||
| ls, leaseErr := i.leases.ListResources(context.TODO(), leases.Lease{ID: imageKey(img.ID().Digest())}) | ||
| ls, leaseErr := i.leases.ListResources(ctx, leases.Lease{ID: imageKey(img.ID().Digest())}) | ||
| if leaseErr != nil { | ||
| logger.WithError(leaseErr).Error("Error looking up image leases") | ||
| return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
manifestMatchesPlatform returning false could mean the manifest does not match the platform, or it could mean ListResources returned an error. The caller can't tell them apart, so could erroneously proceed believing that it had been given accurate information when in fact the context happened to be canceled at an inopportune time. This kind of ambiguity has always existed, but was much less likely to be an issue on a healthy system before contexts were plumbed through. This is why I think we need to also plumb errors through whenever we plumb contexts through a function.
func (i *ImageService) manifestMatchesPlatform(ctx context.Context, img *image.Image, platform specs.Platform) (bool, error)
daemon/create.go
Outdated
|
|
||
| if opts.params.Platform == nil && opts.params.Config.Image != "" { | ||
| if img, _ := daemon.imageService.GetImage(opts.params.Config.Image, opts.params.Platform); img != nil { | ||
| if img, _ := daemon.imageService.GetImage(ctx, opts.params.Config.Image, opts.params.Platform); img != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignoring an error-return value on a function which takes a context argument could potentially lead to incorrect behaviour if the caller is unaware the function call was canceled. In this particular case, the image platform compatibility check could be skipped if the context is canceled, which could result in a container erroneously being created when the request should have been rejected.
When plumbing contexts through a function, at which stage do we want to audit the error handling? When a PR adds a context argument to a function's signature, should we audit the body of the function whose signature is modified (e.g. GetImage) or should we audit all the call sites of that function (e.g. the line this comment is attached to)? I personally would prefer the latter as the code won't compile until every call site of the plumbed-in function is touched, so the diff naturally highlights every call site which needs to be audited.
|
^^ also discussing this in the maintainers meeting; @corhere had some ideas on different approach on handling the context, and context-cancellation, which may be a good topic for our containerd-integration call ./cc @bsousaa @rumpl I think next meeting is on a "whaleness" day, but perhaps it would be an idea to write down some approaches so that we can decide which one best fits our situation best. |
|
A function must be prepared to handle the cancelation of any function calls it passes a (potentially) cancelable context into. As a corollary, functions for which it can be statically proven that under no circumstances can a cancelable context be passed into a function call do not need to be cancelation-aware. Therefore:
I think it is possible to to plumb contexts through iteratively without getting overwhelmed or missing anything. The general idea is to always maintain some invariants while refactoring which do not require having to recursively audit all the callers or callees of a refactored function. Here's what I have come up with:
Refactoring could proceed iteratively, touching a chosen subset of the codebase at a time while maintaining the invariants. When a context argument is added to a function, all its call sites need only be modified to pass in The major flaw in the iterative approach is that values cannot be propagated across the |
972db81 to
d5c25c3
Compare
d5c25c3 to
2d59f8c
Compare
|
@rumpl looks like this may be related 🤔 |
| _, err := c.imageBackend.GetImage(ctx, spec.Image, imagetypes.GetImageOpts{}) | ||
| if err == nil { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| _, err := c.imageBackend.GetImage(ctx, spec.Image, imagetypes.GetImageOpts{}) | |
| if err == nil { | |
| return nil | |
| } | |
| _, err := c.imageBackend.GetImage(ctx, spec.Image, imagetypes.GetImageOpts{}) | |
| if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) { | |
| return err | |
| } | |
| if err == nil { | |
| return nil | |
| } |
2d59f8c to
d1a6d70
Compare
This one looks to be failing still (on
|
9a410ab to
d1a6d70
Compare
daemon/images/image_pull.go
Outdated
| } | ||
| if err != nil && img == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the other special cases where err != nil && img != nil? Why is silently ignoring the error the correct behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, fixed
|
#44091 was merged, so this PR can be rebased (to drop the first commit) |
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
6ba017b to
33d04c1
Compare
Done |
33d04c1 to
89350f9
Compare
thaJeztah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| if img != nil { | ||
| p := maximumSpec() | ||
| imgPlat := v1.Platform{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but (also related to the other question about nil error and nil image), perhaps we should look at GetImage to return a sentinel error in some of these conditions.
corhere
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! ![]()
daemon/images/image_pull.go
Outdated
| } else { | ||
| if err != nil { | ||
| return err | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
| } else { | |
| if err != nil { | |
| return err | |
| } | |
| } | |
| } else if err != nil { | |
| return err | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOL... I wanted to make that comment, and thought "let it be" 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised none of the linters complained!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to enable more linters; looks like a bunch were deprecated, and there's a couple of interesting new ones; was testing some of them already on our code.
Not sure if any of them picked up this one though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m very good at dodging linters
Signed-off-by: Sebastiaan van Stijn <github@gone.nl> Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
89350f9 to
779a5b3
Compare
Uh oh!
There was an error while loading. Please reload this page.