Skip to content

Conversation

@jedevc
Copy link
Contributor

@jedevc jedevc commented Jul 24, 2023

Follows up from #3400, #3868, #5275, and possibly more that I'm missing.

Previously, we would return the first non-404 error from a host.

This is logical, however, it can result in confusing errors to the user:

  • e.g. we have an HTTP host, and an HTTPS host.

    If the image does not exist, we return "http: server gave HTTP response to HTTPS client". This is technically correct, however, the user is easily confused - the most relevant error in this case is the 404 error.

  • e.g. we have a broken HTTP host that returns 5XX errors, and a HTTP host with authentication.

    On the request for an image, we return the 5XX error directly. However, we have a host later on which returned an authentication error which is now hidden from the user.

Note: this can sometimes be resolved by changing the order of hosts passed in, however this requires 1. knowing ahead of time which hosts are going to return certain errors and 2. this is often not desirable, we'd prefer to use HTTPS if it's available, and only then fallback to HTTP.

To resolve this, we assign each possible error during resolution a "priority" that marks how far through the image resolution process a host/path combo got. Then we return the error with the highest priority, which is much more likely to be the most relevant error to the user.

The ranking of priority then is (from lowest to highest):

  • Underlying transport errors (TLS, TCP, etc)
  • 404 errors
  • Other 4XX/5XX errors
  • Manifest rejection (due to max size exceeded)

We've encountered this in BuildKit (moby/buildkit#3382 (comment), and maybe in docker/buildx#1642), I've reproduced the scenario we hit issues in and created a test case for it as well.

Signed-off-by: Justin Chadwell <me@jedevc.com>
@k8s-ci-robot
Copy link

Hi @jedevc. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mikebrow
Copy link
Member

/ok-to-test

jedevc added 2 commits July 26, 2023 11:33
Previously, we would return the first non-404 error from a host.

This is logical, however, it can result in confusing errors to the
user:

- e.g. we have an HTTP host, and an HTTPS host.

  If the image does not exist, we return "http: server gave HTTP
  response to HTTPS client". This is technically correct, however, the
  user is easily confused - the most relevant error in this case is the
  404 error.

- e.g. we have a broken HTTP host that returns 5XX errors, and a HTTP
  host with authentication.

  On the request for an image, we return the 5XX error directly.
  However, we have a host later on which returned an authentication
  error which is now hidden from the user.

Note: this *can* be resolved by changing the order of hosts passed in,
however this requires 1. knowing ahead of time which hosts are going to
return certain errors and 2. this is often not desirable, we'd prefer
to use HTTPS if it's available, and only then fallback to HTTP.

To resolve this, we assign each possible error during resolution a
"priority" that marks how far through the image resolution process a
host/path combo got. Then we return the error with the highest priority,
which is much more likely to be the most relevant error to the user.

The ranking of priority then is (from lowest to highest):
- Underlying transport errors (TLS, TCP, etc)
- 404 errors
- Other 4XX/5XX errors
- Manifest rejection (due to max size exceeded)

Signed-off-by: Justin Chadwell <me@jedevc.com>
Signed-off-by: Justin Chadwell <me@jedevc.com>
@dmcgowan dmcgowan requested review from estesp and mikebrow October 3, 2023 14:04
@dmcgowan
Copy link
Member

dmcgowan commented Oct 3, 2023

This one looks good to go! Sorry for the late response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants