Avoid stat round-trips when fetching a blob by aaronlehmann · Pull Request #1226 · distribution/distribution

aaronlehmann · 2015-12-01T23:30:26Z

Without this commit, three round-trips are required to fetch a blob with
a progress bar. The first is a call to Stat (HEAD request), to determine
the size. Then Open is called, which also calls Stat, and finally
performs a GET request.

Only the GET request is actually needed. The size of the blob can be
sniffed from Content-Length in the GET response.

This commit changes HTTPReadSeeker to automatically detect the size from
Content-Length instead of requiring it to be passed in. The Stat call is
removed from Open because it is no longer necessary.

HTTPReadSeeker now takes an additional errorHandler callback argument which
translates an unsuccessful HTTP response into an appropriate API-level
error. Using a callback for this makes it possible to avoid leaking the
repsonse body to Read's caller, which would make lifecycle management
problematic.

Fixes #1223

stevvooe · 2015-12-02T00:30:17Z

registry/client/transport/http_reader.go

I really don't like during the close as a condition of calling Error. Is there any reason we can't tie the body lifecycle to the lifecycle of the readseeker?

I don't like it much either.

The problem is that there may be many HTTP requests created by each HTTPReadSeeker. Each call to Read can potentially cause a new HTTP request, whose response body needs to be closed. That's why I was associating them with the error object.

How about keeping a slice in the httpReadSeeker struct that tracks every response body that was associated with an error condition, and closing all of those when the Close method is called?

@aaronlehmann This is how it was before. Actually, there was only one request open at any time. If it was still open on the Close, it would have the body closed, as well.

I guess the question here is whether it's reasonable to close the body on the next Read call. For example, should we allow code that does this:

for ... { _, err := readSeekCloser.Read(p) if err != nil { if httpError, ok := err.(*transport.HTTPError); ok { go doSomethingAsyncWithTheBody(httpError.Resp.Body) } } }

This would be unsafe in the model where we only allow one open response body a time. But it seems like a weird case, and thinking about it some more, breaking this kind of code is probably better than potentially keeping many sockets open until the HTTPReadSeekCloser is closed, potentially hitting resource limits in the mean time.

So I'll move the body closing from Error to Close/reader and keep at most one response body open at a time, if you think that's reasonable.

Updated to close the Body of a response to an unsuccessful request on the next call to Read or Seek (before issuing a new request), or Close.

Updated again to avoid exposing the response through an error return. Instead, the HTTPReadSeeker takes a callback function that generates errors.

stevvooe · 2015-12-02T22:04:33Z

registry/client/transport/http_reader.go

Can this statement be in (*httpReadSeeker).Seek? I think we can avoid having both seekOffset and offset.

Let's add a comment on the importance of this statement.

Without this commit, three round-trips are required to fetch a blob with a progress bar. The first is a call to Stat (HEAD request), to determine the size. Then Open is called, which also calls Stat, and finally performs a GET request. Only the GET request is actually needed. The size of the blob can be sniffed from Content-Length in the GET response. This commit changes HTTPReadSeeker to automatically detect the size from Content-Length instead of requiring it to be passed in. The Stat call is removed from Open because it is no longer necessary. HTTPReadSeeker now takes an additional errorHandler callback argument which translates an unsuccessful HTTP response into an appropriate API-level error. Using a callback for this makes it possible to avoid leaking the repsonse body to Read's caller, which would make lifecycle management problematic. Fixes distribution#1223 Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

stevvooe · 2015-12-02T22:24:44Z

LGTM

dmcgowan · 2015-12-02T22:39:04Z

LGTM

Avoid stat round-trips when fetching a blob

We were calling Stat for each layer to get the size so we could indicate progress, but distribution/distribution#1226 made it possible to get the length from the GET request that Open initiates. Saving one round-trip per layer should make pull operations slightly faster and more robust. Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

This calls Stat before Open, which should be unnecessary because Open can handle the case of a nonexistent blob. Removing the Stat saves a round trip. This is similar to the removal of stat in Open in distribution#1226. Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

GordonTheTurtle added group/distribution labels Dec 1, 2015

aaronlehmann mentioned this pull request Dec 1, 2015

Expose blob size through BlobProvider's Open method #1223

Closed

stevvooe reviewed Dec 2, 2015
View reviewed changes

aaronlehmann force-pushed the avoid-stat-roundtrips branch 2 times, most recently from cc03f58 to 8a12c63 Compare December 2, 2015 21:45

stevvooe reviewed Dec 2, 2015
View reviewed changes

aaronlehmann force-pushed the avoid-stat-roundtrips branch from 8a12c63 to 6beeb93 Compare December 2, 2015 22:21

stevvooe added a commit that referenced this pull request Dec 3, 2015

Merge pull request #1226 from aaronlehmann/avoid-stat-roundtrips

2b63f65

Avoid stat round-trips when fetching a blob

stevvooe merged commit 2b63f65 into distribution:master Dec 3, 2015

tonistiigi mentioned this pull request Dec 3, 2015

Vendor distribution and fix inspect by sha256 prefix moby/moby#18409

Merged

aaronlehmann mentioned this pull request Dec 4, 2015

Avoid a HEAD request for each layer in a v2 pull moby/moby#18418

Merged

aaronlehmann mentioned this pull request Dec 11, 2015

Remove unnecessary stat from blob Get method #1257

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid stat round-trips when fetching a blob#1226

Avoid stat round-trips when fetching a blob#1226
stevvooe merged 1 commit intodistribution:masterfrom
aaronlehmann:avoid-stat-roundtrips

aaronlehmann commented Dec 1, 2015

Uh oh!

stevvooe Dec 2, 2015

Uh oh!

aaronlehmann Dec 2, 2015

Uh oh!

stevvooe Dec 2, 2015

Uh oh!

aaronlehmann Dec 2, 2015

Uh oh!

aaronlehmann Dec 2, 2015

Uh oh!

aaronlehmann Dec 2, 2015

Uh oh!

stevvooe Dec 2, 2015

Uh oh!

stevvooe Dec 2, 2015

Uh oh!

stevvooe commented Dec 2, 2015

Uh oh!

dmcgowan commented Dec 2, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

aaronlehmann commented Dec 1, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stevvooe commented Dec 2, 2015

Uh oh!

dmcgowan commented Dec 2, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants