Skip to content

fit get status issue#3755

Merged
milosgajdos merged 1 commit intodistribution:mainfrom
wy65701436:fix-getstatus
Oct 28, 2022
Merged

fit get status issue#3755
milosgajdos merged 1 commit intodistribution:mainfrom
wy65701436:fix-getstatus

Conversation

@wy65701436
Copy link
Copy Markdown
Collaborator

@wy65701436 wy65701436 commented Oct 23, 2022

1, return the right upload offset for client when asks.
2, do not call ResumeBlobUpload on getting status.
3, return 416 rather than 404 on failed to patch chunk blob.
4, add the missing upload close

Signed-off-by: Wang Yan wangyan@vmware.com

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Oct 23, 2022

Codecov Report

Base: 57.06% // Head: 57.05% // Decreases project coverage by -0.00% ⚠️

Coverage data is based on head (1c38c74) compared to base (fb21888).
Patch coverage: 66.66% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3755      +/-   ##
==========================================
- Coverage   57.06%   57.05%   -0.01%     
==========================================
  Files         105      105              
  Lines       10806    10818      +12     
==========================================
+ Hits         6166     6172       +6     
- Misses       3952     3956       +4     
- Partials      688      690       +2     
Impacted Files Coverage Δ
registry/handlers/blobupload.go 47.56% <66.66%> (+0.10%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@sudo-bmitch
Copy link
Copy Markdown
Contributor

Looking through the logic of what happens if the connection drops and a client retries an upload of the same blob to the same location (including the state), where would that fail? I'm not sure if the state is checked earlier, but if not, it looks like the following would cancel the upload:

if size := upload.Size(); size != buh.State.Offset {
defer upload.Close()
dcontext.GetLogger(ctx).Errorf("upload resumed at wrong offset: %d != %d", size, buh.State.Offset)
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
buh.Errors = append(buh.Errors, v2.ErrorCodeBlobUploadInvalid.WithDetail(err))
upload.Cancel(buh)
})
}

Shouldn't that return a 416 with an updated Range and Location header instead, and not cancel the upload?

@wy65701436
Copy link
Copy Markdown
Collaborator Author

wy65701436 commented Oct 24, 2022

In fact, according to this specification, https://github.com/distribution/distribution/blob/main/docs/spec/api.md#get-blob-upload, the client should have a way to get the upload offset through this API.

However, according to the handler dispatcher, any get method with a UUID will be processed first by ResumeBlobUpload, which doesn't make much sense.

1, http servers should not cancel uploads while processing get methods, no matter what the case.
2, think about this situation: the distribution http server restarts during the phase of writing the upload block, but some of it is successfully written to disk. At this point, the client will just retry to upload the same block to the same location. But actually the location has changed and it should be the end + the size of the partial block (successfully written). But the client has no way to know that and has to call the get-blob-upload API to return the new location(offset).

@sudo-bmitch
Copy link
Copy Markdown
Contributor

The path I'm thinking of is:

  • PATCH, partial write
  • Retry PATCH, 416 response
  • Possible fallback to GET/HEAD for state
  • PATCH with new offset/state

I think you're implementing:

  • PATCH, partial write
  • GET/HEAD on any error to get the new offset/state
  • PATCH with new offset/state

The risk I'm seeing with the retry is:

  • PATCH, partial write
  • Retry PATCH, 404 response, upload canceled
  • All further requests fail

I can adjust the PATCH logic in my code to avoid any retries, but I think 416 would be a more appropriate response than a 404 to the retry.

@milosgajdos
Copy link
Copy Markdown
Member

I think returning 416 on PATCH retry seems reasonable 🤔 In fact the spec does [sort of] cover that case

If the server cannot accept the chunk, a 416 Requested Range Not Satisfiable response will be returned and will include a Range header indicating the current status:

That being said, this PR is headed in the right direction, just feel we should expand it?

@wy65701436
Copy link
Copy Markdown
Collaborator Author

I think returning 416 on PATCH retry seems reasonable 🤔 In fact the spec does [sort of] cover that case

If the server cannot accept the chunk, a 416 Requested Range Not Satisfiable response will be returned and will include a Range header indicating the current status:

That being said, this PR is headed in the right direction, just feel we should expand it?

I agree that we should disable upload.Cancel(buh) in the error handling of ResumeBlobUpload. However, I'm not sure if we need to update the response code from 404 to 416, which would be a breaking change for the client?

@sudo-bmitch
Copy link
Copy Markdown
Contributor

sudo-bmitch commented Oct 24, 2022

That being said, this PR is headed in the right direction, just feel we should expand it?

Yes, the PR certainly fixes an issue that needs to be fixed. So either expand or a separate PR would both be good.

I picked 416 because it's defined in the spec and I think GHCR does this already.

@milosgajdos
Copy link
Copy Markdown
Member

I picked 416 because it's defined in the spec and I think GHCR does this already.

Personally, I'd also go for 416. Imho, that's the correct behaviour (according to spec, anyways) -- I see nothing wrong with nudging the existing clients towards this path.

@wy65701436
Copy link
Copy Markdown
Collaborator Author

@milosgajdos @sudo-bmitch I've updated the code according to the about conversation, can you help to review it?

@sudo-bmitch
Copy link
Copy Markdown
Contributor

Overall design looks right to me, but I'm not familiar enough with the code base. I'll make an intentionally broken client to do some testing.

Copy link
Copy Markdown
Contributor

@sudo-bmitch sudo-bmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it a quick test and the changes LGTM.

@wy65701436
Copy link
Copy Markdown
Collaborator Author

1, return the right upload offset for client when asks.
2, do not call ResumeBlobUpload on getting status.
3, return 416 rather than 404 on failed to patch chunk blob.
4, add the missing upload close

Signed-off-by: Wang Yan <wangyan@vmware.com>
Copy link
Copy Markdown
Member

@milosgajdos milosgajdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @deleteriousEffect PTAL

@wy65701436 wy65701436 requested a review from heww October 28, 2022 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants