
Client: introduce the retry backoff #2097

Merged
jtopjian merged 2 commits into gophercloud:master from kayrus:backoff-retry
Jan 23, 2021

Conversation

@kayrus (Contributor) commented Jan 15, 2021

For #2096

coveralls commented Jan 15, 2021


Coverage increased (+0.01%) to 79.788% when pulling 1598318 on kayrus:backoff-retry into bb95efe on gophercloud:master.

theopenlab-ci bot commented Jan 15, 2021

Build succeeded.

@jtopjian (Contributor) left a comment

@kayrus Thanks for suggesting this. I agree that it makes sense to add this functionality to the core client.

I left two general comments.

retryAfter := respErr.ResponseHeader.Get("Retry-After")
if retryAfter == "" {
return e
}
Contributor

I understand that a 429 should always include a Retry-After header, but in the event that one is missing, should the code silently continue (return nil) instead of fail?

I feel that if the code returns nil here, it more closely matches the current behaviour of not supporting a backoff at all. Therefore, we're not introducing a large change in behaviour to the client.

Contributor Author

No, it should not return nil. e is the error returned by the doRequest method, so when the Retry-After header is missing or malformed, a provider client will behave as if RetryBackoffDisabled were true.

Contributor

Yes, you're right - I misread the code.

@mboruta1

@jtopjian Hello, I don't know if this is the correct place to post this but I believe your statement regarding Retry-After needing to be included in 429 responses is incorrect. If you have a look at the relevant section of the RFC it states:

   The response representations SHOULD include details explaining the
   condition, and MAY include a Retry-After header indicating how long
   to wait before making a new request.

Given this, I believe it makes sense, when the header is missing, to sleep for a user-modifiable amount of time instead of simply treating the 429 response as an error.

Contributor

@mboruta1 This is a closed pull request, so it's probably not the best place to discuss code. Are you running into a problem that this PR caused?

@mboruta1

The OpenStack cluster I run Terraform against does not specify Retry-After in its 429 error response. I can change it to do so and thus enable this PR's retry-on-429 feature, but other users of Gophercloud may not have that option (e.g. users of a public OpenStack cloud). Thus I think it makes sense not to precondition this feature on the presence of a header that does not need to be present.

Contributor

@mboruta1 I think you might be looking at outdated code. The code that was merged (https://github.com/gophercloud/gophercloud/pull/2097/files) doesn't explicitly look for a Retry-After header. It provides the ability for a developer to define their own RetryBackoffFunc function. This PR included documentation and unit tests that define example RetryBackoffFunc functions, but by default no function is defined and therefore no logic takes place.

@mboruta1

Indeed, I suppose I should have commented on this instead, which defines what appears to be the default RetryBackoffFunc function which, correct me if I am wrong, does look for the Retry-After header. This is the same understanding that Terraform's OpenStack provider plugin has (see this).

Instead of asking users to override the default RetryBackoffFunc when dealing with providers that do not provide the Retry-After header, this case could be handled by the default retry function itself.

Contributor

@mboruta1 Gotcha.

In my opinion, if there is at least one OpenStack service that does not respond with a Retry-After header along with a 429 response, the default RetryBackoffFunc function bundled in gophercloud/utils (and being used by Terraform) should account for that. Opening an Issue/PR at https://github.com/gophercloud/utils to track this would be the best way forward.

sleep = time.Duration(v) * time.Second
}

if v, err := time.Parse(http.TimeFormat, retryAfter); err != nil {
Contributor

Should this code only execute if the above ParseUint was unsuccessful?

Contributor Author

thanks, fixed

@kayrus (Contributor Author) commented Jan 18, 2021

@jtopjian I think there should be an indication that the client received a Retry-After response. Otherwise it will sleep silently, and this can cause confusion on the developer's side. Thoughts?

theopenlab-ci bot commented Jan 18, 2021

Build succeeded.

@jtopjian (Contributor)

Yeah, I thought about this as well.

I'm thinking the best option is to not have a default func for this. Instead, document it in the FAQ, similar to how the round tripper is documented. This way, all implementations of Gophercloud that want to do retries have to implement their own function which can include their own logger, signals, and such.

@kayrus kayrus marked this pull request as ready for review January 21, 2021 21:26
@kayrus (Contributor Author) commented Jan 21, 2021

@jtopjian ready for review

theopenlab-ci bot commented Jan 22, 2021

Build failed.

theopenlab-ci bot commented Jan 22, 2021

Build succeeded.

@jtopjian (Contributor) left a comment

@kayrus Nice work. This looks good to me - we'll give it a shot and see how it goes :)

@jtopjian jtopjian merged commit 3f19ae7 into gophercloud:master Jan 23, 2021
@kayrus kayrus deleted the backoff-retry branch January 24, 2021 10:24
@majewsky (Contributor)

This should also be done for 498. Swift uses 498 instead of 429 for historical reasons.

@kayrus (Contributor Author) commented Jan 31, 2021

@majewsky indeed, it uses the 498 response code
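A retry predicate accommodating Swift would need to accept both status codes. A minimal sketch (`isRateLimited` is a hypothetical helper, not part of the merged PR):

```go
package main

import (
	"fmt"
	"net/http"
)

// isRateLimited reports whether a status code indicates throttling.
// Swift predates RFC 6585 and historically returns 498 where other
// services return 429 Too Many Requests, so both must be accepted.
func isRateLimited(status int) bool {
	return status == http.StatusTooManyRequests || status == 498
}

func main() {
	for _, code := range []int{429, 498, 500} {
		fmt.Println(code, isRateLimited(code))
	}
}
```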
