-
Notifications
You must be signed in to change notification settings - Fork 7.8k
Description
A user recently reached out and made us aware of the following scenario:
While performing a call to gh at verify, which reaches out to GitHub's API, we had a blip of availability. A single request failed, for whatever reason, and returned a 500. gh at verify then immediately error'ed out within a few milliseconds, seemingly without any attempts at retrying the request.
As far as we can tell, given our monitoring of our services, had gh at verify retried the request then the command would have succeeded. Consequently,
gh at verifyshould retry requests that return with a 500- probably with some jittery backoff
- up to a certain threshold (10s? 30s?)
If this were a server component, I'd confidently say "exponential backoff for up to 5 minutes". But given that gh is more human oriented, as I write this I'm unsure of whether we should make the retry exponential/last that long. 30s is a long time to hang without feedback to the user.
I assume there is ample prior art within gh, and I will next go looking for it.