Skip to content

Retry git clone on git clone failure in gh repo fork --clone#6962

Merged
samcoe merged 4 commits intocli:trunkfrom
jsoref:issue-5118
Feb 21, 2023
Merged

Retry git clone on git clone failure in gh repo fork --clone#6962
samcoe merged 4 commits intocli:trunkfrom
jsoref:issue-5118

Conversation

@jsoref
Copy link
Contributor

@jsoref jsoref commented Feb 3, 2023

Fixes #5118 -- hopefully

This is entirely untested, but the linter likes it, so there's that.

@jsoref jsoref requested a review from a team as a code owner February 3, 2023 19:46
@jsoref jsoref requested review from mislav and removed request for a team February 3, 2023 19:46
@cliAutomation cliAutomation added the external pull request originating outside of the CLI core team label Feb 3, 2023
Copy link
Contributor

@samcoe samcoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsoref Thanks for starting on this work. It looks good so far. I left a comment regarding keeping the retry mechanism internal and not exposing those low level implementation details to our users. Let me know what you think.

Additionally, if we could get a test verifying this retry behavior works I think this is good to ship.

@samcoe samcoe self-assigned this Feb 7, 2023
@jsoref
Copy link
Contributor Author

jsoref commented Feb 7, 2023

I'm happy to drop the flags. I have no idea how to test the failure case.

@samcoe
Copy link
Contributor

samcoe commented Feb 7, 2023

@jsoref No problem. Feel free to commit code for dropping of the flags and one of our team members will take over writing of a test.

@jsoref
Copy link
Contributor Author

jsoref commented Feb 7, 2023

Great. I couldn't even get it to build. The Makefile wasn't helpful; I ran into the incredibly confounding:

$ go mod download
go: modules disabled by GO111MODULE=off; see 'go help modules'

(Which does not explain anything relating to the message.)

And even when I forced it to on, nothing happened.

@mislav
Copy link
Contributor

mislav commented Feb 7, 2023

I ran into the incredibly confounding:

It might help you to remove the legacy GO111MODULE environment variable from wherever it's being set.

GO111MODULE= make

Our project builds in a pristine Go development environment, but we can't guarantee building in an environment you have modified.

@jsoref
Copy link
Contributor Author

jsoref commented Feb 7, 2023

That didn't work at all (I've tried repeatedly while fighting various golang projects), apparently golang hid its setting in some config file, so I wanted go env -w GO111MODULE=... (I may file a PR to improve the messaging, as it's really unhelpful). But even once I did that, I didn't actually get any dependencies. Yes, I have a build env that is used to develop lots of things and has evolved over ~6 years.

@mislav
Copy link
Contributor

mislav commented Feb 7, 2023

Maybe then explicitly setting it to "on" just for the gh project might be the way to go? GO111MODULE=on go build -o bin/gh ./cmd/gh

@jsoref
Copy link
Contributor Author

jsoref commented Feb 7, 2023

That worked. Now I have a gh and the run test and debug test links in VSCode work...

mislav
mislav previously requested changes Feb 7, 2023
Copy link
Contributor

@mislav mislav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on! My main concern is that this doesn't solve your original problem (more about that in comments). My secondary concerns are that it would be best that the retry code is testable and that it supports Context cancellation.

if err == nil {
break
}
time.Sleep(retryBackoff * (time.Duration(i) + 1))
Copy link
Contributor

@mislav mislav Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testable Go code should never be allowed to sleep the same period of time in tests as it does during normal runtime. Even a relatively short duration like sleeping 100ms would make tests really slow. Finding a way to inject a shorter (fake) Duration in tests would be crucial.

Also, the Sleep statement would not return early if a Context has been cancelled, therefore if a Context is involved, it would be better to use a Timer and use a select statement to react to either a Timer completing or a Context being cancelled, whichever comes first. Example: https://github.com/cli/oauth/blob/adf5f733ffd14993aad9f3ff02c757a737f7c63b/device/poller.go#L30-L37

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is beyond what I'm up for doing at this time.

It should involve introducing a poller.go, and really there should be a poller_test.go -- and even that repo doesn't have one.

@jsoref jsoref changed the title Retry git clone Retry git clone on git clone failure in gh repo fork --clone Feb 7, 2023
@jsoref jsoref requested a review from mislav February 7, 2023 18:37
@samcoe
Copy link
Contributor

samcoe commented Feb 8, 2023

@jsoref @mislav I went ahead and updated this PR to use a new version of the BackOff package which helped clean things up a lot. I also wrote a couple tests. Let me know what you think. Only thing I was unsure of was the exponential backoff duration variables. Not sure we really need exponential backoff here and we could simplify to just a set backoff duration.

The BackOff package is fairly easy to use and well documented, I would vote for using it going forward in regards to #6985.

I did have to rebase the PR to fix the tests I wrote that were failing due the recent remote add changes.

@samcoe samcoe self-requested a review February 8, 2023 00:24
ios.SetStdoutTTY(tt.tty)
ios.SetStderrTTY(tt.tty)
tt.opts.IO = ios
t.Run(tt.name, func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that I had not thought about until now is that when using the -run flag with go test that all this set up happens for each test even though the tests won't be run. If we encapsulate all the code in t.Run the setup won't happen unless the test is actually going to be run.

Comment on lines +336 to +347
cloneDir, err := backoff.RetryWithData(func() (string, error) {
forkedRepoURL := ghrepo.FormatRemoteURL(forkedRepo, protocol)
dir, err := gitClient.Clone(ctx, forkedRepoURL, opts.GitArgs)
if err == nil {
return dir, err
}
var execError errWithExitCode
if errors.As(err, &execError) && execError.ExitCode() == 128 {
return "", err
}
return "", backoff.Permanent(err)
}, backoff.WithContext(backoff.WithMaxRetries(opts.BackOff, 3), ctx))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will users see an indication of the failures/retries?

Since I trigger this problem regularly, I'd love to have a sense of how many retries I'm spending.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of right now, no they will not. Could be added in though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth adding. That'd help inform decisions about the right time values.

Alternatively, I suppose I could try using this version of gh w/ that logging and see what they say...

Lemme see if I can manage to make that work locally...

Comment on lines +330 to +332
bo.Multiplier = 1.1
bo.MaxInterval = 10 * time.Second
bo.MaxElapsedTime = 5 * time.Minute
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have absolutely no idea what proper numbers would be. I stole mine from something else because it was suggested.

IME, a clone very shortly after the failed clone works, but I don't have a great sense of scale and I'm not sure if it relates to the size of a repository (and if that's by commits, by number of files in the working directory, or some other measure).

My clones come from repositories of various sizes somewhat randomly/haphazardly...

@samcoe samcoe enabled auto-merge (squash) February 21, 2023 03:59
@samcoe samcoe dismissed mislav’s stale review February 21, 2023 03:59

Addressed comments and had offline conversation about changes.

@samcoe samcoe merged commit 7541ee6 into cli:trunk Feb 21, 2023
@jsoref jsoref deleted the issue-5118 branch February 21, 2023 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external pull request originating outside of the CLI core team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fork failed to clone

4 participants