Skip to content

Add repo sync command#3813

Merged
samcoe merged 20 commits intotrunkfrom
repo-sync
Aug 6, 2021
Merged

Add repo sync command#3813
samcoe merged 20 commits intotrunkfrom
repo-sync

Conversation

@samcoe
Copy link
Contributor

@samcoe samcoe commented Jun 10, 2021

This PR adds the repo sync command. The command allows syncing between two repositories, referred to as destination repository and source repository, where changes from the source repository are synced to the destination repository. When a destination repository is not specified by an argument the current local directory is chosen as the destination repository. The command will try to smartly pick the source repository based on the destination repository but it can also be specified using the --source flag. By default the source repository default branch will be synced but can be overridden with the --branch flag.

Additionally this PR adds in the counterfeiter library as an experiment with generating test doubles for interfaces to see if we like it as an alternative to how we are writing tests now.

Examples syncing a local repo
Screen Shot 2021-06-10 at 10 37 34 AM

Examples syncing a remote repo
Screen Shot 2021-06-10 at 10 38 39 AM

Example trying to sync a remote repo that is not a fork
Screen Shot 2021-06-10 at 10 39 15 AM

Example trying to sync a remote repo with diverging commits from parent repo
Screen Shot 2021-06-10 at 10 40 09 AM

Fixes #444

@samcoe samcoe self-assigned this Jun 10, 2021
@samcoe samcoe marked this pull request as ready for review June 14, 2021 15:06
@samcoe samcoe requested a review from a team June 15, 2021 12:34
Copy link
Contributor

@mislav mislav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fired up about this command!

I would like to see two major simplifications to the command's implementation:

  1. Removal of the interactive prompt. I think that the prompt doesn't add much value to the user, but adds much complexity to the command. As far as I'm concerned as a user, syncing should always "just work" unless a sync would result in a loss of data. In that case, it should fail and instruct me to use --force.
  2. Splitting up the mode of running when we are syncing the local git repo vs. syncing a remote fork with its parent. I feel these are two very different approaches, but I think they are intertwined in code. Having two completely separate implementations could be beneficial for long-term maintenance of this command.

@billygriffin
Copy link
Contributor

Removal of the interactive prompt. I think that the prompt doesn't add much value to the user, but adds much complexity to the command. As far as I'm concerned as a user, syncing should always "just work" unless a sync would result in a loss of data. In that case, it should fail and instruct me to use --force.

I'm not 100% sure I agree with this, though I see what you mean. I'm wondering whether for a lot of people if the prompt is just an extra level of confidence that it's going to do what they think it's going to do (or not). Given how confused we all were in talking through all the possible scenarios, I'm just concerned that if it "just works" it raises the potential for uncertainty and frustration. Open to either way, but wanted to raise that for discussion with the team.

@samcoe samcoe requested a review from mislav June 23, 2021 21:22
@samcoe
Copy link
Contributor Author

samcoe commented Jun 24, 2021

@billygriffin @vilmibm @mislav At first I was of the mind that the confirmation prompt helped with reducing confusion of the command, and in our offline discussion I said as much, but after addressing the PR comments and working on the documentation for the command more I have flipped my opinion to match @mislav. The confirmation prompt doesn't add a lot of value.

I think when we originally discussed the functionality that this command would encompass, it was confusing because we were envisioning it doing a lot more, but in its current form I think this command is fairly straight forward. Additionally, I think that if we are trying to reduce confusion about what a command is doing a confirmation prompt is not the right solution, but improving docs or reworking the command to do less is more appropriate. Lastly, the command is not doing anything destructive unless the --force flag is used, and I am worried about introducing the behavior of confirmations before non-destructive actions since we don't have it anywhere else from what I can tell.

@samcoe samcoe requested a review from a team June 24, 2021 15:39
@vilmibm
Copy link
Contributor

vilmibm commented Jun 24, 2021

that works for me.

Copy link
Contributor

@vilmibm vilmibm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some testing and I'm a bit concerned about the reliability of this command; I quickly got into a state that was confusing and erroneous.

Setup

I have two repositories:

  • evilmibm/cackle
  • vilmibm/cackle (fork of evilmibm/cackle)

I cloned vilmibm/cackle, setting evilmibm/cackle as the base repo.

I then committed to both parent and fork, ensuring there would be a conflict.

expected conflict, got false positive

gh repo sync # allow to sync from evilmibm/cackle
# => this worked fine; local repo matches evilmibm/cackle:master
gh repo sync --source vilmibm/cackle # attempt to sync from the fork
# => claims to have succeeded, but local repo state has not changed

half-completed force sync

I prepared a branch with conflicting changes and saw the expected "can't sync" error; but then when I used force it seemed to work but I do see an opaque git error:

gh repo sync -bevilmibm-patch-1
can't sync because there are diverging changes, you can use `--force` to overwrite the changes
gh repo sync -bevilmibm-patch-1 --force
exit status 128
git log
# => see that I am on evilmibm-patch-1 branch with expected changes; unsure what the error is but it makes me afraid
gh repo sync
✓ Synced .:master from evilmibm/cackle:master
# => claims to succeed...but still on evilmibm-patch-1 branch. nothing has changed.

Copy link
Contributor

@vilmibm vilmibm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another issue; branch resolution can fail even with source specified when a branch exists in multiple remotes:

image

return err
}
} else {
err = git.Checkout([]string{"--track", fmt.Sprintf("%s/%s", remote, branch)})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should ensure that if there are multiple remotes with the same branch name we will checkout the correct one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the invocation be: git checkout <branch> --track <remote>/<branch>? It looks like the local branch argument is missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the docs (https://git-scm.com/docs/git-checkout#Documentation/git-checkout.txt---track) the name to use for the local branch will be automatically determined.

Copy link
Contributor

@vilmibm vilmibm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ran through some QA and everything worked well ✨ awesome work

@mislav mislav added the enhancement a request to improve CLI label Aug 4, 2021
Copy link
Contributor

@mislav mislav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good! I suggested a bunch of UI/docs changes that I would ideally like to see before this lands.

One note about the overall functionality: if this commands is designed to sync a single branch from one repository to another, wouldn't repo sync be a misnomer? The command name sounds like it sync repositories with each other, but all it does is it sync a single branch with its upstream equivalent. I wonder if the name might mislead people into misinterpreting what this command actually does. But I don't feel very strongly about this concern; just thinking out loud.

return err
}
} else {
err = git.Checkout([]string{"--track", fmt.Sprintf("%s/%s", remote, branch)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the invocation be: git checkout <branch> --track <remote>/<branch>? It looks like the local branch argument is missing

}
}
if startBranch != branch {
err = git.Checkout([]string{startBranch})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this covered by one of the above git.Checkout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the question. The first git.Checkout is to checkout the branch to sync and then this git.Checkout is to switch back to the initial branch the user was on. I didn't want to assume that the user wanted to switch to the branch that was being sync'ed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I've missed that detail. We do not have to switch branches just to update a non-current branch to another ref: there is git update-ref for that! The benefit of not switching branches unnecessarily is that the filesystem needs fewer writes (since the working copy is not being updated) and git reflog is cleaner, which benefits tooling. I'll open a PR to address this.

@samcoe samcoe requested a review from mislav August 5, 2021 01:30
@samcoe
Copy link
Contributor Author

samcoe commented Aug 5, 2021

@mislav Thanks for the thorough review! I addressed/commented on your concerns. Regarding the name of the command, do you have better idea? I think repo branch-sync might be more accurate but also more verbose.

@mislav
Copy link
Contributor

mislav commented Aug 5, 2021

Regarding the name of the command, do you have better idea? I think repo branch-sync might be more accurate but also more verbose.

I'm fine with repo sync if we in the future expand this command to do more across whole repositories, e.g. sync all local branches to their equivalents? (I'm obviously biased because I that's how I made hub sync to work.) Together with another potential feature to clean up branches belonging to merged PRs, I think this command has the potential to be a swiss army knife for repo syncing. But if all it ever does for the foreseeable future is sync a single branch, then sync-branch might be a more descriptive command name.

Copy link
Contributor

@mislav mislav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

I think we might have discussed this in a meeting in the past, but do you also envision this command being able to sync 3 repos, i.e. local - remote - parent in a single go in the future? I wonder whether people will want to sync their merge branch in their remote fork but also in the local checkout at the same time, and right now it sounds like they will have to make 2 separate repo sync invocations.

return git.HasLocalBranch(branch)
}

func (g *gitExecuter) IsAncestor(ancestor, progeny string) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the argument names 😍

@samcoe
Copy link
Contributor Author

samcoe commented Aug 5, 2021

@mislav I don't see this command becoming a swiss army knife type of command and I would like to keep it specific to syncing branches, this could include pruning merged PR branches, and syncing all local branches in the future. I think the main challenge is that the behavior when invoked for a local repo vs a remote repo can get out of sync very fast since we have lots of control over a local repo as opposed to a remote one. In the future when we add features we need to be cognizant of that and think through what the behavior might look like for remote repos or make the features local repo specific.

We purposefully skipped the feature of syncing 3 repos to see if our users actually wanted it, so you are correct that as of now it will have to be two separate invocation of repo sync but I do think that we can introduce that specific case in the future using a flag.

@meyerweb
Copy link

Just tried this out (I’m a little behind) and this is SO GOOD. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement a request to improve CLI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ability to sync a fork with the upstream repository

5 participants