Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

batches: retain external fork name#47397

Merged
courier-new merged 27 commits into
mainfrom
kr/retain-fork-name
Feb 13, 2023
Merged

batches: retain external fork name#47397
courier-new merged 27 commits into
mainfrom
kr/retain-fork-name

Conversation

@courier-new

@courier-new courier-new commented Feb 4, 2023

Copy link
Copy Markdown
Contributor

Closes https://github.com/sourcegraph/sourcegraph/issues/46096.
Closes https://github.com/sourcegraph/sourcegraph/issues/47362.

Don't be alarmed at the huge file diff, it's probably 75% changes to tests and golden test data. Because if this whole forks exercise taught me anything, it's that I know nothing about forks, and tests help me figure out what's even supposed to happen. 😅

To walk you through the changes here at a high level, this PR accomplishes 3 primary things:

  1. Backfill external_fork_name for changesets created on Bitbucket Server and Bitbucket Cloud with an OOB migration.
    • This is only possible for Bitbucket Server/Cloud and not GitHub/GitLab because we can derive the external_fork_name from information available in a changeset's metadata. This is not possible with the existing metadata for changesets on GitHub/GitLab.
    • This is best-effort and that's okay, because Bitbucket Server is the only code host impacted by the backwards incompatibility in the first place, and they're luckily one of the 2 code hosts we can do this for!
    • We'll be able to live-backfill external_fork_name for GitHub and GitLab changesets the next time they are synced, because we also...
  2. Set external_fork_name when handling changeset metadata for future changesets, in SetMetadata
    • This method is invoked in every code host source on every method which handles a pull request, from Load to Open to Merge to Comment, meaning we'll catch any updates to the fork repo names (including ones on GitHub and GitLab) the very next time the changeset is synced, which is often immediately after Sourcegraph starts up and the background worker queues a sync.
    • On GitHub, this is made possible by requesting an additional field on the Repo with our GraphQL requests.
    • On GitLab, this is made possible by setting an additional property when we make follow-up requests to "decorate" the merge request.
  3. Use external_fork_name when retrieving the fork repo for future requests
    • Previously we had two methods two methods on the Source interface for getting the fork: GetUserFork and GetNamespaceFork. But the purpose of their differentiation was fuzzy and mostly to separate default behavior before a changeset has been created to specific behavior after it's already been published to a fork. This fuzzy differentiation was partially responsible for the bug with updating changesets on Forks on GitHub ( https://github.com/sourcegraph/sourcegraph/issues/47362). I've simplified the interface to just one method, GetFork, which takes a fork namespace and name as optional parameters. This vastly simplifies the logic for retrieving the remote repo for a changeset.
    • This method also behaves more consistently across all code hosts in a couple ways:
      • GitHub now checks for the existence of the fork prior separately, prior to trying to create it, to circumvent the issue from https://github.com/sourcegraph/sourcegraph/issues/47362.
      • GitHub, GitLab, and Bitbucket Cloud now all check that the repo returned from the code host APIs is actually a fork, and is actually a fork of the original target repo. This check was previously only done on Bitbucket Server.
      • All methods use a new common method on the Source, DefaultForkName, for the fork name if we're creating a new fork.

Beyond this, almost everything else is updating or adding tests.

Random other things I threw in this PR

  • Fixed some misc misspellings of "BitBucket" to "Bitbucket"
  • Removed Adam from the CODENOTIFY for Bitbucket Cloud

Test plan

LET ME TELL YOU.

  • Existing tests pass, VCR tests have been re-recorded.
  • Added so. many. new. tests.
  • Manually testing the backwards compatibility issue of https://github.com/sourcegraph/sourcegraph/issues/46096.
  • Manually tested the regression of https://github.com/sourcegraph/sourcegraph/issues/47362.
  • Tested all manner of weird, specific scenarios locally, including but not limited to:
    • What if I forked the repo before Sourcegraph?
    • What if I forked the repo to a weird name before Sourcegraph?
    • What if I rename the fork?
    • What if I delete the fork?
    • What if the fork is in a nested namespace (GitLab)?
    • Granted we don't have full automatic support for all these cases, but I made sure we at least showed a reasonable error message when we didn't support something.

@courier-new courier-new added the batch-changes Issues related to Batch Changes label Feb 4, 2023
@courier-new courier-new self-assigned this Feb 4, 2023
@cla-bot cla-bot Bot added the cla-signed label Feb 4, 2023
@courier-new courier-new requested a review from a team February 7, 2023 14:36
@courier-new courier-new marked this pull request as ready for review February 7, 2023 14:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

batch-changes Issues related to Batch Changes cla-signed

Projects

None yet

3 participants