Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

gitserver: make vcs syncer implementation request a fresh clone URL in between each remote git operation#61179

Merged
ggilmore merged 1 commit into
mainfrom
spr/main/4575598b
Mar 19, 2024
Merged

gitserver: make vcs syncer implementation request a fresh clone URL in between each remote git operation#61179
ggilmore merged 1 commit into
mainfrom
spr/main/4575598b

Conversation

@ggilmore

@ggilmore ggilmore commented Mar 14, 2024

Copy link
Copy Markdown
Contributor

Fixes: https://github.com/sourcegraph/sourcegraph/issues/60682

The key insight into fixing https://github.com/sourcegraph/sourcegraph/issues/60682 is to make our vcssyncer implementation (that performs the fetch and clone operations) stop using a "static" remote URL for all of it's operations.

Before, when we used a Github App installation, Sourcegraph will receive an authentication token that's valid for one hour. We embed this token in the remote URL that we use to configure git for the fetch and git remote show head (to figure out what the default branch is pointing to) operations. When a huge repository like https://github.com/sgtest/megarepo gets cloned - it's likely that it will take > 1 hour to clone. This means that the auth information embedded the remote URL will expire in between the fetch and remote head operations - which would fail the overall "clone" operation.

The solution to this is to introduce a new abstraction:

// RemoteURLSource is a source of a remote URL for a repository.
//
// The remote URL may change over time, so it's important to use this interface
// to get the remote URL every time instead of caching it at the call site.
type RemoteURLSource interface {
	// RemoteURL returns the latest remote URL for a repository.
	RemoteURL(ctx context.Context) (*vcs.URL, error)
}

The RemoteURLSource encapsulates all the logic to figure out what remote URL to use for a given repository (including authentication information, token refreshing, etc.). Instead of passing around a static remote URL, the RemoteURLSource is stored as a struct member on each of the syncers. Whenever the syncer performs a git operation (like fetch) - it consults the source for the latest remoteURL to use - and only uses that url for the current operation. This means that individual git operations now have the chance to "refresh" the token in the remote URL (if necessary) which solves the original problem.


Test plan:

@cla-bot cla-bot Bot added the cla-signed label Mar 14, 2024
@github-actions github-actions Bot added the team/source Tickets under the purview of Source - the one Source to graph it all label Mar 14, 2024
@ggilmore ggilmore force-pushed the spr/main/4575598b branch from 3ce14de to b3867d2 Compare March 14, 2024 23:24
@ggilmore ggilmore marked this pull request as draft March 14, 2024 23:29
@ggilmore ggilmore requested review from eseliger and pjlast March 14, 2024 23:29
@ggilmore ggilmore force-pushed the spr/main/4575598b branch 4 times, most recently from 448d792 to 18c9d65 Compare March 14, 2024 23:45
Comment thread internal/github_apps/auth/auth.go Outdated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be removed soon, this is just a local hack I created to try to revoke tokens without waiting an hour.

@ggilmore ggilmore force-pushed the spr/main/4575598b branch 2 times, most recently from a04bf3a to cea3d55 Compare March 14, 2024 23:51

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever I see this []byte -> string -> []byte conversion pattern I wonder if we wouldn't be better off just having []byte implementations of these functions

@ggilmore ggilmore force-pushed the spr/main/4575598b branch 2 times, most recently from 7270b34 to 999215c Compare March 19, 2024 15:29
@ggilmore ggilmore marked this pull request as ready for review March 19, 2024 16:16
@ggilmore ggilmore changed the title wip: gitserver: make vcs syncer implementation request a fresh clone URL in between each remote git operation gitserver: make vcs syncer implementation request a fresh clone URL in between each remote git operation Mar 19, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future readers, this number increased by 3 due to the 3 extra tryWrite calls I added in gitVCSSyncer.Clone (the additional logs before / after SetHEAD + an additional log once the fetch finished): https://github.com/sourcegraph/sourcegraph/pull/61179/files#diff-22d28b810252fd7a9c6e94ce2c5bbbb1a61f079b6a6d8407661f1ae28f282250R111

@ggilmore ggilmore force-pushed the spr/main/4575598b branch 3 times, most recently from 7348076 to 48c4100 Compare March 19, 2024 19:28
…URL in between each remote git operation

commit-id:4575598b
@ggilmore ggilmore force-pushed the spr/main/4575598b branch from 48c4100 to c49c84f Compare March 19, 2024 20:46
@ggilmore ggilmore merged commit cbbb9c3 into main Mar 19, 2024
@ggilmore ggilmore deleted the spr/main/4575598b branch March 19, 2024 20:57
@sourcegraph-release-bot

Copy link
Copy Markdown
Collaborator

The backport to 5.3 failed at https://github.com/sourcegraph/sourcegraph/actions/runs/8350113978:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.3 5.3
# Navigate to the new working tree
cd .worktrees/backport-5.3
# Create a new branch
git switch --create backport-61179-to-5.3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 cbbb9c3d42a4fa5adad899a0a589d76348dc1df6
# Push it to GitHub
git push --set-upstream origin backport-61179-to-5.3
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.3

If you encouter conflict, first resolve the conflict and stage all files, then run the commands below:

git cherry-pick --continue
# Push it to GitHub
git push --set-upstream origin backport-61179-to-5.3
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.3
  • Follow above instructions to backport the commit.
  • Create a pull request where the base branch is 5.3 and the compare/head branch is backport-61179-to-5.3., click here to create the pull request.
  • Make sure to tag @sourcegraph/release in the pull request description.
  • Once the backport pull request is created, kindly remove the release-blocker from this pull request.

Comment on lines +303 to +306
var exitErr *exec.ExitError
if errors.As(err, &exitErr) {
// Redact the stderr output if we have it.
exitErr.Stderr = []byte(redactor(string(exitErr.Stderr)))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we use err not exitErr further down, is exitErr a reference or a copy of err?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

backport 5.3 backports cla-signed failed-backport-to-5.3 release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cloning very large repos with GitHub App installation tokens can fail

5 participants