Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

gitserver: Made git GC recloning less dependent on time#62776

Merged
eseliger merged 1 commit into
mainfrom
es/gc-counter
May 21, 2024
Merged

gitserver: Made git GC recloning less dependent on time#62776
eseliger merged 1 commit into
mainfrom
es/gc-counter

Conversation

@eseliger

@eseliger eseliger commented May 18, 2024

Copy link
Copy Markdown
Member

On small instances, we gave git GC a lot of time to recover after a failure. However, on large instances, the time based approach could mean that after only a single failed attempt we will already reclone a repo.

This PR changes that to use a counter instead, and moves away from git config for that, fs.ReadFile on a file is much cheaper than spawning a whole process.

Closes https://github.com/sourcegraph/sourcegraph/issues/62649

Test plan:

Tests still pass, added some new tests.

eseliger commented May 18, 2024

Copy link
Copy Markdown
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @eseliger and the rest of your teammates on Graphite Graphite

@github-actions github-actions Bot added team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all labels May 18, 2024
@eseliger eseliger force-pushed the es/corruption-stat branch from 0f5c133 to 7d9ada6 Compare May 18, 2024 00:08
@eseliger eseliger marked this pull request as ready for review May 18, 2024 00:09
@eseliger eseliger requested a review from a team May 18, 2024 00:09
Comment thread cmd/gitserver/internal/cleanup.go Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fs.ReadFile on a file is much cheaper than spawning a whole process.

What is our logic then for when it warrants putting stuff in git config and when not to?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the issues we've seen from storing sg specific things in git config, I think we should only store config in there that git itself cares about - everything else we can do with lower IOPS / process spawning overhead ourselves.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we could also use go-git to modify the git config in a structured way (if we want to keep the property of storing things in the git config without spawning the process). I don't feel strongly about this though.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about switching (had a very old branch actually that switched those to go-git), but when I looked at the implementation it didn't seem to use the same lockfile mechanism that git itself uses so it would probably not be thread safe and come with its own challenges :(

Comment thread cmd/gitserver/internal/cleanup.go Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 5?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random number to get started - 1 seems too risky, 5 is not 10 😆 I want to tune this as we learn more about the likelihood of this happening

Base automatically changed from es/corruption-stat to main May 21, 2024 18:05
On small instances, we gave git GC a lot of time to recover after a failure. However, on large instances, the time based approach could mean that after only a single failed attempt we will already reclone a repo.

This PR changes that to use a counter instead, and moves away from git config for that, fs.ReadFile on a file is much cheaper than spawning a whole process.

Test plan:

Tests still pass, added some new tests.

eseliger commented May 21, 2024

Copy link
Copy Markdown
Member Author

Merge activity

  • May 21, 2:06 PM EDT: Graphite rebased this pull request after merging its parent, because this pull request is set to merge when ready.
  • May 21, 2:20 PM EDT: @eseliger merged this pull request with Graphite.

@eseliger eseliger merged commit 3f8eaca into main May 21, 2024
@eseliger eseliger deleted the es/gc-counter branch May 21, 2024 18:20
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reduce amount of calls to git config

3 participants