Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

gitserver: disable sg maintenance#42856

Merged
stefanhengl merged 3 commits into
mainfrom
sh/disable_sg_maintenance
Oct 13, 2022
Merged

gitserver: disable sg maintenance#42856
stefanhengl merged 3 commits into
mainfrom
sh/disable_sg_maintenance

Conversation

@stefanhengl

@stefanhengl stefanhengl commented Oct 12, 2022

Copy link
Copy Markdown
Member

This PR sets GCAUTO as the new default and disables sg maintenance.

sg maintenance has probably caused more harm than good. While it does have a significant performance impact for very large repos, it is also those repos that suffered the most from repo corruption.

The job has been disabled on many customer instances and, as far as I know, on all managed instances.

Test plan

  • just a config change. The new default config already runs succesfuly on many customer instances.

This PR stes GC_AUTO as the new default and disables sg maintenance.

sg maintenance has probably caused more harm than good. While it does have a
significant performance impact for very large repos, it is also those
repos that suffered the most from repo corruption.

The job has been disabled on many customer instances and, as far as I
know, on all managed instances.
@stefanhengl stefanhengl requested review from a team October 12, 2022 07:56
@cla-bot cla-bot Bot added the cla-signed label Oct 12, 2022
@stefanhengl

Copy link
Copy Markdown
Member Author

cc @DaedalusG

@sourcegraph-bot

sourcegraph-bot commented Oct 12, 2022

Copy link
Copy Markdown
Contributor

Codenotify: Notifying subscribers in CODENOTIFY files for diff 8f52cae...e5b55ae.

Notify File(s)
@indradhanush cmd/gitserver/server/cleanup.go
@ryanslade cmd/gitserver/server/cleanup.go
@sashaostrikov cmd/gitserver/server/cleanup.go

@sashaostrikov sashaostrikov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will completely fix sgm one day and bring it back as a default again :)

@mrnugget

Copy link
Copy Markdown
Contributor

Since you've all gathered here, out of curiosity: what was the original reason for introducing sg maintenance? why is it better than GCAUTO?

@stefanhengl

Copy link
Copy Markdown
Member Author

TLDR: it speeds up git-log operations for very large repos. EG, without it, displaying git history (last 5 commits of a file) becomes very slow. At the time we introduced this, git didn't expose the right flags/config on the porcelain level. I think it is worth re-checking this and maybe contribute to git rather than writing our own wrapper around plumbing commands.

Here is some backround

@DaedalusG

Copy link
Copy Markdown
Contributor

@mrnugget Additionally around the same time this was being developed we were trying to resolve an issue in which a customer was experiencing failures in indexing their monorepo, in which zoekt would fail to fetch from gitserver.
https://github.com/sourcegraph/customer/issues/586

@stefanhengl

stefanhengl commented Oct 13, 2022

Copy link
Copy Markdown
Member Author

Ah, yes good data point. Limiting the memory with -window-memory definitely helped.

I guess the decision is between which kinds of problems are more acceptable for customers. I believe repo corruption tops everything, especially for monorepo customers where re-cloning is expensive. By disabling sourcegraph maintenance we will get more complaints about MEM and slow UI. At least we know how to limit MEM, because -window-memory is exposed as config parameter.

@stefanhengl stefanhengl merged commit 75aa74b into main Oct 13, 2022
@stefanhengl stefanhengl deleted the sh/disable_sg_maintenance branch October 13, 2022 06:33
@stefanhengl

stefanhengl commented Oct 14, 2022

Copy link
Copy Markdown
Member Author

For anyone who wonders what that means with respect to their Sourcegraph installation:

This change will roll out with Sourcegraph 4.1.

Overall, this change will increase stability for all customers.

Customers with large monorepos have most likely already disabled sg maintenance, either because they ran into problems or because they were advised by their CE.

For customers with large monorepos that have sg maintenance enabled, git operations based on git-log will most likely run slower now. An example is the history of the last commits displayed at the bottom of the tree view. The core search experience, however, is not affected. If they prefer to leave sg maintenance enabled (NOT RECOMMENDED), they have to explicitly set "SRC_ENABLE_SG_MAINTENANCE=true" and "SRC_ENABLE_GC_AUTO=false" for gitserver.

In some cases, gitserver might not be able to run garbage collection because of lack of memory. This might manifest itself by a failed fetch or clone. In this case we advise customers to set git config pack.windowMemory 100m either globally or for the repo in question on gitserver.

Customers with small to medium sized repositories won't notice a difference.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants