Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

rockskip: Replace second long-running process with gRPC API#62734

Merged
eseliger merged 1 commit into
mainfrom
es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi
May 21, 2024
Merged

rockskip: Replace second long-running process with gRPC API#62734
eseliger merged 1 commit into
mainfrom
es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi

Conversation

@eseliger

@eseliger eseliger commented May 16, 2024

Copy link
Copy Markdown
Member

We want to migrate the LogReverseEach call to gRPC methods as well.
However, it isn't a great practice to have to keep this process running for potentially hours, as any server restart will have to make the entire process start over.

With one call per commit, we're not occupying a process slot on gitserver, and rollouts of gitserver should much less affect rockskip, as individual commits can be retried before being returned to the rockskip code.

One thing: I reimplemented what ChangedFiles does so that we can still implement integration tests on this level, but since gitserver is a separate service, a go unit test is likely long-term not the best test bed for these kinds of tests. Ideally we would spin up a small gitserver with that repo cloned, and a small program that runs the rockskip index function.
Unfortunately I think we have little precedence for these things, but that would be the highest confidence that things actually work E2E in production.

Closes #62100

Test plan:

Existing tests still pass, I let it index a repo locally and compared it to the results I got on main with a fresh index there.

@cla-bot cla-bot Bot added the cla-signed label May 16, 2024
@github-actions github-actions Bot added team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all labels May 16, 2024
@eseliger eseliger force-pushed the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch 3 times, most recently from dc8d53c to 2cbe72e Compare May 16, 2024 18:03
@eseliger eseliger marked this pull request as ready for review May 16, 2024 18:07
@eseliger eseliger requested review from a team and jtibshirani May 16, 2024 18:08
@jtibshirani jtibshirani requested a review from a team May 16, 2024 18:55
@jtibshirani

Copy link
Copy Markdown
Contributor

This looks good to me!

Ideally we would spin up a small gitserver with that repo cloned, and a small program that runs the rockskip index function.

Indeed! I have a TODO to add a couple backend integration tests for Rockskip, that will at least give some coverage here (https://github.com/sourcegraph/sourcegraph/issues/62550)

Did we ever resolve @keegancsmith 's question here: https://github.com/sourcegraph/sourcegraph/pull/62260#pullrequestreview-2053060862 ? Maybe we were waiting on the perf numbers he mentioned?

@eseliger

Copy link
Copy Markdown
Member Author

Did we ever resolve @keegancsmith 's question here: https://github.com/sourcegraph/sourcegraph/pull/62260#pullrequestreview-2053060862 ? Maybe we were waiting on the perf numbers he mentioned?

Ah! No, curious to hear what the numbers are :) src-cli indexed in about the same time locally vs main, but it's a very small repo.

@ggilmore ggilmore left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but @sourcegraph/search-platform should review

@jtibshirani jtibshirani left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! I feel the test coverage is as good as possible given our lack of end-to-end tests -- we've covered the ChangedFiles logic, Rockskip indexing and search, and done manual checks.

@jtibshirani

Copy link
Copy Markdown
Contributor

Did we ever resolve @keegancsmith 's question here: https://github.com/sourcegraph/sourcegraph/pull/62260#pullrequestreview-2053060862 ? Maybe we were waiting on the perf numbers he mentioned?

@eseliger and I chatted offline and decided we don't have a big concern here. I do think there's room to improve observability here, to be able to debug what's slow if we observe slow index times.

Base automatically changed from es/05-11-gitserveraddcommitlogapitoreplaceclient-sidecommits to main May 21, 2024 13:21
@eseliger eseliger force-pushed the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch from db73857 to 3574efb Compare May 21, 2024 13:22

eseliger commented May 21, 2024

Copy link
Copy Markdown
Member Author

Merge activity

  • May 21, 9:22 AM EDT: Graphite rebased this pull request after merging its parent, because this pull request is set to merge when ready.
  • May 21, 11:45 AM EDT: @eseliger merged this pull request with Graphite.

We want to migrate the LogReverseEach call to gRPC methods as well.
However, it isn't a great practice to have to keep this process running for potentially hours, as any server restart will have to make the entire process start over.

With one call per commit, we're not occupying a process slot on gitserver, and rollouts of gitserver should much less affect rockskip, as individual commits can be retried before being returned to the rockskip code.

Closes #62100

Test plan:

Existing tests still pass, I let it index a repo locally and that was fast and I get results for symbols instantly now, but not a Rockskip expert so requesting review from the owners.
@eseliger eseliger force-pushed the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch from 3574efb to 4dda696 Compare May 21, 2024 15:14
@eseliger eseliger merged commit c338a95 into main May 21, 2024
@eseliger eseliger deleted the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch May 21, 2024 15:45
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all

Projects

None yet

Development

Successfully merging this pull request may close these issues.

gitserver: Move LogReverseEach to new gRPC pattern

3 participants