This repository was archived by the owner on Sep 30, 2024. It is now read-only.

[Backport 5.2] Embeddings: fix low-hanging issues with scheduling job #58651

Merged: keegancsmith merged 3 commits into 5.2 from jtibs/embeddings on Nov 29, 2023

@jtibshirani

As part of the embeddings policy framework, a worker periodically checks which repos can be embedded. For every candidate repo, it queries the DB to see whether there is a new revision to embed. This check runs every minute and becomes increasingly expensive as the jobs table fills up over time.

This change makes two small optimizations to reduce that cost:

  • Add an index to make selecting on `repo_id` and `revision` much faster.
  • Check the repos every 5 minutes instead of every minute. This shouldn't make a noticeable difference in user experience, since by default embeddings jobs can't be scheduled within 24h of the last run.

Backport of #58510

@cla-bot

cla-bot Bot commented Nov 29, 2023

We require contributors to sign our Contributor License Agreement (CLA), and we don't have yours on file. In order for us to review and merge your code, please sign the CLA to get yourself added.

Sourcegraph teammates should refer to Accepting contributions for guidance.

@jtibshirani jtibshirani added the backport/bugfix Standard patches to fix bugs label Nov 29, 2023
@jtibshirani jtibshirani changed the base branch from main to 5.2 November 29, 2023 08:00
@cla-bot cla-bot Bot added the cla-signed label Nov 29, 2023
@jtibshirani jtibshirani changed the title from "Jtibs/embeddings" to "[Backport 5.2] Embeddings: fix low-hanging issues with scheduling job" Nov 29, 2023
@jtibshirani jtibshirani requested a review from a team November 29, 2023 08:36
@jtibshirani jtibshirani marked this pull request as ready for review November 29, 2023 08:37
@sourcegraph-bot

sourcegraph-bot commented Nov 29, 2023

Codenotify: Notifying subscribers in CODENOTIFY files for diff 19823d7...0f5f30d.

| Notify | File(s) |
| --- | --- |
| @efritz | enterprise/cmd/worker/internal/embeddings/repo/scheduler.go |

@keegancsmith keegancsmith enabled auto-merge (squash) November 29, 2023 08:44
@sourcegraph-bot

sourcegraph-bot commented Nov 29, 2023

📖 Storybook live preview

@keegancsmith keegancsmith merged commit f97f7f7 into 5.2 Nov 29, 2023
@keegancsmith keegancsmith deleted the jtibs/embeddings branch November 29, 2023 10:28
@varungandhi-src varungandhi-src mentioned this pull request Jan 16, 2024