Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Fix: Repo embedding scheduler job routine as internal actor#55698

Merged
camdencheek merged 3 commits into
mainfrom
garyl/embed_job_scheduler_internal_actor
Aug 10, 2023
Merged

Fix: Repo embedding scheduler job routine as internal actor#55698
camdencheek merged 3 commits into
mainfrom
garyl/embed_job_scheduler_internal_actor

Conversation

@gl-srgr

@gl-srgr gl-srgr commented Aug 9, 2023

Copy link
Copy Markdown
Contributor

We received reports in slack of repositories never getting scheduled for embedding jobs despite the filter / pattern preview showing repos that match.

In both cases linked private repos appear to be impacted. After fetching repo ids from GetEmbeddableRepos we then make another fetch for getting the repo names for those repo ids. This read operation is where we appear to end up missing out on some repos because we are not an internal actor and therefore we must have permissions to the repos that we want Listed. When we aren't authorized for any repos then we end up never scheduling those repos since we never got the repo names.

Test plan

N/A

@cla-bot cla-bot Bot added the cla-signed label Aug 9, 2023
@sourcegraph-bot

sourcegraph-bot commented Aug 9, 2023

Copy link
Copy Markdown
Contributor

Codenotify: Notifying subscribers in CODENOTIFY files for diff ea67924...bc018ca.

Notify File(s)
@efritz enterprise/cmd/worker/internal/embeddings/repo/scheduler.go

@gl-srgr gl-srgr requested review from a team and camdencheek August 9, 2023 22:51
workCtx := actor.WithInternalActor(context.Background())
return []goroutine.BackgroundRoutine{
newRepoEmbeddingScheduler(ctx, gitserver.NewClient(db), db, repo.NewRepoEmbeddingJobsStore(db)),
newRepoEmbeddingScheduler(workCtx, gitserver.NewClient(db), db, repo.NewRepoEmbeddingJobsStore(db)),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, have you actually tested this with a private repo? It's not immediately obvious that the context passed in is the same that's used for actually running the background goroutine

@camdencheek

Copy link
Copy Markdown
Member

Beat me to it by like 2 minutes!

@camdencheek camdencheek merged commit ef63a4f into main Aug 10, 2023
@camdencheek camdencheek deleted the garyl/embed_job_scheduler_internal_actor branch August 10, 2023 00:10
@github-actions

Copy link
Copy Markdown
Contributor

The backport to 5.1 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.1 5.1
# Navigate to the new working tree
cd .worktrees/backport-5.1
# Create a new branch
git switch --create backport-55698-to-5.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ef63a4f9ecd8913a191bc5e5d562b80d46da527b
# Push it to GitHub
git push --set-upstream origin backport-55698-to-5.1
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.1

Then, create a pull request where the base branch is 5.1 and the compare/head branch is backport-55698-to-5.1.

@github-actions github-actions Bot added backports failed-backport-to-5.1 release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases labels Aug 10, 2023
camdencheek added a commit that referenced this pull request Aug 10, 2023
…actor (#55698) (#55701)

We received reports of repositories never getting scheduled for
embedding jobs despite the filter / pattern preview showing repos that
match.

In both cases linked private repos appear to be impacted. After fetching
repo ids from `GetEmbeddableRepos` we then make another fetch for
getting the repo names for those repo ids. This read operation is where
we appear to end up missing out on some repos because we are not an
internal actor and therefore we must have
permissions to the repos that we want `List`ed. When we aren't
authorized for any repos then we end up never scheduling those repos
since we never got the repo names.

(cherry picked from commit ef63a4f)
@BolajiOlajide BolajiOlajide removed the release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases label Sep 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants