This repository was archived by the owner on Sep 30, 2024. It is now read-only.
Embeddings: avoid constantly rerunning job if it failed #58980
Merged
Thanks for the review. I'll backport this as part of a round of embeddings fixes.
jtibshirani added a commit that referenced this pull request on Dec 18, 2023
The embeddings policy framework attempts to rerun a repo job even if a previous run failed at the exact same revision. This means that when a job failed, for example because of rate limits or a problematic file, it would immediately be rescheduled and fail again. This can be expensive and noisy. Now, the policy framework does **not** rerun failed jobs unless the revision changes. An admin can always kick off a job manually if they want to rerun a job at the revision. This reduces noise and feels like a better trade-off.
camdencheek pushed a commit that referenced this pull request on Jan 9, 2024
#59090)

* Embeddings: refactor job scheduling code (#58787)

  Refactors the repository scheduling logic into two separate methods, one used by the GraphQL API, and the other by the policy framework. This makes the code easier to read and lets us make some improvements:
  * For policy scheduling, stop translating back-and-forth between repo IDs and names
  * Make sure to fail entire GraphQL request if there is an error fetching repos, instead of silently ignoring it

  ## Test plan

  Added new unit test

* Embeddings: avoid constantly rerunning job if it failed (#58980)

  The embeddings policy framework attempts to rerun a repo job even if a previous run failed at the exact same revision. This means that when a job failed, for example because of rate limits or a problematic file, it would immediately be rescheduled and fail again. This can be expensive and noisy.

  Now, the policy framework does **not** rerun failed jobs unless the revision changes. An admin can always kick off a job manually if they want to rerun a job at the revision. This reduces noise and feels like a better trade-off.

* Fix compile error
Before this change, the embeddings policy framework attempted to rerun a repo
job even if a previous run had failed at the exact same revision. When a job
failed, for example because of rate limits or a problematic file, it was
immediately rescheduled and failed again. This can be expensive and noisy.
Now, the policy framework does not rerun failed jobs unless the revision
changes. An admin can always kick off a job manually if they want to rerun a
job at the same revision. This reduces noise and feels like a better trade-off.
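
As a rough illustration of the rule described above, the scheduling decision could look something like the sketch below. It is not the code from this PR: `JobState`, `LastJob`, and `ShouldSchedule` are hypothetical names, and the sketch assumes the framework only needs the revision and state of the most recent job for a repo.

```go
package main

// JobState is a hypothetical status for the most recent embeddings job of a repo.
type JobState string

const (
	StateCompleted JobState = "completed"
	StateFailed    JobState = "failed"
)

// LastJob is a hypothetical record of the most recent embeddings job for a repo.
type LastJob struct {
	Revision string
	State    JobState
}

// ShouldSchedule reports whether the policy framework should enqueue a new
// job for a repo whose latest revision is latestRevision.
func ShouldSchedule(last *LastJob, latestRevision string) bool {
	if last == nil {
		// No previous job: the repo has never been processed.
		return true
	}
	if last.Revision != latestRevision {
		// The revision changed, so a new job is warranted even if the
		// previous one failed.
		return true
	}
	// Same revision. Under the old behavior a failed job would have been
	// retried here; under the new rule nothing is rescheduled, whether the
	// last job completed or failed. A manual run by an admin would bypass
	// this check entirely.
	return false
}
```

With this shape, a job that failed at revision `abc` is not picked up again by the policy until the repo moves to a different revision.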
Test plan
Modified unit tests