Embeddings: fail job immediately if rate limited exceeded #58869
// To avoid failing large jobs on a flaky API, just mark all files
// as failed and continue. This means we may have some missing
// files, but they will be logged as such below and some embeddings
// are better than no embeddings.
Do we know which errors we would get back in this case? It feels more robust to allowlist the errors that trigger this behaviour rather than assuming they all come from the flaky API. Either way, this is an improvement, so I'd rather land this PR than address this comment directly. It's also fine to say we don't know what the errors are :)
Good point! Looking at the PR where it was added (https://github.com/sourcegraph/sourcegraph/pull/57224), I don't think we have a precise handle on the causes of flakiness. So I'll leave this as-is for now.
To backport manually to 5.2, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.2 5.2
# Navigate to the new working tree
cd .worktrees/backport-5.2
# Create a new branch
git switch --create backport-58869-to-5.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6ea73218c13abcaf9938e912d6b4809d02e36ace
# Push it to GitHub
git push --set-upstream origin backport-58869-to-5.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.2
If you encounter a conflict, first resolve it and stage all files, then run the commands below:
git cherry-pick --continue
# Push it to GitHub
git push --set-upstream origin backport-58869-to-5.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.2
…ded (#58939) Embeddings: fail job immediately if rate limited exceeded (#58869)
Usually, during an embeddings job, we allow up to 10% of embedding requests to
fail and simply skip the failed chunks. If a customer has hit their rate limit,
this means we may keep sending a large number of embedding requests that we
know will fail immediately. With this change, we fail the job immediately if
the rate limit is exceeded.
This change also increases the time between attempts to run a job to 15
minutes. This won't make a big difference to user experience, since by default
embeddings jobs can't be scheduled within 24h of the last run, but
it helps prevent jobs from being continuously scheduled and then failing.
This change is unlikely to have a user-facing impact; it mainly cuts down
on noise in logs and excessive requests to Cody Gateway.
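For readers skimming the description, here is a minimal sketch of the behaviour it describes, not the code from this PR. All names (errRateLimitExceeded, errFlaky, embedFunc, embedChunks, isRateLimit) are hypothetical, and checking the 10% budget after every failed chunk is an assumption about where that check happens.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors; the real client's error types are not shown in this PR page.
var (
	errRateLimitExceeded = errors.New("rate limit exceeded")
	errFlaky             = errors.New("transient failure")
)

// maxFailureRatio mirrors the 10% tolerance mentioned in the description.
const maxFailureRatio = 0.10

// embedFunc is a hypothetical stand-in for one embedding request.
type embedFunc func(chunk string) error

// isRateLimit reports whether err is, or wraps, errRateLimitExceeded.
func isRateLimit(err error) bool {
	return errors.Is(err, errRateLimitExceeded)
}

// embedChunks embeds every chunk and returns how many chunks were skipped.
// A rate-limit error aborts the job at once; any other error is tolerated
// until more than maxFailureRatio of the chunks have failed.
func embedChunks(chunks []string, embed embedFunc) (int, error) {
	skipped := 0
	for i, chunk := range chunks {
		err := embed(chunk)
		if err == nil {
			continue
		}
		if isRateLimit(err) {
			// Further requests are expected to fail too, so stop here.
			return skipped, fmt.Errorf("aborting at chunk %d: %w", i, err)
		}
		skipped++
		if float64(skipped) > maxFailureRatio*float64(len(chunks)) {
			return skipped, fmt.Errorf("too many failed chunks (%d of %d): %w", skipped, len(chunks), err)
		}
	}
	return skipped, nil
}

func main() {
	chunks := make([]string, 20)

	// One transient failure stays within the 10% budget.
	n := 0
	skipped, err := embedChunks(chunks, func(string) error {
		n++
		if n == 5 {
			return errFlaky
		}
		return nil
	})
	fmt.Println("flaky run:", skipped, err)

	// A rate-limit error ends the job on the first occurrence.
	skipped, err = embedChunks(chunks, func(string) error {
		return errRateLimitExceeded
	})
	fmt.Println("rate-limited run:", skipped, err)
}
```

Matching on a specific error rather than treating every failure alike is in the spirit of the allowlist suggestion in the review comment, though only the rate-limit case is singled out here.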
Test plan
Added new unit test