Fix azure completions api #63491
Conversation
@arafatkatze Instead of this "try both APIs and cache which one works" approach in code (which is kind of scary), can you go with one of these options instead:
Either of these would work. I think I prefer (2), but I am fine with either option.
Okay Stephen, I'll go with option 2. Great recommendation.
Force-pushed from 24810b7 to 4af5394.
@slimsag Can you take a look again, please?
}

// Streaming with ChatCompletions API
func tryStreamChatCompletionsAPI(
If you can rename these tryFooBar functions to doFooBar that'd be nice.
emidoots
left a comment
Seems reasonable now, thanks for the update!
Force-pushed from 96b040e to 0835f3f.
Force-pushed from a40f2b4 to 39f7ee6.
Pull Request is not mergeable
The backport to 5.3.x failed. To backport this PR manually, you can either:
Via the sg tool
Run sg backport -r 5.3.x -p 63491
Via your terminal
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.3.x 5.3.x
# Navigate to the new working tree
cd .worktrees/backport-5.3.x
# Create a new branch
git switch --create backport-63491-to-5.3.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f5d5deceb0cfe6a6094f5619786a7ad2f890b2c2
# Push it to GitHub
git push --set-upstream origin backport-63491-to-5.3.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.3.x
If you encounter a conflict, first resolve it and stage all files, then run the commands below:
git cherry-pick --continue
# Push it to GitHub
git push --set-upstream origin backport-63491-to-5.3.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.3.x
Once the pull request has been created, please ensure the following:
@arafatkatze I messed around to see if the backport bot would work here, but it looks like it still has some kinks. Also, these v5.3.x branches are a new thing, so maybe the HEAD's not right or something like that.
[Linear Issue](https://linear.app/sourcegraph/issue/CODY-2586/fix-completions-models-api-for-azure-to-use-the-right-model-with-the)
The purpose of this PR is to provide a backwards-compatible solution so that the Azure completions logic in our codebase supports both the older Completions API and the newer Chat Completions (chat/completions) API. This way we can use models from either API with autocomplete.
NOTE: Azure exposes the deployment name instead of the model name, so we can't tell which model is in use and therefore can't decide up front which API to call for it. Instead, we try both APIs, cache the one that works for that model, and use the cached choice for subsequent completion calls. This way we can use either API without adding latency to later completions.
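The try-both-then-cache idea described in the note above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `apiKind`, `callFunc`, `apiCache`, and `errNoAPIWorked` are hypothetical names, the real HTTP call is replaced by a function parameter, and the order in which the two APIs are tried is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// apiKind names the two Azure OpenAI APIs discussed in the description.
type apiKind string

const (
	completionsAPI     apiKind = "completions"
	chatCompletionsAPI apiKind = "chat/completions"
)

// errNoAPIWorked is a hypothetical sentinel returned when both attempts fail.
var errNoAPIWorked = errors.New("no API accepted the request")

// callFunc is a hypothetical stand-in for the real HTTP request.
type callFunc func(api apiKind, deployment, prompt string) (string, error)

// apiCache remembers which API last succeeded for each deployment name.
type apiCache struct {
	mu    sync.RWMutex
	known map[string]apiKind
}

func newAPICache() *apiCache {
	return &apiCache{known: make(map[string]apiKind)}
}

// complete uses the cached API for a deployment when one is known.
// Otherwise it tries both APIs in turn and caches the first that succeeds.
func (c *apiCache) complete(call callFunc, deployment, prompt string) (string, error) {
	c.mu.RLock()
	api, ok := c.known[deployment]
	c.mu.RUnlock()
	if ok {
		return call(api, deployment, prompt)
	}

	var lastErr error
	// Assumption: the newer API is tried first.
	for _, candidate := range []apiKind{chatCompletionsAPI, completionsAPI} {
		out, err := call(candidate, deployment, prompt)
		if err == nil {
			c.mu.Lock()
			c.known[deployment] = candidate
			c.mu.Unlock()
			return out, nil
		}
		lastErr = err
	}
	return "", fmt.Errorf("%w: %v", errNoAPIWorked, lastErr)
}
```

Only the first request for a deployment pays for the extra attempt. The sketch never invalidates a cached choice, so if a deployment is later repointed at a different model, the stale entry keeps being used and the failure surfaces as an ordinary error.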
Test plan
I used the Azure keys to try out the different deployment models we have with both the old and the new API.
Old API -> Completions (gpt-3.5-turbo-instruct, gpt-3.5-turbo(301), gpt-3.5-turbo(613))
New API -> Chat Completions (gpt-3.5-turbo(301), gpt-4o, gpt-3.5-turbo(613), gpt-3.5-turbo-16k)
NOTE: Both sets of models work seamlessly with this PR.
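For context on why the two groups of models above need different handling, here is a rough sketch of how the two request shapes differ on Azure OpenAI: the Completions API takes a single `prompt`, while Chat Completions takes a list of `messages`, and each lives under its own path below the deployment. `buildRequest`, the `useChat` flag, the `max_tokens` value, and the example endpoint and API version are assumptions made for illustration and are not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// completionsBody is the request shape for the older Completions API.
type completionsBody struct {
	Prompt    string `json:"prompt"`
	MaxTokens int    `json:"max_tokens"`
}

// chatMessage is one entry in a Chat Completions conversation.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// chatBody is the request shape for the newer Chat Completions API.
type chatBody struct {
	Messages  []chatMessage `json:"messages"`
	MaxTokens int           `json:"max_tokens"`
}

// buildRequest returns the URL and JSON body for one deployment.
// useChat is a hypothetical switch that selects which API to target.
func buildRequest(endpoint, deployment, apiVersion, prompt string, useChat bool) (string, []byte, error) {
	path := "completions"
	var payload interface{} = completionsBody{Prompt: prompt, MaxTokens: 64}
	if useChat {
		path = "chat/completions"
		payload = chatBody{
			Messages:  []chatMessage{{Role: "user", Content: prompt}},
			MaxTokens: 64,
		}
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return "", nil, err
	}
	u := fmt.Sprintf("%s/openai/deployments/%s/%s?api-version=%s",
		strings.TrimRight(endpoint, "/"),
		url.PathEscape(deployment),
		path,
		url.QueryEscape(apiVersion))
	return u, body, nil
}
```

Calling `buildRequest("https://example.openai.azure.com", "my-deployment", "2024-02-01", "x", true)` targets the `chat/completions` path, and passing `false` targets `completions`. The URL carries only the deployment name, which is why the model behind it cannot be read off the request.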
Changelog