feat/cody: Refactor completions API to use new modelconfig (support more models)#63797
Conversation
| } | ||
| }) | ||
|
|
||
| t.Run("NoCompletionsConfig", func(t *testing.T) { |
There was a problem hiding this comment.
These tests were removed because we no longer return this error.
The code path was for when Cody was enabled in the site config, but no "completions config" was available. But we now skip looking for the "completions config" directly, and fetch the LLM model configuration data via modelconfig.Get().Get(). (i.e. the model config service, watching for site config updates, etc.)
| assert.Contains(t, respBody, `unable to find model "anthropic::api::model"`) | ||
| }) | ||
| assert.Equal(t, http.StatusInternalServerError, status) | ||
| assert.Equal(t, "no client known for upstream provider acmeco", respBody) |
There was a problem hiding this comment.
Previously we try to send the unknown model request to Cody Gateway. (Or at least fail when we try to create the Cody Gateway-specific completions provider.)
Now we reject the request much sooner, when we try to resolve the model that the user is requesting. (i.e. "is it found in the site config or not".)
If I get to it, I'll add some more unit tests specifically for the "Cody Enterprise configured to use Cody Gateway" and confirm that that particular path still works as expected. (Similarly, we don't have much automated test coverage for the "Sourcegraph.com x Cody Pro" completion path.)
There was a problem hiding this comment.
... thinking about this a little more, this probably isn't a big deal. Since if you want to use Sourcegraph/Cody Gateway for some LLM models, it seems kinda reasonable that we require the provider IDs are exactly "anthropic", "fireworks", etc.
e.g. if a customer wired up a provider named "foo" and somehow that translated to Cody Gateway x OpenAI... that would be confusing, and so that not working might be considered a feature.
…t 'anthropic/gpt-4o')
emidoots
left a comment
There was a problem hiding this comment.
Reviewed everything as carefully as I could, looks good to me.
…ls/mixtral-8x22b-instruct'
This PR if what the past dozen or so cleanup, refactoring, and test PRs were all about: using the new
modelconfigsystem for the completion APIs.This will enable users to:
CodyCompletionRequest.Modelfield. And now support users specifying any LLM model (provided it is "supported" by the Sourcegraph instance).Beyond those two things, everything should continue to work like before. With any existing "completions" configuration data being converted into the
modelconfigsystem (see https://github.com/sourcegraph/sourcegraph/pull/63533).Overview
In order to understand how this all fits together, I'd suggest reviewing this PR commit-by-commit.
Update internal/completions to use modelconfig
The first change was to update the code we use to serve LLM completions. (Various implementations of the
types.CompletionsProviderinterface.)The key changes here were as follows:
CompletionRequesttype to include theModelConfigInfofield (to make the new Provider and Model-specific configuration data available.)CompletionRequest.Modelfield toCompletionRequest.RequestedModel. (But with a JSON annotation to maintain compatibility with existing callers.) This is to catch any bugs related to using the field directly, since that is now almost guaranteed to be a mistake. (See below.)With these changes, all of the
CompletionProviders were updated to reflect these changes.CompletionRequest.Parameters.RequestedModelshould now refer toCompletionRequest.ModelConfigInfo.Model.ModelName. The "model name" being the thing that should be passed to the API provider, e.g.gpt-3.5-turbo.azureopenai) we needed to rely on the Model ID as a more human-friendly identifier. This isn't 100% accurate, but will match the behavior we have today. A long doc comment calls out the details of what is wrong with that.awsbedrock,azureopenai) we read the newmodelconfigdata to configure the API provider (e.g.Azure.UseDeprecatedAPI), or surface model-specific metadata (e.g. AWS Provisioned Throughput ARNs). While the code is a little clunky to avoid larger refactoring, this is the heart and soul of how we will be writing new completion providers in the future. That is, taking specific configuration bags with whatever data that is required.Fix bugs in modelconfig
While we had lots of tests for converting the existing "completions" site config data into the
modelconfig.ModelConfigurationstructure, there were a couple of subtle bugs that I found while testing the larger change.The updated unit tests and comments should make that clear.
Update frontend/internal/httpapi/completions to use modelconfig
The final step was to update the HTTP endpoints that serve the completion requests. There weren't any logic changes here, just refactoring how we lookup the required data. (e.g. converting the user's requested model into an actual model found in the site configuration.)
We support Cody clients sending either "legacy mrefs" of the form
provider/modellike before, or the newer mrefprovider::apiversion::model. Although it will likely be a while before Cody clients are updated to only use the newer-style model references.The existing unit tests for the competitions APIs just worked, which was the plan. But for the few changes that were required I've added comments to explain the situation.
Fix: Support requesting models just by their ID
Yeah, so apparently I lied 😅 . After doing more testing, the extension also sends requests where the requested model is just
"model". (Without the provider prefix.)So that now works too. And we just blindly match "gtp-3.5-turbo" to the first mref with the matching model ID, such as "anthropic::unknown::gtp-3.5-turbo".
Test plan
Existing unit tests pass, added a few tests. And manually tested my Sg instance configured to act as both "dotcom" mode and a prototypical Cody Enterprise instance.
Changelog
Update the Cody APIs for chat or code completions to use the "new style" model configuration. This allows for great flexibility in configuring LLM providers and exposing new models, but also allows Cody Enterprise users to select different models for chats.
This will warrant a longer, more detailed changelog entry for the patch release next week. As this unlocks many other exciting features.