feat(cody): Expose Sg modelconfig data via HTTP REST API#63604
Conversation
| // SECURITY: It's critical that we do not serve any server-side configuration data | ||
| // to the client, as it contains sensitive data like access tokens. | ||
| modelconfigSDK.RedactServerSideConfig(currentConfig) | ||
|
|
||
| rawJSON, err := json.MarshalIndent(currentConfig, "", " ") | ||
| if err != nil { | ||
| h.logger.Error("marshalling configuration", log.Error(err)) | ||
| http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError) | ||
| return | ||
| } | ||
| http.Error(w, string(rawJSON), http.StatusOK) |
Check notice
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.generic.comment-tagging-rule
| var providerToOverride *types.Provider | ||
| for i := range mergedConfig.Providers { | ||
| if mergedConfig.Providers[i].ID == providerOverride.ID { | ||
| providerToOverride = &mergedConfig.Providers[i] |
There was a problem hiding this comment.
Just to disambiguate, can we use a different variable for i in this inner for loop?
There was a problem hiding this comment.
🫨 OMFG yes, sorry. I could have sworn this was using j when I first wrote it. I'm kinda shocked that you can alias i like this and have it not break things.
I'll just rename this inner loop to mergedProvIdx since even j is too similar to be honest.
| mergedConfig.DefaultModels.Chat = *validModel | ||
| } | ||
| if mergedConfig.DefaultModels.FastChat == "" { | ||
| validModel := getModelMatchingCategory(types.ModelCategorySpeed, types.ModelCategoryBalanced) |
There was a problem hiding this comment.
Can you remind me: did we end up getting a way to distinguish 'models that can ONLY be used for code completion and literally cannot produce chat results'? Is that handled appropriately here?
There was a problem hiding this comment.
Good catch! I'll fix this.
We do. The Model type has a Capabilities []ModelCapability json:"capabilities"`` field, with the values of "chat" or "autocomplete".
So it totally makes sense to add that sort of check here, since it's something we would want to rely on in the future. (e.g. a site admin specifying a ModelOverride that gpt-4o can only be used for chat, and therefore requests to use it for autocompletions will fail.)
There was a problem hiding this comment.
Updated to:
if mergedConfig.DefaultModels.Chat == "" {
validModel := getModelWithRequirements(types.ModelCapabilityAutocomplete, accuracy, balanced)
if validModel == nil {
return nil, errors.New("no suitable model found for Chat")
}
mergedConfig.DefaultModels.Chat = *validModel
}| // The base model configuration data is what is embedded in this binary. | ||
| staticModelConfig, err := embedded.GetCodyGatewayModelConfig() | ||
| if err != nil { | ||
| return errors.New("embedded model data is missing or corrupt") |
There was a problem hiding this comment.
Are we embedding this correctly today? i.e. if we merge this will it prevent servers from starting or do we already embed model data
There was a problem hiding this comment.
We are embedding the data today, by virtue of simply importing the Go package and building it. So I do not believe it is possible -- outside of maybe build flag shenanigans? Or using CGO and messing with the linker? -- to reference the embedded package and somehow have it built without the go:embed.
So while I'm not confident enough to have this function just "return or panic", I am unaware of any way to actually trigger this error condition.
This PR exposes a new HTTP endpoint from the
frontendthat will return the LLM models currently supported by the Sourcegraph instance. (Or a 500 if Cody isn't enabled or not configured correctly.)This endpoint will be used by Cody clients in order to fetch LLM models dynamically, so that new LLM models can automatically be picked up as they are available from Cody Gateway. The underlying data structures will also make it easier for the Sourcegraph instance to configure arbitrary LLM providers, customize model settings, etc.
curl \ -H "Authorization: token ${SG_LOCALHOST_TOKEN}" \ https://sourcegraph.test:3443/.api/modelconfig/supported-models.jsonOverview
This PR is the latest (long) series of steps for enabling arbitrary LLM model configuration in the Sourcegraph backend. This PR introduces three new components into the
cmd/frontend/internal/modelconfigpackage:builder.go,builder_test.goThe
builder structis the logic for how we generate the LLM model configuration data based on the current site configuration, what is embedded in the Sourcegraph binary, and later, what we obtain from Cody Gateway.This is essentially how we apply things like the "model allow/deny list" or "provider/model overrides" that will later be available in the Sourcegraph site configuration.
For now, some of this functionality is inert. As there is no way for you to actually specify the type of configuration data that the
buildersupports. (e.g. you cannot specify the "LLM model allow or deny list", or fetch models from Cody Gateway.)So realistically, with this PR the only thing it is doing is just converting a (
internal/modelconfig/types)SiteModelConfigurationinto aModelConfigurationobject. (Which was added in https://github.com/sourcegraph/sourcegraph/pull/63533.)init.go,service.goNext is the
Service interface, which is a new global service that can be used anywhere within thefrontendbinary to fetch the current LLM configuration data.This will be used later to allow the Completions API endpoints to look up various provider and LLM configuration data.
init.gosupplies an initialization function to the Sourcegraph machinery, and then it goes about loading the LLM model configuration data and exposing it in a singleton object. Part of this is also registering a "site configuration watcher", so that we will automatically pick up on any changes to the site configuration that may impact the LLM model configuration.httpapi.goFinally, we expose a
HTTPHandlers structthat contains a singlehttp.HandlerFuncthat will serve the HTTP response.My intention is to later add more endpoints, such as a way for Site admins to easily
curlthe full configuration information. Or surface diagnostic-type information about available models.Next Steps
With this piece, we now expose a view of the Sourcegraph instance's currently supported LLM models. However, the existing completions API stilly reads the site configuration directly. (And does not use the newer, and more flexible
modelconfigpackage.)So the next step to continue wiring this up is to refactor the
frontend/internal/completionsandinternal/completionspackages to rely on the "newer style" configuration data. (Which 🤞 is faithfully reproducing the current site configuration data.)The end result should be a no-op, but will allow us to no longer rely on hard-coded LLM model names anywhere in the codebase. And instead we can just rely on the model configuration data.
Test plan
Added new unit tests. Verified things worked manually. And looked through existing unit tests for other components to sanity check some of the assumptions made in newer code.
Changelog
Sourcegraph instances how expose an HTTP endpoint that authenticated users can call to get a list of LLM models supported by the Sourcegraph instance. In the future this will be used to allow Cody users to select the LLM model dynamically, based on what is currently available.