feature(cody): Data types for encoding LLM Model config via Sourcegraph site config by chrsmith · Pull Request #63533 · sourcegraph/sourcegraph-public-snapshot

chrsmith · 2024-06-27T23:34:29Z

This PR is part of the epic space opera that is Server-side Cody Model Selection and friends, such as Cody with Self-hosted Models. It doesn't add any user-facing functionality, but rather introduces the foundational data types and conversion logic so we can integrate the new "LLM Model Configuration" code into the Sourcegraph instance.

Overview

The internal/modelconfig package defines the data types and validation logic for LLM Model Configuration. The goal is to provide an ultra-flexible way for a Sourcegraph backend to represent LLM configuration data, so that it can rely on it for registering its own completions API endpoints as well as send that data to Cody clients so that they can understand what LLM models the Sourcegraph backend supports.

To date, all of this LLM configuration data has been bundled into the types.ModelConfiguration data type. (internal/modelconfig/types) It contains all of the "Providers", "Models", etc. that the Sourcegraph instance understands.

However, this data type is not easy to manipulate. It is designed to make it easy to look up data when we need to process an LLM completion request, or to easily consumable by Cody clients. But there isn't a good way for a Sourcegraph admin to "modify" or "patch" that object.

This PR introduces a new data type that is conceptually the same -- a bunch of LLM model configuration -- but is meant to be expressed by a Sourcegraph administrator manually configuring how Cody should work on their instance.

Introducing `SiteModelConfiguration`.

Inside internal/modelconfig/types/documents.go we define a new type, SiteModelConfiguration. This is intended to be encoded directly into the site configuration (site.schema.json) and be reasonably easy for a non-expert to configure.

@slimsag and I spent a lot of time talking through the various scenarios that we want to enable, and various tradeoffs for how to represent the data. This is what we wound up with and is able to serve all known requirements. (As far as we know for now. I hope. 😅)

While it isn't acceptable as a valid form of documentation (which I'll produce once the end-to-end of this is all working, and I can show how things come together), I'd suggest you take a look at internal/modelconfig/validation_test.go to understand how to think about this new data type. (As well as all the doc comments on the SiteModelConfiguration type itself.)

Converting the existing Site Configuration

In order to provide some more context for how we expect to use these data types, I've included some related functionality in this PR as well.

This PR adds a new cmd/frontend/internal/modelconfig package. This is where we will put the Sourcegraph instance-specific logic for reading, updating, and serving the LLM model configuration data.

For now, it just contains utility types and one meaningful function:

// convertCompletionsConfig converts the supplied Completions configuration blob (the Cody Enterprise configuration data)
// into the newer types.SiteModelConfiguration structure.
//
// Assumes that the supplied completions object is valid, and contains all the required settings. e.g. the site admin
// can leave some things blank, but `conf/computed.go`'s `GetCompletionsConfig()` will fill the Endpoint and related
// fields with meaingful defaults.
func convertCompletionsConfig(completionsCfg *conftypes.CompletionsConfig) *types.SiteModelConfiguration {

In order to refactor our current completions APIs to use the new modelconfig data types, we first need to safely read existing Cody Enterprise configuration data and represent it in this new data structure.

The convertCompletionsConfig function loads the CompletionsConfig data in the site configuratg (i.e. all this stuff and converts it into a SiteModelConfiguration.

This part is really important to get right, since a mistake here would just be a regression for any Cody Enterprise customer that upgrades their Sourcegraph instance. (After I continue refactoring the completions API codepath.)

We'll be able to do more meaningful testing as we land the next few PRs, if you look at cmd/frontend/internal/modelconfig/siteconfig_completions_test.go you can get a sense for how the existing Cody Enterprise configuration data gets surfaced in the new data types.

Next Steps

After this PR lands, the next step is to continue building out the cmd/frontend/internal/modelconfig package. Specifically, to expose the types.ModelConfiguration data so that the Completions APIs (cmd/frontend/internal/completions) can actually use this data for making LLM requests, either via BYOK or to Cody Gateway.

This will involve some super-annoying to write code that "merges" the types.ModelConfiguration data that we'll fetch from Cody Gateway and/or is embedded in the Sourcegraph instance with the Sourcegraph admin supplied types. types.SiteModelConfiguration.

And... FINALLY!!!! at long last, once that is done, it's just a matter of updating the Completions handlers to use these new data types, and enjoy the strongly typed and easily extendable ServerSideProviderConfig configuration objects.

Test plan

Lots of new unit tests, but none of this is surfaced to end users. YET.

emidoots · 2024-07-01T20:45:51Z

+	// "foo/bar" => "foo::unknown::bar"
+	//
+	// And aws-bedrock provisioned throughput... is handled in its own way.
+	convertToModelRef := func(cfgModelID string) string {


I think that we also need a special case in this function for Azure.

For context:

When admins upgrade, if they have "provider": "azure-openai" then they will be warned at the top of the page when using azure-openai provider its mandatory to set both completions.azureChatModel and completions.azureCompletionModel for proper LLM Token usage - this was implemented in https://github.com/sourcegraph/sourcegraph/pull/63100/files#diff-c4de0420e936b4e62127fd642bf96984f80610be72f612012d66fc3014b3a502R78 - and they will not have completionsCfg.azureChatModel or completionsCfg.azureCompletionModel set yet.

cfgModelID here would refer to the Azure deployment name -- not the actual model.

The way I think you should handle this in this PR:

When convertToModelRef is being called for the azure-openai provider, use the values of completionsCfg.AzureChatModel instead of completionsCfg.ChatModel, same for the ocmpletions model.

For fastchat .... it seems we didn't consider this case. Introduce a new AzureFastChatModel site config field and respect that?

I think that we also need a special case in this function for Azure.

Yes, absolutely. In the latest set of changes stacked on top of this, I've made that change.

However, I do need to update it to honor azureChatModel/azureCompletionModel... what is the difference between that and chatModel/completionModel? For the Azure case the azureCompletionModel is the actual LLM model name, whereas the compoetionModel is the "Azure deployment name"?

Yes, with Azure, chatModel, fastChatModel, and completionModel are all the "Azure deployment name" - a unique identifier that comes from the customer's azure deployment - a grave mistake we made when introducing that provider.

To try to correct it, we introduced azureChatModel, azureCompletionModel which are the actual model names like gpt-35-turbo-instruct and not unique to the customer's deployment.

emidoots · 2024-07-01T21:14:38Z

+	// If the ModelRef is unknown (e.g. the Sourcegraph-supplied model was removed
+	// via the SourcegraphModelConfig's deny list.) In that case, all model settings
+	// need to be supplied, either explicitly or by the provider's DefaultModelConfig.
+	ModelOverrides []ModelOverride `json:"modelOverrides"`


Just FYI - this ModelOverride struct is currently missing the ability to use 'recommended' settings as we talked about for a specific model key, e.g.:

// Replaces any models, and may just add new ones if the mref is unique. "modelOverrides": [ { "mistral::v1::mixtral-8x22b-instruct": { "useRecommendSettings": true, }, ... } ]

I can work on this in a follow-up PR, but will be something I want for self-hosted models

Ah yes. I ended up just having reasonable-ish defaults for everything but CompletionWindow. Will call that out in the next update.

Just to be clear, the goal here is for me Stephen to be able to express this:

If a site admin is configuring mistral::v1::mixtral-8x22b-instruct specifically (no other mrefs, just that one in specific - hard-coded in our Go code somewhere), and is a self-hosted model customer, let the site admin say "use whatever configuration Stephen suggests for all fields here with this mref".

Gotcha. Then yes, that is something that isn't accounted for in these Go types. (And perhaps that would be something that "only exists" in the Site Config projection? And those "useRecommendedSettings" get applied when reading the site config and converting it into the modelconfig types?)

For reference, here's how this works in this particular PR:

We define a DefaultModelConfig for each ProviderOverride.

We do not provide a default for ContextWindow (which is definitely required), because we honor CompletionModelMaxTokens and related.

// Generic defaults. defaultModelConfig := types.DefaultModelConfig{ Capabilities: []types.ModelCapability{ types.ModelCapabilityAutocomplete, types.ModelCapabilityChat, }, Category: types.ModelCategoryBalanced, Status: types.ModelStatusStable, Tier: types.ModelTierEnterprise, // IMPORTANT: The default model config contains an invalid // context window (0, 0). The ModelOverrides MUST set the // expected values. }

emidoots

Excellent work

…th/modelconfig-site-config

emidoots · 2024-07-01T22:04:08Z

+	case defModelFastChatModel:
+		return "fast-chat"
+	default:
+		return "unknown"


do we actually need to express an unknown case in some real scenario? Otherwise I might suggest panic()

emidoots · 2024-07-01T22:10:38Z

+		switch legacyModelRef.provider {
+		case "aws-bedrock":
+			effectiveProviderID = "anthropic"
+		case "azure-ai":


Suggested change

case "azure-ai":

case "azure-openai":

Unless I'm mistaken this is referring to the legacy model provider from the site config

Yes. Good catch.

This stanza is actually kinda redundant, since we do this same "provider remapping" inside of parseLegacyModelRef(...). But given how subtle it is, I figured it was worth keeping in even if it is a no-op/duplicate. (Of course, it doesn't help if it uses the wrong enum value... 😭)

// AzureOpenAI case conftypes.CompletionsProviderNameAzureOpenAI: // AzureOpenAI (at least for v5.3) only supports openai-provided models. // However, the site configuration requires you to specify the "Azure Deployment ID" // in place of the model name. And to later diambiguate this, admins could optionally // supply more meaningful model names. (Although the "deployment ID" is what is still // required to be used when making the API call.) providerID = "openai"

emidoots · 2024-07-01T22:13:23Z

+// Assumes that the supplied completions object is valid, and contains all the required settings. e.g. the site admin
+// MAY leave some things blank, but `conf/computed.go`'s `GetCompletionsConfig()` will provide defaults for all the
+// values. So even if the site admin didn't provide the `Endpoint` field, we expect it to be present.
+func convertCompletionsConfig(completionsCfg *conftypes.CompletionsConfig) (*types.SiteModelConfiguration, error) {


minor comment: this function surface-level looks super complex, I feel like some portions of it could be broken out into more readable function names.

this function surface-level looks super complex

OMFG yes.

As I later added more tests, and accounted for more corner cases, I ended up rewriting most of it. While the function is still super long and surprisingly complicated, it is hopefully easier to follow.

emidoots

Great

chrsmith · 2024-07-01T23:11:28Z

@slimsag would you mind taking another look before I submit?

Merged in the latest changes from main
Updated the code to account for AzureChatModel and friends, added a unit test
Added a modelconfig.SanitizeResourceName(string) string function. Since it'll come in handy.

emidoots · 2024-07-01T23:23:42Z

+			{
+				// BUG: The "model name alias" configuration doesn't apply to FastChat, hence the awkward ModelID.
+				m := siteModelConfig.ModelOverrides[0]
+				assert.EqualValues(t, "openai::unknown::azure-deployment-id_-__54e1331b-4ccc-44b8-b9ef-59c6b050fdb8_", m.ModelRef)


Leaving this comment as a reminder for myself to add an azureFastChatModel and fix this. I'll send a PR for this

Edit: I should also update the site config alert wording and the site.schema.json to not say (optional) anymore. we should just require these fields.

as a reminder for myself to add an azureFastChatModel and fix this.
🙏 I'd offer to just fix this myself, if I didn't already have like several thousand lines of code that needs to be teased out into PRs already 😬 🙇 .

to not say (optional) anymore. we should just require these fields.
💯 🙏

emidoots · 2024-07-01T23:31:57Z

@chrsmith the only thing I see missing here - and maybe we can add it as a follow-on later, is the deployment names like "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" are going to be needed by the actual azure backend Go client. I don't see anywhere here that we store them in the Go data structures, but maybe I missed it?

emidoots

lgtm

chrsmith · 2024-07-01T23:36:49Z

@chrsmith the only thing I see missing here - and maybe we can add it as a follow-on later, is the deployment names like "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" are going to be needed by the actual azure backend Go client. I don't see anywhere here that we store them in the Go data structures, but maybe I missed it?

@slimsag great question.

This will make more sense when I send out the PRs that refactor how the actual completion providers work, but I believe we have all the information we need on the types.Model.

The Model.ID is the opaque ID just used to uniquely identify the model, e.g. "gpt-4o". The meaningful value that we pass to the API Provider for that particular model however is in the Model.ModelName field. So as long that value has the "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" string, then things should work just fine.

And this is the behavior we have today. It's just that, we treat Model.ID == Model.ModelName, instead of having these as two distinct concepts. (Among other changes introduced with the new types.)

emidoots · 2024-07-02T00:04:03Z

Sounds good, as long as the Go client has a way to get that exact "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" string as it was originally written, should be fine.

This PR exposes a new HTTP endpoint from the `frontend` that will return the LLM models currently supported by the Sourcegraph instance. (Or a 500 if Cody isn't enabled or not configured correctly.) This endpoint will be used by Cody clients in order to fetch LLM models dynamically, so that new LLM models can automatically be picked up as they are available from Cody Gateway. The underlying data structures will also make it easier for the Sourcegraph instance to configure arbitrary LLM providers, customize model settings, etc. ```zsh curl \ -H "Authorization: token ${SG_LOCALHOST_TOKEN}" \ https://sourcegraph.test:3443/.api/modelconfig/supported-models.json ``` ## Overview This PR is the latest (long) series of steps for enabling arbitrary LLM model configuration in the Sourcegraph backend. This PR introduces three new components into the `cmd/frontend/internal/modelconfig` package: ### `builder.go`, `builder_test.go` The `builder struct` is the logic for how we generate the LLM model configuration data based on the current site configuration, what is embedded in the Sourcegraph binary, and later, what we obtain from Cody Gateway. This is essentially how we apply things like the "model allow/deny list" or "provider/model overrides" that will later be available in the Sourcegraph site configuration. For now, some of this functionality is inert. As there is no way for you to actually specify the type of configuration data that the `builder` supports. (e.g. you cannot specify the "LLM model allow or deny list", or fetch models from Cody Gateway.) So realistically, with this PR the only thing it is doing is just converting a (`internal/modelconfig/types`) `SiteModelConfiguration` into a `ModelConfiguration` object. (Which was added in https://github.com/sourcegraph/sourcegraph/pull/63533.) ### `init.go`, `service.go` Next is the `Service interface`, which is a new global service that can be used anywhere within the `frontend` binary to fetch the current LLM configuration data. This will be used later to allow the Completions API endpoints to look up various provider and LLM configuration data. `init.go` supplies an initialization function to the Sourcegraph machinery, and then it goes about loading the LLM model configuration data and exposing it in a singleton object. Part of this is also registering a "site configuration watcher", so that we will automatically pick up on any changes to the site configuration that may impact the LLM model configuration. ### `httpapi.go` Finally, we expose a `HTTPHandlers struct` that contains a single `http.HandlerFunc` that will serve the HTTP response. My intention is to later add more endpoints, such as a way for Site admins to easily `curl` the full configuration information. Or surface diagnostic-type information about available models. ## Next Steps With this piece, we now expose a view of the Sourcegraph instance's currently supported LLM models. However, the existing completions API stilly reads the site configuration directly. (And does not use the newer, and more flexible `modelconfig` package.) So the next step to continue wiring this up is to refactor the `frontend/internal/completions` and `internal/completions` packages to rely on the "newer style" configuration data. (Which 🤞 is faithfully reproducing the current site configuration data.) The end result should be a no-op, but will allow us to no longer rely on hard-coded LLM model names anywhere in the codebase. And instead we can just rely on the model configuration data. ## Test plan Added new unit tests. Verified things worked manually. And looked through existing unit tests for other components to sanity check some of the assumptions made in newer code. ## Changelog Sourcegraph instances how expose an HTTP endpoint that authenticated users can call to get a list of LLM models supported by the Sourcegraph instance. In the future this will be used to allow Cody users to select the LLM model dynamically, based on what is currently available.

…ore models) (#63797) This PR if what the past dozen or so [cleanup](https://github.com/sourcegraph/sourcegraph/pull/63359), [refactoring](https://github.com/sourcegraph/sourcegraph/pull/63731), and [test](https://github.com/sourcegraph/sourcegraph/pull/63761) PRs were all about: using the new `modelconfig` system for the completion APIs. This will enable users to: - Use the new site config schema for specifying LLM configuration, added in https://github.com/sourcegraph/sourcegraph/pull/63654. Sourcegraph admins who use these new site config options will be able to support many more LLM models and providers than is possible using the older "completions" site config. - For Cody Enterprise users, we no longer ignore the `CodyCompletionRequest.Model` field. And now support users specifying any LLM model (provided it is "supported" by the Sourcegraph instance). Beyond those two things, everything should continue to work like before. With any existing "completions" configuration data being converted into the `modelconfig` system (see https://github.com/sourcegraph/sourcegraph/pull/63533). ## Overview In order to understand how this all fits together, I'd suggest reviewing this PR commit-by-commit. ### [Update internal/completions to use modelconfig](https://github.com/sourcegraph/sourcegraph/commit/e6b7eb171eea6bd6a512f0e61457170a86128eae) The first change was to update the code we use to serve LLM completions. (Various implementations of the `types.CompletionsProvider` interface.) The key changes here were as follows: 1. Update the `CompletionRequest` type to include the `ModelConfigInfo` field (to make the new Provider and Model-specific configuration data available.) 2. Rename the `CompletionRequest.Model` field to `CompletionRequest.RequestedModel`. (But with a JSON annotation to maintain compatibility with existing callers.) This is to catch any bugs related to using the field directly, since that is now almost guaranteed to be a mistake. (See below.) With these changes, all of the `CompletionProvider`s were updated to reflect these changes. - Any situation where we used the `CompletionRequest.Parameters.RequestedModel` should now refer to `CompletionRequest.ModelConfigInfo.Model.ModelName`. The "model name" being the thing that should be passed to the API provider, e.g. `gpt-3.5-turbo`. - In some situations (`azureopenai`) we needed to rely on the Model ID as a more human-friendly identifier. This isn't 100% accurate, but will match the behavior we have today. A long doc comment calls out the details of what is wrong with that. - In other situations (`awsbedrock`, `azureopenai`) we read the new `modelconfig` data to configure the API provider (e.g. `Azure.UseDeprecatedAPI`), or surface model-specific metadata (e.g. AWS Provisioned Throughput ARNs). While the code is a little clunky to avoid larger refactoring, this is the heart and soul of how we will be writing new completion providers in the future. That is, taking specific configuration bags with whatever data that is required. ### [Fix bugs in modelconfig](https://github.com/sourcegraph/sourcegraph/commit/75a51d8cb520e35918bd3a67a090a36d456b1797) While we had lots of tests for converting the existing "completions" site config data into the `modelconfig.ModelConfiguration` structure, there were a couple of subtle bugs that I found while testing the larger change. The updated unit tests and comments should make that clear. ### [Update frontend/internal/httpapi/completions to use modelconfig](https://github.com/sourcegraph/sourcegraph/commit/084793e08fca51a5ab84a7d73421d575caeebaa1) The final step was to update the HTTP endpoints that serve the completion requests. There weren't any logic changes here, just refactoring how we lookup the required data. (e.g. converting the user's requested model into an actual model found in the site configuration.) We support Cody clients sending either "legacy mrefs" of the form `provider/model` like before, or the newer mref `provider::apiversion::model`. Although it will likely be a while before Cody clients are updated to only use the newer-style model references. The existing unit tests for the competitions APIs just worked, which was the plan. But for the few changes that were required I've added comments to explain the situation. ### [Fix: Support requesting models just by their ID](https://github.com/sourcegraph/sourcegraph/pull/63797/commits/99715feba614230aa84cf94aae571adb96768035) > ... We support Cody clients sending either "legacy mrefs" of the form `provider/model` like before ... Yeah, so apparently I lied 😅 . After doing more testing, the extension _also_ sends requests where the requested model is just `"model"`. (Without the provider prefix.) So that now works too. And we just blindly match "gtp-3.5-turbo" to the first mref with the matching model ID, such as "anthropic::unknown::gtp-3.5-turbo". ## Test plan Existing unit tests pass, added a few tests. And manually tested my Sg instance configured to act as both "dotcom" mode and a prototypical Cody Enterprise instance. ## Changelog Update the Cody APIs for chat or code completions to use the "new style" model configuration. This allows for great flexibility in configuring LLM providers and exposing new models, but also allows Cody Enterprise users to select different models for chats. This will warrant a longer, more detailed changelog entry for the patch release next week. As this unlocks many other exciting features.

chrsmith added 3 commits June 27, 2024 15:57

Trivial: Call modelconfig.ValidateModelConfig in cody-gateway-config

6dcf974

modelconfig SDK update: Add site-config types

981cd4d

Add frontend/modelconfig for converting site config

b8aeba5

chrsmith requested review from RXminuS and emidoots June 27, 2024 23:34

cla-bot Bot added the cla-signed label Jun 27, 2024

emidoots reviewed Jul 1, 2024

View reviewed changes

Comment thread cmd/frontend/internal/modelconfig/siteconfig_completions.go Outdated

emidoots reviewed Jul 1, 2024

View reviewed changes

Comment thread internal/modelconfig/types/documents.go Outdated

emidoots reviewed Jul 1, 2024

View reviewed changes

emidoots approved these changes Jul 1, 2024

View reviewed changes

chrsmith added 2 commits July 1, 2024 14:36

Incorporate latest changes

cd94cda

Merge branch 'main' of github.com:sourcegraph/sourcegraph into chrsmi…

a3ea1ad

…th/modelconfig-site-config

emidoots reviewed Jul 1, 2024

View reviewed changes

Comment thread cmd/frontend/internal/modelconfig/siteconfig_completions.go Outdated

emidoots reviewed Jul 1, 2024

View reviewed changes

emidoots approved these changes Jul 1, 2024

View reviewed changes

chrsmith added 2 commits July 1, 2024 15:59

Support new Azure configuration setting

651cd16

Remove tabs from comments

b357ba4

chrsmith added 2 commits July 1, 2024 16:15

Fix typo

f29184b

Address PR feedback

6949986

emidoots reviewed Jul 1, 2024

View reviewed changes

emidoots approved these changes Jul 1, 2024

View reviewed changes

chrsmith merged commit aa87316 into main Jul 1, 2024

chrsmith deleted the chrsmith/modelconfig-site-config branch July 1, 2024 23:33

chrsmith mentioned this pull request Jul 2, 2024

feat(cody): Expose Sg modelconfig data via HTTP REST API #63604

Merged

chrsmith mentioned this pull request Jul 12, 2024

feat/cody: Refactor completions API to use new modelconfig (support more models) #63797

Merged

Conversation

chrsmith commented Jun 27, 2024

Overview

Introducing SiteModelConfiguration.

Converting the existing Site Configuration

Next Steps

Test plan

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

emidoots left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

emidoots left a comment

Choose a reason for hiding this comment

Uh oh!

chrsmith commented Jul 1, 2024

Uh oh!

emidoots Jul 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

emidoots commented Jul 1, 2024

Uh oh!

emidoots left a comment

Choose a reason for hiding this comment

Uh oh!

chrsmith commented Jul 1, 2024

Uh oh!

emidoots commented Jul 2, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Introducing `SiteModelConfiguration`.

emidoots Jul 1, 2024 •

edited

Loading