Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

feature(cody): Data types for encoding LLM Model config via Sourcegraph site config#63533

Merged
chrsmith merged 9 commits into
mainfrom
chrsmith/modelconfig-site-config
Jul 1, 2024
Merged

feature(cody): Data types for encoding LLM Model config via Sourcegraph site config#63533
chrsmith merged 9 commits into
mainfrom
chrsmith/modelconfig-site-config

Conversation

@chrsmith

Copy link
Copy Markdown
Contributor

This PR is part of the epic space opera that is Server-side Cody Model Selection and friends, such as Cody with Self-hosted Models. It doesn't add any user-facing functionality, but rather introduces the foundational data types and conversion logic so we can integrate the new "LLM Model Configuration" code into the Sourcegraph instance.

Overview

The internal/modelconfig package defines the data types and validation logic for LLM Model Configuration. The goal is to provide an ultra-flexible way for a Sourcegraph backend to represent LLM configuration data, so that it can rely on it for registering its own completions API endpoints as well as send that data to Cody clients so that they can understand what LLM models the Sourcegraph backend supports.

To date, all of this LLM configuration data has been bundled into the types.ModelConfiguration data type. (internal/modelconfig/types) It contains all of the "Providers", "Models", etc. that the Sourcegraph instance understands.

However, this data type is not easy to manipulate. It is designed to make it easy to look up data when we need to process an LLM completion request, or to easily consumable by Cody clients. But there isn't a good way for a Sourcegraph admin to "modify" or "patch" that object.

This PR introduces a new data type that is conceptually the same -- a bunch of LLM model configuration -- but is meant to be expressed by a Sourcegraph administrator manually configuring how Cody should work on their instance.

Introducing SiteModelConfiguration.

Inside internal/modelconfig/types/documents.go we define a new type, SiteModelConfiguration. This is intended to be encoded directly into the site configuration (site.schema.json) and be reasonably easy for a non-expert to configure.

@slimsag and I spent a lot of time talking through the various scenarios that we want to enable, and various tradeoffs for how to represent the data. This is what we wound up with and is able to serve all known requirements. (As far as we know for now. I hope. 😅)

While it isn't acceptable as a valid form of documentation (which I'll produce once the end-to-end of this is all working, and I can show how things come together), I'd suggest you take a look at internal/modelconfig/validation_test.go to understand how to think about this new data type. (As well as all the doc comments on the SiteModelConfiguration type itself.)

Converting the existing Site Configuration

In order to provide some more context for how we expect to use these data types, I've included some related functionality in this PR as well.

This PR adds a new cmd/frontend/internal/modelconfig package. This is where we will put the Sourcegraph instance-specific logic for reading, updating, and serving the LLM model configuration data.

For now, it just contains utility types and one meaningful function:

// convertCompletionsConfig converts the supplied Completions configuration blob (the Cody Enterprise configuration data)
// into the newer types.SiteModelConfiguration structure.
//
// Assumes that the supplied completions object is valid, and contains all the required settings. e.g. the site admin
// can leave some things blank, but `conf/computed.go`'s `GetCompletionsConfig()` will fill the Endpoint and related
// fields with meaingful defaults.
func convertCompletionsConfig(completionsCfg *conftypes.CompletionsConfig) *types.SiteModelConfiguration {

In order to refactor our current completions APIs to use the new modelconfig data types, we first need to safely read existing Cody Enterprise configuration data and represent it in this new data structure.

The convertCompletionsConfig function loads the CompletionsConfig data in the site configuratg (i.e. all this stuff and converts it into a SiteModelConfiguration.

This part is really important to get right, since a mistake here would just be a regression for any Cody Enterprise customer that upgrades their Sourcegraph instance. (After I continue refactoring the completions API codepath.)

We'll be able to do more meaningful testing as we land the next few PRs, if you look at cmd/frontend/internal/modelconfig/siteconfig_completions_test.go you can get a sense for how the existing Cody Enterprise configuration data gets surfaced in the new data types.

Next Steps

After this PR lands, the next step is to continue building out the cmd/frontend/internal/modelconfig package. Specifically, to expose the types.ModelConfiguration data so that the Completions APIs (cmd/frontend/internal/completions) can actually use this data for making LLM requests, either via BYOK or to Cody Gateway.

This will involve some super-annoying to write code that "merges" the types.ModelConfiguration data that we'll fetch from Cody Gateway and/or is embedded in the Sourcegraph instance with the Sourcegraph admin supplied types. types.SiteModelConfiguration.

And... FINALLY!!!! at long last, once that is done, it's just a matter of updating the Completions handlers to use these new data types, and enjoy the strongly typed and easily extendable ServerSideProviderConfig configuration objects.

Test plan

Lots of new unit tests, but none of this is surfaced to end users. YET.

@chrsmith chrsmith requested review from RXminuS and emidoots June 27, 2024 23:34
@cla-bot cla-bot Bot added the cla-signed label Jun 27, 2024
// "foo/bar" => "foo::unknown::bar"
//
// And aws-bedrock provisioned throughput... is handled in its own way.
convertToModelRef := func(cfgModelID string) string {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we also need a special case in this function for Azure.

For context:

The way I think you should handle this in this PR:

  1. When convertToModelRef is being called for the azure-openai provider, use the values of completionsCfg.AzureChatModel instead of completionsCfg.ChatModel, same for the ocmpletions model.
  2. For fastchat .... it seems we didn't consider this case. Introduce a new AzureFastChatModel site config field and respect that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we also need a special case in this function for Azure.

Yes, absolutely. In the latest set of changes stacked on top of this, I've made that change.

However, I do need to update it to honor azureChatModel/azureCompletionModel... what is the difference between that and chatModel/completionModel? For the Azure case the azureCompletionModel is the actual LLM model name, whereas the compoetionModel is the "Azure deployment name"?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with Azure, chatModel, fastChatModel, and completionModel are all the "Azure deployment name" - a unique identifier that comes from the customer's azure deployment - a grave mistake we made when introducing that provider.

To try to correct it, we introduced azureChatModel, azureCompletionModel which are the actual model names like gpt-35-turbo-instruct and not unique to the customer's deployment.

Comment thread cmd/frontend/internal/modelconfig/siteconfig_completions.go Outdated
Comment thread internal/modelconfig/types/documents.go Outdated
// If the ModelRef is unknown (e.g. the Sourcegraph-supplied model was removed
// via the SourcegraphModelConfig's deny list.) In that case, all model settings
// need to be supplied, either explicitly or by the provider's DefaultModelConfig.
ModelOverrides []ModelOverride `json:"modelOverrides"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI - this ModelOverride struct is currently missing the ability to use 'recommended' settings as we talked about for a specific model key, e.g.:

        // Replaces any models, and may just add new ones if the mref is unique.
        "modelOverrides": [
            {
                "mistral::v1::mixtral-8x22b-instruct": {
                    "useRecommendSettings": true,
                },
                ...
            }
        ]

I can work on this in a follow-up PR, but will be something I want for self-hosted models

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes. I ended up just having reasonable-ish defaults for everything but CompletionWindow. Will call that out in the next update.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, the goal here is for me Stephen to be able to express this:

If a site admin is configuring mistral::v1::mixtral-8x22b-instruct specifically (no other mrefs, just that one in specific - hard-coded in our Go code somewhere), and is a self-hosted model customer, let the site admin say "use whatever configuration Stephen suggests for all fields here with this mref".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Then yes, that is something that isn't accounted for in these Go types. (And perhaps that would be something that "only exists" in the Site Config projection? And those "useRecommendedSettings" get applied when reading the site config and converting it into the modelconfig types?)

For reference, here's how this works in this particular PR:

  • We define a DefaultModelConfig for each ProviderOverride.
  • We do not provide a default for ContextWindow (which is definitely required), because we honor CompletionModelMaxTokens and related.
	// Generic defaults.
	defaultModelConfig := types.DefaultModelConfig{
		Capabilities: []types.ModelCapability{
			types.ModelCapabilityAutocomplete,
			types.ModelCapabilityChat,
		},
		Category: types.ModelCategoryBalanced,
		Status:   types.ModelStatusStable,
		Tier:     types.ModelTierEnterprise,

		// IMPORTANT: The default model config contains an invalid
		// context window (0, 0). The ModelOverrides MUST set the
		// expected values.
	}

@emidoots emidoots left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work

case defModelFastChatModel:
return "fast-chat"
default:
return "unknown"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually need to express an unknown case in some real scenario? Otherwise I might suggest panic()

switch legacyModelRef.provider {
case "aws-bedrock":
effectiveProviderID = "anthropic"
case "azure-ai":

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
case "azure-ai":
case "azure-openai":

Unless I'm mistaken this is referring to the legacy model provider from the site config

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Good catch.

This stanza is actually kinda redundant, since we do this same "provider remapping" inside of parseLegacyModelRef(...). But given how subtle it is, I figured it was worth keeping in even if it is a no-op/duplicate. (Of course, it doesn't help if it uses the wrong enum value... 😭)

	// AzureOpenAI
	case conftypes.CompletionsProviderNameAzureOpenAI:
		// AzureOpenAI (at least for v5.3) only supports openai-provided models.
		// However, the site configuration requires you to specify the "Azure Deployment ID"
		// in place of the model name. And to later diambiguate this, admins could optionally
		// supply more meaningful model names. (Although the "deployment ID" is what is still
		// required to be used when making the API call.)
		providerID = "openai"

Comment thread cmd/frontend/internal/modelconfig/siteconfig_completions.go Outdated
// Assumes that the supplied completions object is valid, and contains all the required settings. e.g. the site admin
// MAY leave some things blank, but `conf/computed.go`'s `GetCompletionsConfig()` will provide defaults for all the
// values. So even if the site admin didn't provide the `Endpoint` field, we expect it to be present.
func convertCompletionsConfig(completionsCfg *conftypes.CompletionsConfig) (*types.SiteModelConfiguration, error) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment: this function surface-level looks super complex, I feel like some portions of it could be broken out into more readable function names.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function surface-level looks super complex

OMFG yes.

As I later added more tests, and accounted for more corner cases, I ended up rewriting most of it. While the function is still super long and surprisingly complicated, it is hopefully easier to follow.

@emidoots emidoots left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great

@chrsmith

chrsmith commented Jul 1, 2024

Copy link
Copy Markdown
Contributor Author

@slimsag would you mind taking another look before I submit?

  • Merged in the latest changes from main
  • Updated the code to account for AzureChatModel and friends, added a unit test
  • Added a modelconfig.SanitizeResourceName(string) string function. Since it'll come in handy.

{
// BUG: The "model name alias" configuration doesn't apply to FastChat, hence the awkward ModelID.
m := siteModelConfig.ModelOverrides[0]
assert.EqualValues(t, "openai::unknown::azure-deployment-id_-__54e1331b-4ccc-44b8-b9ef-59c6b050fdb8_", m.ModelRef)

@emidoots emidoots Jul 1, 2024

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this comment as a reminder for myself to add an azureFastChatModel and fix this. I'll send a PR for this

Edit: I should also update the site config alert wording and the site.schema.json to not say (optional) anymore. we should just require these fields.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a reminder for myself to add an azureFastChatModel and fix this.
🙏 I'd offer to just fix this myself, if I didn't already have like several thousand lines of code that needs to be teased out into PRs already 😬 🙇 .

to not say (optional) anymore. we should just require these fields.
💯 🙏

@emidoots

emidoots commented Jul 1, 2024

Copy link
Copy Markdown
Member

@chrsmith the only thing I see missing here - and maybe we can add it as a follow-on later, is the deployment names like "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" are going to be needed by the actual azure backend Go client. I don't see anywhere here that we store them in the Go data structures, but maybe I missed it?

@emidoots emidoots left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm :shipit:

@chrsmith chrsmith merged commit aa87316 into main Jul 1, 2024
@chrsmith chrsmith deleted the chrsmith/modelconfig-site-config branch July 1, 2024 23:33
@chrsmith

chrsmith commented Jul 1, 2024

Copy link
Copy Markdown
Contributor Author

@chrsmith the only thing I see missing here - and maybe we can add it as a follow-on later, is the deployment names like "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" are going to be needed by the actual azure backend Go client. I don't see anywhere here that we store them in the Go data structures, but maybe I missed it?

@slimsag great question.

This will make more sense when I send out the PRs that refactor how the actual completion providers work, but I believe we have all the information we need on the types.Model.

The Model.ID is the opaque ID just used to uniquely identify the model, e.g. "gpt-4o". The meaningful value that we pass to the API Provider for that particular model however is in the Model.ModelName field. So as long that value has the "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" string, then things should work just fine.

And this is the behavior we have today. It's just that, we treat Model.ID == Model.ModelName, instead of having these as two distinct concepts. (Among other changes introduced with the new types.)

@emidoots

emidoots commented Jul 2, 2024

Copy link
Copy Markdown
Member

Sounds good, as long as the Go client has a way to get that exact "azure-deployment-id - {7d1f5676-189d-4a97-9dd9-10ae187aba99}" string as it was originally written, should be fine.

chrsmith referenced this pull request Jul 3, 2024
This PR exposes a new HTTP endpoint from the `frontend` that will return
the LLM models currently supported by the Sourcegraph instance. (Or a
500 if Cody isn't enabled or not configured correctly.)

This endpoint will be used by Cody clients in order to fetch LLM models
dynamically, so that new LLM models can automatically be picked up as
they are available from Cody Gateway. The underlying data structures
will also make it easier for the Sourcegraph instance to configure
arbitrary LLM providers, customize model settings, etc.

```zsh
curl \
    -H "Authorization: token ${SG_LOCALHOST_TOKEN}" \
    https://sourcegraph.test:3443/.api/modelconfig/supported-models.json 
```

## Overview

This PR is the latest (long) series of steps for enabling arbitrary LLM
model configuration in the Sourcegraph backend. This PR introduces three
new components into the `cmd/frontend/internal/modelconfig` package:

### `builder.go`, `builder_test.go`

The `builder struct` is the logic for how we generate the LLM model
configuration data based on the current site configuration, what is
embedded in the Sourcegraph binary, and later, what we obtain from Cody
Gateway.

This is essentially how we apply things like the "model allow/deny list"
or "provider/model overrides" that will later be available in the
Sourcegraph site configuration.

For now, some of this functionality is inert. As there is no way for you
to actually specify the type of configuration data that the `builder`
supports. (e.g. you cannot specify the "LLM model allow or deny list",
or fetch models from Cody Gateway.)

So realistically, with this PR the only thing it is doing is just
converting a (`internal/modelconfig/types`) `SiteModelConfiguration`
into a `ModelConfiguration` object. (Which was added in
https://github.com/sourcegraph/sourcegraph/pull/63533.)

### `init.go`, `service.go`

Next is the `Service interface`, which is a new global service that can
be used anywhere within the `frontend` binary to fetch the current LLM
configuration data.

This will be used later to allow the Completions API endpoints to look
up various provider and LLM configuration data.

`init.go` supplies an initialization function to the Sourcegraph
machinery, and then it goes about loading the LLM model configuration
data and exposing it in a singleton object. Part of this is also
registering a "site configuration watcher", so that we will
automatically pick up on any changes to the site configuration that may
impact the LLM model configuration.

### `httpapi.go`

Finally, we expose a `HTTPHandlers struct` that contains a single
`http.HandlerFunc` that will serve the HTTP response.

My intention is to later add more endpoints, such as a way for Site
admins to easily `curl` the full configuration information. Or surface
diagnostic-type information about available models.

## Next Steps

With this piece, we now expose a view of the Sourcegraph instance's
currently supported LLM models. However, the existing completions API
stilly reads the site configuration directly. (And does not use the
newer, and more flexible `modelconfig` package.)

So the next step to continue wiring this up is to refactor the
`frontend/internal/completions` and `internal/completions` packages to
rely on the "newer style" configuration data. (Which 🤞 is faithfully
reproducing the current site configuration data.)

The end result should be a no-op, but will allow us to no longer rely on
hard-coded LLM model names anywhere in the codebase. And instead we can
just rely on the model configuration data.

## Test plan

Added new unit tests. Verified things worked manually. And looked
through existing unit tests for other components to sanity check some of
the assumptions made in newer code.

## Changelog

Sourcegraph instances how expose an HTTP endpoint that authenticated
users can call to get a list of LLM models supported by the Sourcegraph
instance. In the future this will be used to allow Cody users to select
the LLM model dynamically, based on what is currently available.
chrsmith referenced this pull request Jul 12, 2024
…ore models) (#63797)

This PR if what the past dozen or so
[cleanup](https://github.com/sourcegraph/sourcegraph/pull/63359),
[refactoring](https://github.com/sourcegraph/sourcegraph/pull/63731),
and [test](https://github.com/sourcegraph/sourcegraph/pull/63761) PRs
were all about: using the new `modelconfig` system for the completion
APIs.

This will enable users to:

- Use the new site config schema for specifying LLM configuration, added
in https://github.com/sourcegraph/sourcegraph/pull/63654. Sourcegraph
admins who use these new site config options will be able to support
many more LLM models and providers than is possible using the older
"completions" site config.
- For Cody Enterprise users, we no longer ignore the
`CodyCompletionRequest.Model` field. And now support users specifying
any LLM model (provided it is "supported" by the Sourcegraph instance).

Beyond those two things, everything should continue to work like before.
With any existing "completions" configuration data being converted into
the `modelconfig` system (see
https://github.com/sourcegraph/sourcegraph/pull/63533).

## Overview

In order to understand how this all fits together, I'd suggest reviewing
this PR commit-by-commit.

### [Update internal/completions to use
modelconfig](https://github.com/sourcegraph/sourcegraph/commit/e6b7eb171eea6bd6a512f0e61457170a86128eae)

The first change was to update the code we use to serve LLM completions.
(Various implementations of the `types.CompletionsProvider` interface.)

The key changes here were as follows:

1. Update the `CompletionRequest` type to include the `ModelConfigInfo`
field (to make the new Provider and Model-specific configuration data
available.)
2. Rename the `CompletionRequest.Model` field to
`CompletionRequest.RequestedModel`. (But with a JSON annotation to
maintain compatibility with existing callers.) This is to catch any bugs
related to using the field directly, since that is now almost guaranteed
to be a mistake. (See below.)

With these changes, all of the `CompletionProvider`s were updated to
reflect these changes.

- Any situation where we used the
`CompletionRequest.Parameters.RequestedModel` should now refer to
`CompletionRequest.ModelConfigInfo.Model.ModelName`. The "model name"
being the thing that should be passed to the API provider, e.g.
`gpt-3.5-turbo`.
- In some situations (`azureopenai`) we needed to rely on the Model ID
as a more human-friendly identifier. This isn't 100% accurate, but will
match the behavior we have today. A long doc comment calls out the
details of what is wrong with that.
- In other situations (`awsbedrock`, `azureopenai`) we read the new
`modelconfig` data to configure the API provider (e.g.
`Azure.UseDeprecatedAPI`), or surface model-specific metadata (e.g. AWS
Provisioned Throughput ARNs). While the code is a little clunky to avoid
larger refactoring, this is the heart and soul of how we will be writing
new completion providers in the future. That is, taking specific
configuration bags with whatever data that is required.

### [Fix bugs in
modelconfig](https://github.com/sourcegraph/sourcegraph/commit/75a51d8cb520e35918bd3a67a090a36d456b1797)

While we had lots of tests for converting the existing "completions"
site config data into the `modelconfig.ModelConfiguration` structure,
there were a couple of subtle bugs that I found while testing the larger
change.

The updated unit tests and comments should make that clear.

### [Update frontend/internal/httpapi/completions to use
modelconfig](https://github.com/sourcegraph/sourcegraph/commit/084793e08fca51a5ab84a7d73421d575caeebaa1)

The final step was to update the HTTP endpoints that serve the
completion requests. There weren't any logic changes here, just
refactoring how we lookup the required data. (e.g. converting the user's
requested model into an actual model found in the site configuration.)

We support Cody clients sending either "legacy mrefs" of the form
`provider/model` like before, or the newer mref
`provider::apiversion::model`. Although it will likely be a while before
Cody clients are updated to only use the newer-style model references.

The existing unit tests for the competitions APIs just worked, which was
the plan. But for the few changes that were required I've added comments
to explain the situation.

### [Fix: Support requesting models just by their
ID](https://github.com/sourcegraph/sourcegraph/pull/63797/commits/99715feba614230aa84cf94aae571adb96768035)

> ... We support Cody clients sending either "legacy mrefs" of the form
`provider/model` like before ...

Yeah, so apparently I lied 😅 . After doing more testing, the extension
_also_ sends requests where the requested model is just `"model"`.
(Without the provider prefix.)

So that now works too. And we just blindly match "gtp-3.5-turbo" to the
first mref with the matching model ID, such as
"anthropic::unknown::gtp-3.5-turbo".

## Test plan

Existing unit tests pass, added a few tests. And manually tested my Sg
instance configured to act as both "dotcom" mode and a prototypical Cody
Enterprise instance.

## Changelog

Update the Cody APIs for chat or code completions to use the "new style"
model configuration. This allows for great flexibility in configuring
LLM providers and exposing new models, but also allows Cody Enterprise
users to select different models for chats.

This will warrant a longer, more detailed changelog entry for the patch
release next week. As this unlocks many other exciting features.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants