Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

modelconfig: handle converting site config JSON -> internal data types#63706

Merged
emidoots merged 5 commits into
mainfrom
sg/modelconfiguration
Jul 9, 2024
Merged

modelconfig: handle converting site config JSON -> internal data types#63706
emidoots merged 5 commits into
mainfrom
sg/modelconfiguration

Conversation

@emidoots

@emidoots emidoots commented Jul 9, 2024

Copy link
Copy Markdown
Member

This causes the modelconfig package to actually begin converting the new "modelConfiguration" site config which was introduced in #63654 into the internal data types we use for model configuration.

This is all pretty straightforward / lame type conversion code, but carefully written to preserve all the semantics we care about.

Test plan

Written very carefully, and confirmed that this query:

curl -H "Authorization: token $(cat ~/local-token)" https://sourcegraph.test:3443/.api/modelconfig/supported-models.json

returns what I would expect with this site configuration:

site config
  // Setting this field means we are opting into the new Cody model configuration system which is in beta.
  "modelConfiguration": {
    // Disable use of Sourcegraph's servers for model discovery
    "sourcegraph": null,

    // Configure the OpenAI-compatible API endpoints that Cody should use to provide
    // mistral and bigcode (starcoder) models.
    "providerOverrides": [
      {
        "displayName": "Mistral",
        "id": "mistral",
        "serverSideConfig": {
          "type": "openaicompatible",
          "endpoint": "...",
          "accessToken": "...",
        },
      },
      {
        "displayName": "Bigcode",
        "id": "bigcode",
        "serverSideConfig": {
          "type": "openaicompatible",
          "endpoint": "...",
          "accessToken": "...",
        },
      },
    ],

    // Configure which exact mistral and starcoder models we want available
    "modelOverridesRecommendedSettings": [
      "bigcode::v1::starcoder2-7b",
      "mistral::v1::mixtral-8x7b-instruct"
    ],

    // Configure which models Cody will use by default
    "defaultModels": {
      "chat": "mistral::v1::mixtral-8x7b-instruct",
      "fastChat": "mistral::v1::mixtral-8x7b-instruct",
      "codeCompletion": "bigcode::v1::starcoder2-7b",
    }
  }

We could certainly use more unit tests here to prevent potential future regressions, I can add those in a future PR.

Changelog

Has no effect on users unless they opt into the early-access "modelConfiguration" site config feature.

This causes the `modelconfig` package to actually begin converting the
new `"modelConfiguration"` site config which was introduced in #63654
into the internal data types we use for model configuration.

This is all pretty straightforward / lame type conversion code, but
carefully written to preserve all the semantics we care about.

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
@cla-bot cla-bot Bot added the cla-signed label Jul 9, 2024
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>

@chrsmith chrsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine to get off the ground, but I am confident we'll want to rework the code around the recommendedSettings map[types.ModelRef]types.ModelOverride type, since it seems like the only thing we are really adding beyond the display name is the context window. And so we can probably cut down on a lot of the boiler plate. (But what you have here is fine to start, if we only are concerned about a dozen or so models.)

Comment thread cmd/frontend/internal/modelconfig/init.go Outdated
Comment thread cmd/frontend/internal/modelconfig/init.go Outdated
Comment thread cmd/frontend/internal/modelconfig/init.go Outdated
// This function is responsible for converting the schema.SiteConfiguration that admins write
// into our internal data type.
func getSiteModelConfigurationOrNil(logger log.Logger, siteConfig schema.SiteConfiguration) (*types.SiteModelConfiguration, error) {
// If "modelConfiguration" is specified, then we respect that and only that. If it is not specified,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at your other PRs, but did we add any "site config validation" logic to return an error if both forms of configuration are adding? If not, just to confirm, we would want to do that, right? Since having both types of config specified would be super-confusing?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, I haven't added any site config validation logic yet. I do have it on my TODO to do in the coming days though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JFYI, I think we really do want to lean on the internal/modelconfig validation routines to ensure that the overall internal/modelconfig/types.ModelConfiguration type is well-formed.

Since otherwise, it's too easy to have confusing things break at runtime that would much easier be described as invalid configuration settings. We do that for the CompletionsProvider config codepath. But we can ship without it for ~today.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree; my thinking was that site config validation ('you cannot save your configuration, there was an error:') would consist of two things:

  1. Confirming that you did not specify both "completions" and the new "modelConfiguration" at the same time.
  2. Confirming that you did not list the same mref in both modelOverrides and modelOverridesRecommendedSettings (explained in site config: minor fixes, make modelConfiguration enable new backend models API, initial self-hosted model config #63697)
  3. Assuming the above two are not issues, running the converted configuration through the internal/modelconfig validation routines and yielding any errors it finds.

Comment thread cmd/frontend/internal/modelconfig/siteconfig.go Outdated
Stephen Gutekanst and others added 3 commits July 9, 2024 11:17
Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
@emidoots

emidoots commented Jul 9, 2024

Copy link
Copy Markdown
Member Author

I am confident we'll want to rework the code around the recommendedSettings map[types.ModelRef]types.ModelOverride type, since it seems like the only thing we are really adding beyond the display name is the context window. And so we can probably cut down on a lot of the boiler plate. (But what you have here is fine to start, if we only are concerned about a dozen or so models.)

Yeah, I agree - I foresee this logic changing a good amount. The types I wrote out here right now are just stubs and I suspect that they will actually differ a fair amount once I get to actual testing with them

@emidoots emidoots merged commit 071de9e into main Jul 9, 2024
@emidoots emidoots deleted the sg/modelconfiguration branch July 9, 2024 18:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants