This repository was archived by the owner on Sep 30, 2024. It is now read-only.
Several refactorings to prepare for rolling out modelconfig #63731
Merged
emidoots
approved these changes
Jul 9, 2024
emidoots
left a comment
Member
Looks great, :shipit:
I'll base my work on top of this
chrsmith
referenced
this pull request
Jul 10, 2024
A couple of minor changes to minimize the diff for "large completions API refactoring". Most changes are just a refactoring of the `openai` completions provider, which I apparently missed in https://github.com/sourcegraph/sourcegraph/pull/63731. (There are still some smaller tweaks that can be made to the `fireworks` or `google` completion providers, but they aren't as meaningful.)

This PR also removes a couple of unused fields and methods, e.g. `types.CompletionRequestParameters::Prompt`. There was a comment to the effect of it being long since deprecated, and it is no longer read anywhere on the server side. So I'm assuming that a green CI/CD build means it is safe to remove.

## Test plan

CI/CD

## Changelog

NA
chrsmith
referenced
this pull request
Jul 12, 2024
…ore models) (#63797)

This PR is what the past dozen or so [cleanup](https://github.com/sourcegraph/sourcegraph/pull/63359), [refactoring](https://github.com/sourcegraph/sourcegraph/pull/63731), and [test](https://github.com/sourcegraph/sourcegraph/pull/63761) PRs were all about: using the new `modelconfig` system for the completion APIs. This will enable users to:

- Use the new site config schema for specifying LLM configuration, added in https://github.com/sourcegraph/sourcegraph/pull/63654. Sourcegraph admins who use these new site config options will be able to support many more LLM models and providers than is possible using the older "completions" site config.
- For Cody Enterprise users, we no longer ignore the `CodyCompletionRequest.Model` field, and now support users specifying any LLM model (provided it is "supported" by the Sourcegraph instance).

Beyond those two things, everything should continue to work like before, with any existing "completions" configuration data being converted into the `modelconfig` system (see https://github.com/sourcegraph/sourcegraph/pull/63533).

## Overview

In order to understand how this all fits together, I'd suggest reviewing this PR commit-by-commit.

### [Update internal/completions to use modelconfig](https://github.com/sourcegraph/sourcegraph/commit/e6b7eb171eea6bd6a512f0e61457170a86128eae)

The first change was to update the code we use to serve LLM completions. (Various implementations of the `types.CompletionsProvider` interface.) The key changes here were as follows:

1. Update the `CompletionRequest` type to include the `ModelConfigInfo` field (to make the new Provider and Model-specific configuration data available).
2. Rename the `CompletionRequest.Model` field to `CompletionRequest.RequestedModel`. (But with a JSON annotation to maintain compatibility with existing callers.) This is to catch any bugs related to using the field directly, since that is now almost guaranteed to be a mistake. (See below.)
With these changes in place, all of the `CompletionProvider`s were updated accordingly.

- Any situation where we used the `CompletionRequest.Parameters.RequestedModel` should now refer to `CompletionRequest.ModelConfigInfo.Model.ModelName`. The "model name" being the thing that should be passed to the API provider, e.g. `gpt-3.5-turbo`.
- In some situations (`azureopenai`) we needed to rely on the Model ID as a more human-friendly identifier. This isn't 100% accurate, but will match the behavior we have today. A long doc comment calls out the details of what is wrong with that.
- In other situations (`awsbedrock`, `azureopenai`) we read the new `modelconfig` data to configure the API provider (e.g. `Azure.UseDeprecatedAPI`), or surface model-specific metadata (e.g. AWS Provisioned Throughput ARNs).

While the code is a little clunky to avoid larger refactoring, this is the heart and soul of how we will be writing new completion providers in the future. That is, taking specific configuration bags with whatever data is required.

### [Fix bugs in modelconfig](https://github.com/sourcegraph/sourcegraph/commit/75a51d8cb520e35918bd3a67a090a36d456b1797)

While we had lots of tests for converting the existing "completions" site config data into the `modelconfig.ModelConfiguration` structure, there were a couple of subtle bugs that I found while testing the larger change. The updated unit tests and comments should make that clear.

### [Update frontend/internal/httpapi/completions to use modelconfig](https://github.com/sourcegraph/sourcegraph/commit/084793e08fca51a5ab84a7d73421d575caeebaa1)

The final step was to update the HTTP endpoints that serve the completion requests. There weren't any logic changes here, just refactoring how we look up the required data. (e.g. converting the user's requested model into an actual model found in the site configuration.)
We support Cody clients sending either "legacy mrefs" of the form `provider/model` like before, or the newer mref `provider::apiversion::model`, although it will likely be a while before Cody clients are updated to only use the newer-style model references.

The existing unit tests for the completions APIs just worked, which was the plan. But for the few changes that were required I've added comments to explain the situation.

### [Fix: Support requesting models just by their ID](https://github.com/sourcegraph/sourcegraph/pull/63797/commits/99715feba614230aa84cf94aae571adb96768035)

> ... We support Cody clients sending either "legacy mrefs" of the form `provider/model` like before ...

Yeah, so apparently I lied 😅. After doing more testing, the extension _also_ sends requests where the requested model is just `"model"`. (Without the provider prefix.) So that now works too. And we just blindly match "gpt-3.5-turbo" to the first mref with the matching model ID, such as "anthropic::unknown::gpt-3.5-turbo".

## Test plan

Existing unit tests pass, added a few tests. And manually tested with my Sg instance configured to act as both "dotcom" mode and a prototypical Cody Enterprise instance.

## Changelog

Update the Cody APIs for chat or code completions to use the "new style" model configuration. This allows for great flexibility in configuring LLM providers and exposing new models, but also allows Cody Enterprise users to select different models for chats.

This will warrant a longer, more detailed changelog entry for the patch release next week, as this unlocks many other exciting features.
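The three accepted request formats (`provider::apiversion::model`, legacy `provider/model`, and a bare model ID) can be sketched as a single lookup. This is only an illustration of the matching rules described in the commit message above, not the code in `frontend/internal/httpapi/completions`; the `ModelRef` struct and `resolveRequestedModel` function are hypothetical names, and the sketch assumes a legacy reference is split on its first `/`.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelRef is a hypothetical, simplified model reference ("mref").
type ModelRef struct {
	ProviderID   string
	APIVersionID string
	ModelID      string
}

// String renders the newer-style form: provider::apiversion::model.
func (m ModelRef) String() string {
	return m.ProviderID + "::" + m.APIVersionID + "::" + m.ModelID
}

// resolveRequestedModel maps whatever the client sent onto one of the models
// the instance supports. It returns false if nothing matches.
func resolveRequestedModel(requested string, supported []ModelRef) (ModelRef, bool) {
	// Newer style: all three parts must match exactly.
	if parts := strings.Split(requested, "::"); len(parts) == 3 {
		for _, m := range supported {
			if m.ProviderID == parts[0] && m.APIVersionID == parts[1] && m.ModelID == parts[2] {
				return m, true
			}
		}
		return ModelRef{}, false
	}

	// Legacy style: provider/model, with no API version to compare.
	if provider, model, ok := strings.Cut(requested, "/"); ok {
		for _, m := range supported {
			if m.ProviderID == provider && m.ModelID == model {
				return m, true
			}
		}
		return ModelRef{}, false
	}

	// Bare model ID: blindly take the first supported model with that ID.
	for _, m := range supported {
		if m.ModelID == requested {
			return m, true
		}
	}
	return ModelRef{}, false
}

func main() {
	supported := []ModelRef{
		{ProviderID: "anthropic", APIVersionID: "unknown", ModelID: "claude-2"},
		{ProviderID: "openai", APIVersionID: "unknown", ModelID: "gpt-3.5-turbo"},
	}
	for _, requested := range []string{"openai::unknown::gpt-3.5-turbo", "anthropic/claude-2", "gpt-3.5-turbo"} {
		if m, ok := resolveRequestedModel(requested, supported); ok {
			fmt.Printf("%s -> %s\n", requested, m)
		}
	}
}
```

Note that the bare-ID branch depends on the order of the supported list: if two providers expose the same model ID, whichever comes first wins, which is the "blind" matching the commit message mentions.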
This PR bundles several small refactorings to make it easier to review the pending "refactor completion APIs to read from modelconfig".

Going commit-by-commit:

### Export the ValidateModelRef function

This just exports a validation function that was previously private from the `internal/modelconfig` package. We'll be using this to sanity check the `ModelRef` that is returned from the "getModelsFn".

### Add modelconfig.{InitMock, ResetMock}

In order to update the `frontend/internal/httpapi/completions` unit tests, we need to have changes to the site configuration data ALSO update the global `modelconfig.Service`. Typically this would happen by a "config watcher" that gets registered, but we don't do that as part of unit tests.

So instead, I just added some clunky `InitMock() error` and `ResetMock() error` functions to support unit testing. If you have a better idea for how to expose this behavior, I'm all ears.

### Move fireworks-specific API types into types.go

Moves the API data types for the `fireworks` client into their own file, just to keep things tidy.

### Refactor completion clients

This is a bit involved, but from a high level, we are just passing the `types.CompletionRequest` object "lower" into the call stack, rather than only passing the `.Parameters` field (`types.CompletionRequestParameters`).

This is necessary, because I plan on adding another field to `types.CompletionRequest` (to include the `modelconfig.Provider` and `modelconfig.Model` data). So this change just does some of the work for that now, resulting in a smaller and easier to review diff later.

Also, in some cases I moved the calls to `tokenManager.UpdateTokenCountsFromModelUsage` into a dedicated function to simplify the callsites.

## Test plan

Existing unit tests

## Changelog

NA
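As a rough illustration of the "Refactor completion clients" commit described above, the sketch below shows a client function that receives the whole request rather than only its parameters, with usage tracking pulled into one helper. All types and functions here (`tokenCounter`, `recordUsage`, `complete`, the `Feature` field) are hypothetical simplifications; the real `tokenManager.UpdateTokenCountsFromModelUsage` signature is not shown in this PR description and is not reproduced here.

```go
package main

import "fmt"

// CompletionRequestParameters is a simplified stand-in for the parameters type.
type CompletionRequestParameters struct {
	MaxTokensToSample int
}

// CompletionRequest is a simplified stand-in for the full request. Feature is
// an illustrative field, e.g. "chat" or "code".
type CompletionRequest struct {
	Feature    string
	Parameters CompletionRequestParameters
}

// modelUsage is a hypothetical summary of tokens consumed by one call.
type modelUsage struct {
	promptTokens     int
	completionTokens int
}

// tokenCounter is a hypothetical stand-in for the token manager.
type tokenCounter struct {
	totals map[string]int
}

func newTokenCounter() *tokenCounter {
	return &tokenCounter{totals: map[string]int{}}
}

// recordUsage is the dedicated helper: every call site makes one call here
// instead of repeating the bookkeeping inline.
func recordUsage(tc *tokenCounter, req CompletionRequest, u modelUsage) {
	tc.totals[req.Feature] += u.promptTokens + u.completionTokens
}

// complete takes the whole request. Adding a field to CompletionRequest later
// does not change this signature, which keeps the follow-up diff small.
func complete(req CompletionRequest, tc *tokenCounter) string {
	// A real client would call the provider's API here; the usage numbers
	// below are made up for the sketch.
	u := modelUsage{promptTokens: 12, completionTokens: req.Parameters.MaxTokensToSample}
	recordUsage(tc, req, u)
	return fmt.Sprintf("completion for %s", req.Feature)
}

func main() {
	tc := newTokenCounter()
	req := CompletionRequest{Feature: "chat", Parameters: CompletionRequestParameters{MaxTokensToSample: 8}}
	fmt.Println(complete(req, tc))
	fmt.Println(tc.totals["chat"])
}
```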