Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Add more Completions handler tests#63761

Merged
chrsmith merged 6 commits into
mainfrom
chrsmith/more-completion-tests
Jul 10, 2024
Merged

Add more Completions handler tests#63761
chrsmith merged 6 commits into
mainfrom
chrsmith/more-completion-tests

Conversation

@chrsmith

Copy link
Copy Markdown
Contributor

This PR adds more unit tests for the "Chat Completions" HTTP endpoint. The goal is to have unit tests for more of the one-off quirks that we support today, so that we can catch any regressions when refactoring this code.

This PR adds another layer of test infrastructure to use to streamline writing completion tests. (Since they are kinda involved, and are mocking out multiple interactions, it's kinda necessary.)

It introduces a new data type completionsRequestTestData which contains all of the "inputs" to the test case, as well as some of the things we want to validate.

type completionsRequestTestData struct {
	SiteConfig schema.SiteConfiguration
	UserCompletionRequest types.CodyCompletionRequestParameters
	WantRequestToLLMProvider map[string]any
	WantRequestToLLMProviderPath string
	ResponseFromLLMProvider map[string]any
	WantCompletionResponse types.CompletionResponse
}

Then to run one of these tests, you just call the new function:

func runCompletionsTest(t *testing.T, infra *apiProviderTestInfra, data completionsRequestTestData) {

With this, the new pattern for completion tests is of the form:

func TestProviderX(t *testing.T) {
    // Return a valid site configuration, and the expected API request body
    // we will send to the LLM API provider X.
    getValidTestData := func() completionsRequestTestData {
        ...
    }

    t.Run("TestDataIsValid", func(t *testing.T) {
        // Just confirm that the stock test data works as expected,
        // without any test-specific modifications.
        data := getValidTestData()
        runCompletionsTest(t, infra, data)
    })
}

And then, for more sophisticated tests, we would just overwrite whatever subset of fields are necessary from the stock test data.

For example, testing the way AWS Bedrock provisioned throughput ARNs get reflected in the completions API can be done by creating a function to return the specific site configuration data, and then:

	t.Run("Chat", func(t *testing.T) {
			data := getValidTestData()
			data.SiteConfig.Completions = getProvisionedThroughputSiteConfig()

			// The chat model is using provisioned throughput, so the
			// URLs are different.
			data.WantRequestToLLMProviderPath = "/model/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/abcdefghijkl/invoke"

			runCompletionsTest(t, infra, data)
		})

		t.Run("FastChat", func(t *testing.T) {
			data := getValidTestData()
			data.SiteConfig.Completions = getProvisionedThroughputSiteConfig()

			data.UserCompletionRequest.Fast = true

			// The fast chat model does not have provisioned throughput, and
			// so the request path to bedrock just has the model's name. (No ARN.)
			data.WantRequestToLLMProviderPath = "/model/anthropic.claude-v2-fastchat/invoke"

			runCompletionsTest(t, infra, data)
		})

Test plan

Added more unit tests.

Changelog

NA

@chrsmith chrsmith requested a review from emidoots July 10, 2024 18:51
@cla-bot cla-bot Bot added the cla-signed label Jul 10, 2024

@emidoots emidoots left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More tests? I'm sold!

@chrsmith chrsmith merged commit e80144e into main Jul 10, 2024
@chrsmith chrsmith deleted the chrsmith/more-completion-tests branch July 10, 2024 21:10
chrsmith referenced this pull request Jul 12, 2024
…ore models) (#63797)

This PR if what the past dozen or so
[cleanup](https://github.com/sourcegraph/sourcegraph/pull/63359),
[refactoring](https://github.com/sourcegraph/sourcegraph/pull/63731),
and [test](https://github.com/sourcegraph/sourcegraph/pull/63761) PRs
were all about: using the new `modelconfig` system for the completion
APIs.

This will enable users to:

- Use the new site config schema for specifying LLM configuration, added
in https://github.com/sourcegraph/sourcegraph/pull/63654. Sourcegraph
admins who use these new site config options will be able to support
many more LLM models and providers than is possible using the older
"completions" site config.
- For Cody Enterprise users, we no longer ignore the
`CodyCompletionRequest.Model` field. And now support users specifying
any LLM model (provided it is "supported" by the Sourcegraph instance).

Beyond those two things, everything should continue to work like before.
With any existing "completions" configuration data being converted into
the `modelconfig` system (see
https://github.com/sourcegraph/sourcegraph/pull/63533).

## Overview

In order to understand how this all fits together, I'd suggest reviewing
this PR commit-by-commit.

### [Update internal/completions to use
modelconfig](https://github.com/sourcegraph/sourcegraph/commit/e6b7eb171eea6bd6a512f0e61457170a86128eae)

The first change was to update the code we use to serve LLM completions.
(Various implementations of the `types.CompletionsProvider` interface.)

The key changes here were as follows:

1. Update the `CompletionRequest` type to include the `ModelConfigInfo`
field (to make the new Provider and Model-specific configuration data
available.)
2. Rename the `CompletionRequest.Model` field to
`CompletionRequest.RequestedModel`. (But with a JSON annotation to
maintain compatibility with existing callers.) This is to catch any bugs
related to using the field directly, since that is now almost guaranteed
to be a mistake. (See below.)

With these changes, all of the `CompletionProvider`s were updated to
reflect these changes.

- Any situation where we used the
`CompletionRequest.Parameters.RequestedModel` should now refer to
`CompletionRequest.ModelConfigInfo.Model.ModelName`. The "model name"
being the thing that should be passed to the API provider, e.g.
`gpt-3.5-turbo`.
- In some situations (`azureopenai`) we needed to rely on the Model ID
as a more human-friendly identifier. This isn't 100% accurate, but will
match the behavior we have today. A long doc comment calls out the
details of what is wrong with that.
- In other situations (`awsbedrock`, `azureopenai`) we read the new
`modelconfig` data to configure the API provider (e.g.
`Azure.UseDeprecatedAPI`), or surface model-specific metadata (e.g. AWS
Provisioned Throughput ARNs). While the code is a little clunky to avoid
larger refactoring, this is the heart and soul of how we will be writing
new completion providers in the future. That is, taking specific
configuration bags with whatever data that is required.

### [Fix bugs in
modelconfig](https://github.com/sourcegraph/sourcegraph/commit/75a51d8cb520e35918bd3a67a090a36d456b1797)

While we had lots of tests for converting the existing "completions"
site config data into the `modelconfig.ModelConfiguration` structure,
there were a couple of subtle bugs that I found while testing the larger
change.

The updated unit tests and comments should make that clear.

### [Update frontend/internal/httpapi/completions to use
modelconfig](https://github.com/sourcegraph/sourcegraph/commit/084793e08fca51a5ab84a7d73421d575caeebaa1)

The final step was to update the HTTP endpoints that serve the
completion requests. There weren't any logic changes here, just
refactoring how we lookup the required data. (e.g. converting the user's
requested model into an actual model found in the site configuration.)

We support Cody clients sending either "legacy mrefs" of the form
`provider/model` like before, or the newer mref
`provider::apiversion::model`. Although it will likely be a while before
Cody clients are updated to only use the newer-style model references.

The existing unit tests for the competitions APIs just worked, which was
the plan. But for the few changes that were required I've added comments
to explain the situation.

### [Fix: Support requesting models just by their
ID](https://github.com/sourcegraph/sourcegraph/pull/63797/commits/99715feba614230aa84cf94aae571adb96768035)

> ... We support Cody clients sending either "legacy mrefs" of the form
`provider/model` like before ...

Yeah, so apparently I lied 😅 . After doing more testing, the extension
_also_ sends requests where the requested model is just `"model"`.
(Without the provider prefix.)

So that now works too. And we just blindly match "gtp-3.5-turbo" to the
first mref with the matching model ID, such as
"anthropic::unknown::gtp-3.5-turbo".

## Test plan

Existing unit tests pass, added a few tests. And manually tested my Sg
instance configured to act as both "dotcom" mode and a prototypical Cody
Enterprise instance.

## Changelog

Update the Cody APIs for chat or code completions to use the "new style"
model configuration. This allows for great flexibility in configuring
LLM providers and exposing new models, but also allows Cody Enterprise
users to select different models for chats.

This will warrant a longer, more detailed changelog entry for the patch
release next week. As this unlocks many other exciting features.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants