Return 'sourcegraph' as the CodyLLMConfigurationResolver.Provider by chrsmith · Pull Request #64276 · sourcegraph/sourcegraph-public-snapshot

chrsmith · 2024-08-05T16:07:46Z

This PR fixes Cody autocomplete in some situations, fixing PRIME-426.

Our story begins with this PR: https://github.com/sourcegraph/sourcegraph/pull/63886. The PR updated the CodyLLMConfigurationResolver.Provider GraphQL endpoint to no longer return the "provider" that was put into the site configuration, but to instead return the display name of the provider, or if multiple were configured the string "various".

- func (c *codyLLMConfigurationResolver) Provider() string {
- 	return string(c.config.Provider)
- }
+ func (c *codyLLMConfigurationResolver) Provider() string {
+	if len(r.modelconfig.Providers) != 1 {
+		return "various"
+	}
+	return r.modelconfig.Providers[0].DisplayName
+}

This change was wrong on several levels, and unfortunately stemmed from the original author (me) not tracking down and understanding how the GraphQL endpoint was actually used.

Once we discovered the problem, we quickly rectified this by changing the behavior to the more correct version of returning the Provider ID of the code completion model with https://github.com/sourcegraph/sourcegraph/pull/64165:

func (r *codyLLMConfigurationResolver) Provider() string {
	return string(r.modelconfig.DefaultModels.CodeCompletion.ProviderID())
}

However, after some more testing we discovered yet another edge case. We didn't realize that the provider "sourcegraph" (which is how you configure Sourcegraph to use Cody Gateway, using the older site config method) was required in some scenarios on the Cody client.

So the new logic, of returning the Provider ID was incorrect. (Because we report the model provider, e.g. "anthropic". Not the API provider, e.g. "sourcegraph"/"Cody Gateway".)

With this change, we should just have the behavior we had initially: returning whatever the admin had configured in the "completions.provider" section of the site config. If only the newer modelconfig settings were used, we return "sourcegraph" if using Cody Gatway. And if not, just return the Provider ID. (Though in some situations even that will lead to incorrect results, because ultimately we need to update the client here.)

Test plan

I wasn't able to test this super-well manually, and am relying on Stephen and Taras' knowledge of how the client uses this today.

Changelog

NA

emidoots · 2024-08-05T21:16:33Z

I was able to confirm locally this appears to be returning the right results in the case of "completions" with a "sourcegraph" provider:

I wasn't able to confirm autocomplete actually works with that configuration locally, because of a different issue (probably the one you ran into, screenshot below) - but suspect it will work once deployed to non-dev environments.

taras-yemets · 2024-08-05T21:18:47Z

Thanks for looking into it, Chris!

To test these changes we need to do the following:

sourcegraph/sourcegraph
- Run sg start with the proper config in dev-private (we may start with the default one from the main branch).
sourcegraph/cody
- Make resolveDefaultModelFromVSCodeConfigOrFeatureFlags always return null (so that we only rely on the config from the server).
- Remove this.isLocalInstance check so that we don't use the fast path for "fireworks" (default provider) - it didn't work for me when connected ti the enterprise instance. I think found why, but anyway it's out of scope of this PR.
- Launch VS Code extension from the debug menu (F5 or "Launch VS Code extension (Desktop, recommended).
- In the new window opened by debugger ensure you're connected to the local instance and check if completions work.

taras-yemets

Based on my recollection and understanding of the client code, a model name needs to be prefixed with the provider only when using an umbrella provider like “sourcegraph”. If the “provider” field in the site configuration is set to a specific LLM provider, the model name is not prefixed with the provider name.

Consider these configs:

{ provider: "sourcegraph", model: "fireworks/starcoder"}

and

{provider: "anthropic", model: "claude-instant-v1"

I’m not entirely certain, but I suspect that if the provider is not “sourcegraph,” the model name should not include the provider prefix. In other words, when the provider is “anthropic,” for example, the client likely expects “claude-instant-v1” as the completionModel, rather than “anthropic/claude-instant-v1”.
So we may need to update the logic for the CompletionModel resolver.

taras-yemets · 2024-08-05T21:30:08Z

I wasn't able to confirm autocomplete actually works with that configuration locally, because of a different issue (probably the one you ran into, screenshot below) - but suspect it will work once deployed to non-dev environments.

@slimsag, removing isLocalInstance check https://github.com/sourcegraph/sourcegraph/pull/64276#issuecomment-2269936906 helped me when running this locally.

emidoots · 2024-08-05T21:35:53Z

+	// just use that. This matches the behavior prior to modelconfig being used, and will support
+	// the case where "sourcegraph" is the configured provider.
+	if r.doNotUseCompletionsConfig != nil {
+		return r.doNotUseCompletionsConfig.Provider


I think we may(?) have an issue here: if both "completions" and "modelConfiguration" are specified, then doNotUseCompletionsConfig would be != nil and we currently don't have any site config validation that screams about both being set as an error

You are right that that would be a problem, if you have both "completions" and "modelConfiguration" specified then if the two disagree about the API provider for code completion, we'd return the wrong one.

... but I'm not sure if we care enough to fix this? Or that this is expected to be a legitimate problem?

I think the right solution for this would be to add the site config validator to bark if you have provided two different configuration blobs for Cody Enterprise. However, I don't know if we want to make that change ~a day before cutting the August release.

So I'd be able to be convinced otherwise, but I think this fall into the "won't fix" category. What do you think?

Here's what I think we should do:

Do NOT introduce a site config validator for this case, because this complicates cloud rollout and is risky to do ~a day before the Aug release as you noted.

Make sure this condition properly respects "modelConfiguration" NOT "completions" if both are specified.

I think you can achieve this by changing the conditional:

-if r.doNotUseCompletionsConfig != nil { +if r.doNotUseCompletionsConfig != nil && !conf.UseExperimentalModelConfiguration() {

The reason I think we need to handle this now / in this PR is because this will introduce another variable - where before we can expect that if you have both specified, "completions" would be ignored.. after this PR if you have both specified 'weird things may happen' which is not good

emidoots · 2024-08-05T21:50:07Z

I’m not entirely certain, but I suspect that if the provider is not “sourcegraph,” the model name should not include the provider prefix. In other words, when the provider is “anthropic,” for example, the client likely expects “claude-instant-v1” as the completionModel, rather than “anthropic/claude-instant-v1”.
So we may need to update the logic for the CompletionModel resolver.

@taras-yemets good catch. I think(?) that may only affect older Cody clients connecting to newer Sourcegraph instances - which absolutely happen.

taras-yemets · 2024-08-05T21:57:05Z

I think(?) that only affect older Cody clients connecting to newer Sourcegraph instances

@slimsag, this branch of code is from current main. If I understand the parsing logic correctly, it's not able to parse { provider: "anthropic", modelId: "anthropic/claude-instant-v1" } properly: it'll just return the params as they are, and later the request to the upstream will probably fail as the model name/id "anthropic/claude-instant-v1" may not be found.

emidoots · 2024-08-05T22:41:43Z

@taras-yemets note parseProviderAndModel is only called if ModelsService.getDefaultModel(ModelUsage.Autocomplete) doesn't return a model (code). I need to check which condition that returns a model or not.. but I think it's an old vs. new server split?

taras-yemets · 2024-08-05T22:43:44Z

@slimsag, you're right! I missed that condition.

taras-yemets · 2024-08-05T22:46:59Z

Cody for VS Code has a special case (delimiter) for "aws-bedrock" provider, but the new LLM config resolution returns provider "anthropic" when "aws-bedrock" is set as provider in the site config, so that special case won't work on clients when connected to the Sourcegraph instance with the new version.

☝🏻 one more observation for the "old" way to set up Cody in the site config (from PRIME-426)

chrsmith · 2024-08-05T22:47:38Z

@taras-yemets , @slimsag it isn't clear from the product documentation. But it does make sense that if you are using "completions" { "provider": "sourcegraph" } then you would need to qualify model names, e.g. "chatModel": "anthropic/claude-v1"

https://sourcegraph.com/docs/cody/core-concepts/cody-gateway#configuring-custom-models

I'll update this PR to qualify model names in those situations... does that sound right to you two? Or perhaps we should start a Slack thread or something?

Ugh, yes. Here's a crystal clear example from what looks to be like HTTP traffic recordings in the cody repo.
https://github.com/sourcegraph/cody/blob/066d9c6ff48beb96a834f17021affc4e62094415/agent/recordings/autocomplete_2136217965/recording.har.yaml#L[…]15

I'll update that PR to just serve that older behavior.

If you are using the older "completions" config, we'll just return the provider or model name directly as-is. (So if you entered "anthropic/claude-v1" as the chat mode, that's what we send to clients.)
If you are using the newer "modelConfiguration" config, then it'll require some more effort. But until we update Cody clients to use the newer style model data natively, we will want to keep older versions of the Cody client working as-is.

emidoots

I think you're on the right track here. General approval to unblock, feel free to merge this whenever you please. Whatever you land on will be more correct than what is in main and will unblock testing, so I'm happy to review post-merge.

chrsmith · 2024-08-06T01:49:23Z

Merging to get these fixes in, but please take a look when you get in tomorrow @slimsag and @taras-yemets .

Rather than just complicate things, I just implemented two different GraphQL resolvers. One that matches the older behavior, which is used if you are only setting the older "completions" site config. (So there should not be any perceivable change for existing Cody Enterprise customers.)

The other resolver is only used if the "modelConfiguration" site config data is provided. And there we do all sorts of checks to refer to the API Provider (e.g. "aws-bedrock" or "sourcegraph" instead of the Model Provider like "anthropic".) Also, we return the model name instead of the model ID. Which is a small distinction, but hopefully something that will avoid errors because of a mismatch between what the client expects and the newer system.

I also added some unit tests to assert the behavior that we have in case we need to tweak this again in the future.

chrsmith · 2024-08-06T02:15:58Z

With the unit tests and some manual testing from Aravind's doc I feel good enough about this to submit. Will continue to do more testing and see if there are more corner cases to account for.

Return 'sourcegraph' as the CodyLLMConfigurationResolver.Provider

1afcd58

chrsmith requested review from emidoots and taras-yemets August 5, 2024 16:07

cla-bot Bot added the cla-signed label Aug 5, 2024

chrsmith and others added 2 commits August 5, 2024 09:13

Return 'sourcegraph' if using modelconfig, too

b8fb23e

Merge branch 'main' into chrsmith/PRIME-426/fix-autocomplete

1ca1496

taras-yemets reviewed Aug 5, 2024

View reviewed changes

emidoots reviewed Aug 5, 2024

View reviewed changes

emidoots approved these changes Aug 5, 2024

View reviewed changes

Just use two resolvers, add tests

db8c1d3

chrsmith merged commit a310035 into main Aug 6, 2024

chrsmith deleted the chrsmith/PRIME-426/fix-autocomplete branch August 6, 2024 02:15

Conversation

chrsmith commented Aug 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test plan

Changelog

Uh oh!

emidoots commented Aug 5, 2024

Uh oh!

taras-yemets commented Aug 5, 2024

Uh oh!

taras-yemets left a comment

Choose a reason for hiding this comment

Uh oh!

taras-yemets commented Aug 5, 2024

Uh oh!

emidoots Aug 5, 2024

Choose a reason for hiding this comment

Uh oh!

chrsmith Aug 5, 2024

Choose a reason for hiding this comment

Uh oh!

emidoots Aug 5, 2024

Choose a reason for hiding this comment

Uh oh!

emidoots commented Aug 5, 2024

Uh oh!

taras-yemets commented Aug 5, 2024

Uh oh!

emidoots commented Aug 5, 2024

Uh oh!

taras-yemets commented Aug 5, 2024

Uh oh!

taras-yemets commented Aug 5, 2024

Uh oh!

chrsmith commented Aug 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

emidoots left a comment

Choose a reason for hiding this comment

Uh oh!

chrsmith commented Aug 6, 2024

Uh oh!

chrsmith commented Aug 6, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

chrsmith commented Aug 5, 2024 •

edited

Loading

chrsmith commented Aug 5, 2024 •

edited

Loading