Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

fix autocomplete regression in main ("various" provider issue)#64165

Merged
emidoots merged 3 commits into
mainfrom
sg/autocomplete
Jul 31, 2024
Merged

fix autocomplete regression in main ("various" provider issue)#64165
emidoots merged 3 commits into
mainfrom
sg/autocomplete

Conversation

@emidoots

Copy link
Copy Markdown
Member

This fixes the current regression of autocomplete in main, which occurs both when (and sometimes when not) using the new modelConfiguration site config option.

In 3cb1c45 (which lives only in main, not in any official releases - except those images that went to Self-hosted models EAP customers.) we had a regression where CodyLLMConfiguration.provider would return "various".

This change makes it so that we never return "various", and instead only return provider values that would be accepted by the autocomplete provider tests of the client (create-provider.test.ts)

This change was based purely on evaluating what the client actually does with this data, see the code comment (a story / explanation in itself) for details.

Test plan

  • I am 100% confident the old "various" behavior is absolutely incorrect, and currently has broken autocomplete in main - so this change (even if it turns out to be slightly incorrect) is a major improvement over the current state of main.
  • After merge, I'd like to confirm with a good amount of manual testing that the various autocomplete providers work as expected. @taras-yemets maybe you can help out with this testing?

Note: there is no changelog entry because this regression did not ship yet.

Changelog

N/A

This fixes the current regression of autocomplete in `main`, which occurs
both when (and sometimes when not) using the new `modelConfiguration`
site config option.

In 3cb1c45 (which lives only in `main`,
not in any official releases - except those images that went to Self-hosted
models EAP customers.) we had a regression where `CodyLLMConfiguration.provider`
would return `"various"`.

This change makes it so that we never return `"various"`, and instead only
return provider values that would be accepted by the autocomplete provider
tests of the client (`create-provider.test.ts`)

This change was based purely on evaluating what the client actually does
with this data, see the code comment (a story / explanation in itself)
for details.

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
@emidoots emidoots requested review from a team and chrsmith July 30, 2024 22:40
@cla-bot cla-bot Bot added the cla-signed label Jul 30, 2024

@chrsmith chrsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐲 🔥 🐉 ☠️

@chrsmith chrsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lots of things we could be doing instead to improve this situation, and or make this more predictable. But this is probably the best thing to do today.

@chrsmith chrsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking some more time to read the PR more carefully. This still seems like an improvement, but we'll probably want to revisit this as a P2 after the release to update both the Client-side to not rely on this data, and/or update the Server-side to try to be correct in more edge cases.

//
// Cody clients currently rely on CodyLLMConfiguration, they shouldn't, they should
// use the information provided by the ModelsService and the /.api/client-config -
// both of which supersede this information. However, they use it today.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So even if the user is running on a more recent Cody client, and the Sourcegraph instance is exposing the /client-config API, we still rely on the CodyLLMConfiguration GraphQL endpoint for "something"?

Just so I understand what our overall end-goal, we want to ... not do that, right? And for the latest-and-greatest Cody clients, make sure they get all of their configuration/LLM model data from the /client-config and /supported-models.json endpoints. Right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So even if the user is running on a more recent Cody client, and the Sourcegraph instance is exposing the /client-config API, we still rely on the CodyLLMConfiguration GraphQL endpoint for "something"?

We still query for that information, and the code path for its usage could be a LOT more clear.. but I don't think we use it in this particular case, because of this part 1):

// The 'autocomplete provider name string' is determined by the following logic:
// 1) If the server has the new `modelConfiguration` in their site config, then the
// client will use the autocomplete `Model.provider` (provider ID, not name) field
// as the string.

Just so I understand what our overall end-goal, we want to ... not do that, right? And for the latest-and-greatest Cody clients, make sure they get all of their configuration/LLM model data from the /client-config and /supported-models.json endpoints. Right?

Absolutely, no doubt in my mind.

Comment thread cmd/frontend/internal/modelconfig/resolver.go Outdated
}
return r.modelconfig.Providers[0].DisplayName
// Provider ID is the thing we have that maps closest to these strings and should be correct in the cases noted above.
return string(r.modelconfig.DefaultModels.CodeCompletion.ProviderID())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking aloud:

This is probably going to work in the vast majority of situations. But since the Sourcegraph admin can configure things to use an arbitrary provider ID for their code completion model... this isn't "correct" in all cases.

And so from my understanding of your (excellent) write up, the ideal solution here would be to look at the actual modelconfig.Provider object for the default code completion model. And then inspect its ServerSideConfig object, and determine the exact API provider.

That would help that case. But then again, if the user can pick some other code completion provider, its would also break. (But like you called out in that comment, the Cody client shouldn't really be using this data. And so, that's more like some cleanup we can do as the /client-config and /supported-models.json stuff percolates?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably going to work in the vast majority of situations. But since the Sourcegraph admin can configure things to use an arbitrary provider ID for their code completion model... this isn't "correct" in all cases.

Yeah I agree. This should just fix the immediate regression, but is not the correct ultimate solution. If this endpoint returns something other than those strings I mention in the comment above, then autocomplete would not work today.

But like you called out in that comment, the Cody client shouldn't really be using this data. And so, that's more like some cleanup we can do as the /client-config and /supported-models.json stuff percolates?

Yeah, I think the ideal solution is that.. when using the new modelConfiguration site config (which also means they are using /.api/client-config), clients would not EVER make a GraphQL request for this CodyLLMConfiguration in the first place. This will require some careful client refactoring work.

Stephen Gutekanst and others added 2 commits July 30, 2024 20:56
Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
@emidoots emidoots enabled auto-merge (squash) July 31, 2024 04:03
@emidoots emidoots merged commit 01bb57c into main Jul 31, 2024
@emidoots emidoots deleted the sg/autocomplete branch July 31, 2024 04:13
chrsmith referenced this pull request Aug 6, 2024
…4276)

This PR fixes Cody autocomplete in some situations, fixing
[PRIME-426](https://linear.app/sourcegraph/issue/PRIME-426/autocomplete-completely-broken-in-main-with-and-sometimes-without-the).

Our story begins with this PR:
https://github.com/sourcegraph/sourcegraph/pull/63886. The PR updated
the `CodyLLMConfigurationResolver.Provider` GraphQL endpoint to no
longer return the "provider" that was put into the site configuration,
but to instead return the _display name_ of the provider, or if multiple
were configured the string "various".

```diff
- func (c *codyLLMConfigurationResolver) Provider() string {
- 	return string(c.config.Provider)
- }
+ func (c *codyLLMConfigurationResolver) Provider() string {
+	if len(r.modelconfig.Providers) != 1 {
+		return "various"
+	}
+	return r.modelconfig.Providers[0].DisplayName
+}
```

This change was wrong on several levels, and unfortunately stemmed from
the original author (me) not tracking down and understanding how the
GraphQL endpoint was actually used.

Once we discovered the problem, we quickly rectified this by changing
the behavior to the more correct version of returning the Provider ID of
the code completion model with
https://github.com/sourcegraph/sourcegraph/pull/64165:

```go
func (r *codyLLMConfigurationResolver) Provider() string {
	return string(r.modelconfig.DefaultModels.CodeCompletion.ProviderID())
}
```

However, after some more testing we discovered yet another edge case. We
didn't realize that the provider "sourcegraph" (which is how you
configure Sourcegraph to use Cody Gateway, using the older site config
method) was required in some scenarios on the Cody client.

So the new logic, of returning the Provider ID was incorrect. (Because
we report the _model_ provider, e.g. "anthropic". Not the _API_
provider, e.g. "sourcegraph"/"Cody Gateway".)

With this change, we should just have the behavior we had initially:
returning whatever the admin had configured in the
"completions.provider" section of the site config. If only the newer
modelconfig settings were used, we return "sourcegraph" if using Cody
Gatway. And if not, just return the Provider ID. (Though in some
situations even that will lead to incorrect results, because ultimately
we need to update the client here.)

## Test plan

I wasn't able to test this super-well manually, and am relying on
Stephen and Taras' knowledge of how the client uses this today.

## Changelog

NA
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants