fix autocomplete regression in main ("various" provider issue)#64165
Conversation
This fixes the current regression of autocomplete in `main`, which occurs both when (and sometimes when not) using the new `modelConfiguration` site config option. In 3cb1c45 (which lives only in `main`, not in any official releases - except those images that went to Self-hosted models EAP customers.) we had a regression where `CodyLLMConfiguration.provider` would return `"various"`. This change makes it so that we never return `"various"`, and instead only return provider values that would be accepted by the autocomplete provider tests of the client (`create-provider.test.ts`) This change was based purely on evaluating what the client actually does with this data, see the code comment (a story / explanation in itself) for details. Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
chrsmith
left a comment
There was a problem hiding this comment.
There are lots of things we could be doing instead to improve this situation, and or make this more predictable. But this is probably the best thing to do today.
chrsmith
left a comment
There was a problem hiding this comment.
Taking some more time to read the PR more carefully. This still seems like an improvement, but we'll probably want to revisit this as a P2 after the release to update both the Client-side to not rely on this data, and/or update the Server-side to try to be correct in more edge cases.
| // | ||
| // Cody clients currently rely on CodyLLMConfiguration, they shouldn't, they should | ||
| // use the information provided by the ModelsService and the /.api/client-config - | ||
| // both of which supersede this information. However, they use it today. |
There was a problem hiding this comment.
So even if the user is running on a more recent Cody client, and the Sourcegraph instance is exposing the /client-config API, we still rely on the CodyLLMConfiguration GraphQL endpoint for "something"?
Just so I understand what our overall end-goal, we want to ... not do that, right? And for the latest-and-greatest Cody clients, make sure they get all of their configuration/LLM model data from the /client-config and /supported-models.json endpoints. Right?
There was a problem hiding this comment.
So even if the user is running on a more recent Cody client, and the Sourcegraph instance is exposing the /client-config API, we still rely on the CodyLLMConfiguration GraphQL endpoint for "something"?
We still query for that information, and the code path for its usage could be a LOT more clear.. but I don't think we use it in this particular case, because of this part 1):
// The 'autocomplete provider name string' is determined by the following logic:
// 1) If the server has the new `modelConfiguration` in their site config, then the
// client will use the autocomplete `Model.provider` (provider ID, not name) field
// as the string.
Just so I understand what our overall end-goal, we want to ... not do that, right? And for the latest-and-greatest Cody clients, make sure they get all of their configuration/LLM model data from the /client-config and /supported-models.json endpoints. Right?
Absolutely, no doubt in my mind.
| } | ||
| return r.modelconfig.Providers[0].DisplayName | ||
| // Provider ID is the thing we have that maps closest to these strings and should be correct in the cases noted above. | ||
| return string(r.modelconfig.DefaultModels.CodeCompletion.ProviderID()) |
There was a problem hiding this comment.
Just thinking aloud:
This is probably going to work in the vast majority of situations. But since the Sourcegraph admin can configure things to use an arbitrary provider ID for their code completion model... this isn't "correct" in all cases.
And so from my understanding of your (excellent) write up, the ideal solution here would be to look at the actual modelconfig.Provider object for the default code completion model. And then inspect its ServerSideConfig object, and determine the exact API provider.
That would help that case. But then again, if the user can pick some other code completion provider, its would also break. (But like you called out in that comment, the Cody client shouldn't really be using this data. And so, that's more like some cleanup we can do as the /client-config and /supported-models.json stuff percolates?
There was a problem hiding this comment.
This is probably going to work in the vast majority of situations. But since the Sourcegraph admin can configure things to use an arbitrary provider ID for their code completion model... this isn't "correct" in all cases.
Yeah I agree. This should just fix the immediate regression, but is not the correct ultimate solution. If this endpoint returns something other than those strings I mention in the comment above, then autocomplete would not work today.
But like you called out in that comment, the Cody client shouldn't really be using this data. And so, that's more like some cleanup we can do as the /client-config and /supported-models.json stuff percolates?
Yeah, I think the ideal solution is that.. when using the new modelConfiguration site config (which also means they are using /.api/client-config), clients would not EVER make a GraphQL request for this CodyLLMConfiguration in the first place. This will require some careful client refactoring work.
Co-authored-by: Chris Smith <chrsmith@users.noreply.github.com>
…4276) This PR fixes Cody autocomplete in some situations, fixing [PRIME-426](https://linear.app/sourcegraph/issue/PRIME-426/autocomplete-completely-broken-in-main-with-and-sometimes-without-the). Our story begins with this PR: https://github.com/sourcegraph/sourcegraph/pull/63886. The PR updated the `CodyLLMConfigurationResolver.Provider` GraphQL endpoint to no longer return the "provider" that was put into the site configuration, but to instead return the _display name_ of the provider, or if multiple were configured the string "various". ```diff - func (c *codyLLMConfigurationResolver) Provider() string { - return string(c.config.Provider) - } + func (c *codyLLMConfigurationResolver) Provider() string { + if len(r.modelconfig.Providers) != 1 { + return "various" + } + return r.modelconfig.Providers[0].DisplayName +} ``` This change was wrong on several levels, and unfortunately stemmed from the original author (me) not tracking down and understanding how the GraphQL endpoint was actually used. Once we discovered the problem, we quickly rectified this by changing the behavior to the more correct version of returning the Provider ID of the code completion model with https://github.com/sourcegraph/sourcegraph/pull/64165: ```go func (r *codyLLMConfigurationResolver) Provider() string { return string(r.modelconfig.DefaultModels.CodeCompletion.ProviderID()) } ``` However, after some more testing we discovered yet another edge case. We didn't realize that the provider "sourcegraph" (which is how you configure Sourcegraph to use Cody Gateway, using the older site config method) was required in some scenarios on the Cody client. So the new logic, of returning the Provider ID was incorrect. (Because we report the _model_ provider, e.g. "anthropic". Not the _API_ provider, e.g. "sourcegraph"/"Cody Gateway".) With this change, we should just have the behavior we had initially: returning whatever the admin had configured in the "completions.provider" section of the site config. If only the newer modelconfig settings were used, we return "sourcegraph" if using Cody Gatway. And if not, just return the Provider ID. (Though in some situations even that will lead to incorrect results, because ultimately we need to update the client here.) ## Test plan I wasn't able to test this super-well manually, and am relying on Stephen and Taras' knowledge of how the client uses this today. ## Changelog NA
This fixes the current regression of autocomplete in
main, which occurs both when (and sometimes when not) using the newmodelConfigurationsite config option.In 3cb1c45 (which lives only in
main, not in any official releases - except those images that went to Self-hosted models EAP customers.) we had a regression whereCodyLLMConfiguration.providerwould return"various".This change makes it so that we never return
"various", and instead only return provider values that would be accepted by the autocomplete provider tests of the client (create-provider.test.ts)This change was based purely on evaluating what the client actually does with this data, see the code comment (a story / explanation in itself) for details.
Test plan
main- so this change (even if it turns out to be slightly incorrect) is a major improvement over the current state ofmain.Note: there is no changelog entry because this regression did not ship yet.
Changelog
N/A