feat/cody-gateway: support wildcard models#62909
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. Join @bobheadxi and the rest of your teammates on |
acb70df to
60a9db4
Compare
There was a problem hiding this comment.
Seeing that this is upstream.go, which already does many things and has a lot of dependencies, I think it would be good to hire the "*" as an implementation detail in limits.go (say, add limits.IsWildCard())?
There was a problem hiding this comment.
Or, move the whole "is this prefixed or not" logic to a single place, handle wildcard and prefixing there?
There was a problem hiding this comment.
The docstring says this is TEMPORARY, so I think this was only meant to tide us over until the cache state migrates... ideally, this whole block of code should be removed. But, it doesn't feel appropriate to do in this PR 😅
In light of this being a one-off thing, would you be okay with keeping this here? I can add this as a docstring as well. My intent is to discourage this type of checking because the actor model list should always be prefixed, and we should never manipulate the actor models list ourselves, instead depending only on EvaluateAllowedModels to do the check
f911fcc to
34453c5
Compare
3a7b6a4 to
90ae50d
Compare
90ae50d to
0f45af8
Compare
|
Merging this first to start working my way through landing this large stack of PRs, as I shared in Slack: https://sourcegraph.slack.com/archives/C05SZB829D0/p1717173273215949?thread_ts=1715638980.052279&cid=C05SZB829D0 This is low-risk as it doesn't include any real functionality changes yet, just prepares them, as outlined in PR description. Happy to address any post-merge reviews :) |
This change makes Cody Gateway always apply a wildcard model allowlist, irrespective of what the configured model allowlist is for an Enterprise subscription is in dotcom (see #62909). The next PR in the stack, https://github.com/sourcegraph/sourcegraph/pull/62912, makes the GraphQL queries return similar results, and removes model allowlists from the subscription management UI. Closes https://linear.app/sourcegraph/issue/CORE-135 ### Context In https://sourcegraph.slack.com/archives/C05SZB829D0/p1715638980052279 we shared a decision we landed on as part of #62263: > Ignoring (then removing) per-subscription model allowlists: As part of the API discussions, we've also surfaced some opportunities for improvements - to make it easier to roll out new models to Enterprise, we're not including per-subscription model allowlists in the new API, and as part of the Cody Gateway migration (by end-of-June), we will update Cody Gateway to stop enforcing per-subscription model allowlists. Cody Gateway will still retain a Cody-Gateway-wide model allowlist. [@chrsmith](https://sourcegraph.slack.com/team/U061QHKUBJ8) is working on a broader design here and will have more to share on this later. This means there is one less thing for us to migrate as part of https://github.com/sourcegraph/sourcegraph/pull/62934, and avoids the need to add an API field that will be removed shortly post-migration. As part of this, rolling out new models to Enterprise customers no longer require additional code/override changes. ## Test plan Set up Cody Gateway locally as documented, then `sg start dotcom`. Set up an enterprise subscription + license with a high seat count (for a high quota), and force a Cody Gateway sync: ``` curl -v -H 'Authorization: bearer sekret' http://localhost:9992/-/actor/sync-all-sources ``` Verify we are using wildcard allowlist: ```sh $ redis-cli -p 6379 get 'v2:product-subscriptions:v2:slk_...' "{\"key\":\"slk_...\",\"id\":\"6ad033f4-c6da-43a9-95ef-f653bf59aaac\",\"name\":\"bobheadxi\",\"accessEnabled\":true,\"endpointAccess\":{\"/v1/attribution\":true},\"rateLimits\":{\"chat_completions\":{\"allowedModels\":[\"*\"],\"limit\":660,\"interval\":86400000000000,\"concurrentRequests\":330,\"concurrentRequestsInterval\":10000000000},\"code_completions\":{\"allowedModels\":[\"*\"],\"limit\":66000,\"interval\":86400000000000,\"concurrentRequests\":33000,\"concurrentRequestsInterval\":10000000000},\"embeddings\":{\"allowedModels\":[\"*\"],\"limit\":220000000,\"interval\":86400000000000,\"concurrentRequests\":110000000,\"concurrentRequestsInterval\":10000000000}},\"lastUpdated\":\"2024-05-24T20:28:58.283296Z\"}" ``` Using the local enterprise subscription's access token, we run the QA test suite: ```sh $ bazel test --runs_per_test=2 --test_output=all //cmd/cody-gateway/qa:qa_test --test_env=E2E_GATEWAY_ENDPOINT=http://localhost:9992 --test_env=E2E_GATEWAY_TOKEN=$TOKEN INFO: Analyzed target //cmd/cody-gateway/qa:qa_test (0 packages loaded, 0 targets configured). INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 1 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 1 of 2): PASS ================================================================================ INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 2 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 2 of 2): PASS ================================================================================ INFO: Found 1 test target... Target //cmd/cody-gateway/qa:qa_test up-to-date: bazel-bin/cmd/cody-gateway/qa/qa_test_/qa_test Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) INFO: Elapsed time: 13.653s, Critical Path: 13.38s INFO: 7 processes: 1 internal, 6 darwin-sandbox. INFO: Build completed successfully, 7 total actions //cmd/cody-gateway/qa:qa_test PASSED in 11.7s Stats over 2 runs: max = 11.7s, min = 11.7s, avg = 11.7s, dev = 0.0s Executed 1 out of 1 test: 1 test passes. ```
…ns (#62934) Migrates Cody Gateway to use the new Enterprise Portal's "read-only" APIs. For the most part, this is an in-place replacement - a lot of the diff is in testing and minor changes. Some changes, such as the removal of model allowlists, were made down the PR stack in https://github.com/sourcegraph/sourcegraph/pull/62911. At a high level, we replace the data requested by `cmd/cody-gateway/internal/dotcom/operations.graphql` and replace it with Enterprise Portal RPCs: - `codyaccessv1.GetCodyGatewayAccess` - `codyaccessv1.ListCodyGatewayAccesses` Use cases that previously required retrieving the active license tags now: 1. Use the display name provided by the Cody Access API https://github.com/sourcegraph/sourcegraph/pull/62968 2. Depend on the connected Enterprise Portal dev instance to only return dev subscriptions https://github.com/sourcegraph/sourcegraph/pull/62966 Closes https://linear.app/sourcegraph/issue/CORE-98 Related to https://linear.app/sourcegraph/issue/CORE-135 (https://github.com/sourcegraph/sourcegraph/pull/62909, https://github.com/sourcegraph/sourcegraph/pull/62911) Related to https://linear.app/sourcegraph/issue/CORE-97 ## Local development This change also adds Enterprise Portal to `sg start dotcom`. For local development, we set up Cody Gateway to connect to Enterprise Portal such that zero configuration is needed - all the required secrets are sourced from the `sourcegrah-local-dev` GCP project automatically when you run `sg start dotcom`, and local Cody Gateway will talk to local Enterprise Portal to do the Enterprise subscriptions sync. This is actually an upgrade from the current experience where you need to provide Cody Gateway a Sourcegraph user access token to test Enterprise locally, though the Sourcegraph user access token is still required for the PLG actor source. The credential is configured in https://console.cloud.google.com/security/secret-manager/secret/SG_LOCAL_DEV_SAMS_CLIENT_SECRET/overview?project=sourcegraph-local-dev, and I've included documentation in the secret annotation about what it is for and what to do with it:  ## Rollout plan I will open PRs to set up the necessary configuration for Cody Gateway dev and prod. Once reviews taper down I'll cut an image from this branch and deploy it to Cody Gateway dev, and monitor it closely + do some manual testing. Once verified, I'll land this change and monitor a rollout to production. Cody Gateway dev SAMS client: sourcegraph/infrastructure#6108 Cody Gateway prod SAMS client update (this one already exists): ``` accounts=> UPDATE idp_clients SET scopes = scopes || '["enterprise_portal::subscription::read", "enterprise_portal::codyaccess::read"]'::jsonb WHERE id = 'sams_cid_018ea062-479e-7342-9473-66645e616cbf'; UPDATE 1 accounts=> select name, scopes from idp_clients WHERE name = 'Cody Gateway (prod)'; name | scopes ---------------------+---------------------------------------------------------------------------------------------------------------------------------- Cody Gateway (prod) | ["openid", "profile", "email", "offline_access", "enterprise_portal::subscription::read", "enterprise_portal::codyaccess::read"] (1 row) ``` Configuring the target Enterprise Portal instances: sourcegraph/infrastructure#6127 ## Test plan Start the new `dotcom` runset, now including Enterprise Portal, and observe logs from both `enterprise-portal` and `cody-gateway`: ``` sg start dotcom ``` I reused the test plan from https://github.com/sourcegraph/sourcegraph/pull/62911: set up Cody Gateway external dependency secrets, then set up an enterprise subscription + license with a high seat count (for a high quota), and force a Cody Gateway sync: ``` curl -v -H 'Authorization: bearer sekret' http://localhost:9992/-/actor/sync-all-sources ``` This should indicate the new sync against "local dotcom" fetches the correct number of actors and whatnot. Using the local enterprise subscription's access token, we run the QA test suite: ```sh $ bazel test --runs_per_test=2 --test_output=all //cmd/cody-gateway/qa:qa_test --test_env=E2E_GATEWAY_ENDPOINT=http://localhost:9992 --test_env=E2E_GATEWAY_TOKEN=$TOKEN INFO: Analyzed target //cmd/cody-gateway/qa:qa_test (0 packages loaded, 0 targets configured). INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 1 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 1 of 2): PASS ================================================================================ INFO: From Testing //cmd/cody-gateway/qa:qa_test (run 2 of 2): ==================== Test output for //cmd/cody-gateway/qa:qa_test (run 2 of 2): PASS ================================================================================ INFO: Found 1 test target... Target //cmd/cody-gateway/qa:qa_test up-to-date: bazel-bin/cmd/cody-gateway/qa/qa_test_/qa_test Aspect @@rules_rust//rust/private:clippy.bzl%rust_clippy_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) Aspect @@rules_rust//rust/private:rustfmt.bzl%rustfmt_aspect of //cmd/cody-gateway/qa:qa_test up-to-date (nothing to build) INFO: Elapsed time: 13.653s, Critical Path: 13.38s INFO: 7 processes: 1 internal, 6 darwin-sandbox. INFO: Build completed successfully, 7 total actions //cmd/cody-gateway/qa:qa_test PASSED in 11.7s Stats over 2 runs: max = 11.7s, min = 11.7s, avg = 11.7s, dev = 0.0s Executed 1 out of 1 test: 1 test passes. ```

In https://sourcegraph.slack.com/archives/C05SZB829D0/p1715638980052279 we shared a decision we landed on as part of #62263:
To support this, we first need to extend Cody Gateway's model allowlist enforcement to respect a notion of "allow all models that are allowed in Cody Gateway". To ensure models are explicitly provided today, an empty
AllowedModelsis considered invalid, so we add a special single-element-slice-*configuration that can be used to indicate an actor's rate limit allows all models (prefixedMasterAllowlist).This change also unifies somewhat the way we enforce allowed models in various places by introducing
(*RateLimit).EvaluateAllowedModels(...)as the unified way to construct the final allowlist for a given rate limit.I'm planning to roll this out before rolling out actual functionality changes (https://github.com/sourcegraph/sourcegraph/pull/62911) to ensure changes in cached rate limits don't end up confusing an older revision of Cody Gateway that doesn't yet support wildcard models. With #62911, rolling out new models to Enterprise customers no longer require additional code/override changes.
Part of https://linear.app/sourcegraph/issue/CORE-135
Test plan
Unit tests, and E2E test of this in https://github.com/sourcegraph/sourcegraph/pull/62911