Skip to content

Enable multi-upstream for MCPServer, MCPRemoteProxy, and proxy runner#4322

Merged
jhrozek merged 7 commits intomainfrom
authserver-multi-upstream/3-enable
Mar 24, 2026
Merged

Enable multi-upstream for MCPServer, MCPRemoteProxy, and proxy runner#4322
jhrozek merged 7 commits intomainfrom
authserver-multi-upstream/3-enable

Conversation

@jhrozek
Copy link
Copy Markdown
Contributor

@jhrozek jhrozek commented Mar 23, 2026

Summary

The embedded auth server has supported sequential multi-upstream authorization chains since Phase 2, but every consumer layer (MCPServer controller, MCPRemoteProxy controller, proxy runner) blocked configs with more than one upstream provider. This PR lifts those restrictions and fixes the env var naming scheme to support multiple providers correctly.

  • The CRD now accepts multiple upstream providers at admission time with duplicate-name detection and DNS-label validation (MaxLength=63, RFC 1123 pattern)
  • Env var derivation switched from single TOOLHIVE_UPSTREAM_CLIENT_SECRET to name-keyed TOOLHIVE_UPSTREAM_CLIENT_SECRET_<PROVIDER>, making bindings position-independent across provider reordering
  • Runtime config validation enforces explicit naming for multi-upstream (empty names default to "default" only for single-upstream; "default" is reserved)
  • Upstream names validated against DNS-label regex to prevent storage key delimiter injection
  • Upstreamswap middleware log lines now include provider name for multi-upstream debugging

Fixes #4138

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

Changes

File Change
cmd/thv-operator/api/v1alpha1/mcpexternalauthconfig_types.go Remove single-upstream restriction, add duplicate-name + DNS-label validation
cmd/thv-operator/pkg/controllerutil/authserver.go Name-keyed env var binding via upstreamSecretBinding, multi-provider iteration
pkg/authserver/config.go Multi-upstream name validation, remove GetUpstream()
pkg/auth/upstreamswap/middleware.go Add provider name to structured log lines
deploy/charts/operator-crds/ Regenerated CRD manifests with Pattern + MaxLength
docs/operator/crd-api.md Regenerated API reference

Special notes for reviewers

  • The upstreamSecretBinding struct is the single source of truth for env var naming, shared between GenerateAuthServerEnvVars (Pod env) and buildUpstreamRunConfig (runtime config) to prevent desync.
  • The env var suffix is derived from provider Name (uppercased, hyphens→underscores). Since the CRD regex forbids underscores, the transform is injective — no two valid names can produce the same suffix.
  • Multi-upstream proxy runner swap middleware defaults to Upstreams[0] for now; explicit per-backend ProviderName selection is addressed in the follow-up branch (authserver-multi-upstream/4-tokens-in-identity).

Generated with Claude Code

@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Mar 23, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 76.78571% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.66%. Comparing base (4838bb4) to head (5da616b).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
pkg/authserver/config.go 55.00% 2 Missing and 16 partials ⚠️
...erator/api/v1alpha1/mcpexternalauthconfig_types.go 25.00% 1 Missing and 2 partials ⚠️
cmd/thv-operator/pkg/controllerutil/authserver.go 90.90% 1 Missing and 1 partial ⚠️
...-operator/controllers/mcpremoteproxy_controller.go 92.30% 0 Missing and 1 partial ⚠️
...d/thv-operator/controllers/mcpserver_controller.go 93.33% 0 Missing and 1 partial ⚠️
pkg/auth/upstreamswap/middleware.go 91.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4322      +/-   ##
==========================================
- Coverage   68.86%   68.66%   -0.21%     
==========================================
  Files         479      479              
  Lines       48505    48619     +114     
==========================================
- Hits        33404    33385      -19     
- Misses      12347    12359      +12     
- Partials     2754     2875     +121     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 24, 2026
tgrunnagle
tgrunnagle previously approved these changes Mar 24, 2026
@tgrunnagle
Copy link
Copy Markdown
Contributor

Question: Should there be some validation at the proxy runner and remote proxy controllers that restricts upstream providers to length = 1?

jhrozek and others added 5 commits March 24, 2026 16:30
Move multi-upstream restrictions from the authserver library to consumer
layers. The library now accepts multi-upstream configs but enforces name
semantics: single-upstream defaults empty names to "default", while
multi-upstream requires explicit non-"default" names with distinct error
messages for empty vs reserved names.

Validate upstream names against a DNS-label regex (no leading/trailing
hyphens, lowercase alphanumeric only) to prevent delimiter injection in
storage keys. Add test coverage for invalid name formats (uppercase,
underscores, leading/trailing hyphens).

Remove the GetUpstream() convenience method (no callers remain after
Phase 2). Add a cardinality guard in the proxy runner's Run() that
rejects len(Upstreams) > 1 with an actionable error pointing to
VirtualMCPServer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the len > 1 guard from MCPExternalAuthConfig.validateEmbeddedAuthServer()
so the CRD accepts multi-upstream configs. Add multi-upstream rejection to
MCPServer and MCPRemoteProxy controllers in handleExternalAuthConfig(), setting
a ConditionFalse status with reason MultiUpstreamNotSupported and an actionable
error directing users to VirtualMCPServer.

Add duplicate upstream name validation in the CRD webhook so conflicts
are caught at admission time rather than Pod startup. Tighten the Name
field pattern to disallow trailing hyphens and add MaxLength=63 for
RFC 1123 compliance.

VirtualMCPServer remains unrestricted as the intended multi-upstream consumer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The embedded auth server has supported sequential multi-upstream chains
since the Phase 2 work, but every consumer layer (MCPServer controller,
MCPRemoteProxy controller, proxy runner) blocked configs with more than
one upstream provider. This commit lifts all those restrictions and fixes
the two bugs that would have broken multi-upstream even without the guards.

Guard removal:
- MCPServer controller: remove len > 1 check in handleExternalAuthConfig
- MCPRemoteProxy controller: same removal
- Proxy runner Run(): remove len > 1 early-return
- Remove associated condition type/reason constants and rejection tests

Converter fix (authserver.go):
- GenerateAuthServerEnvVars now iterates all providers and emits
  name-keyed env vars (TOOLHIVE_UPSTREAM_CLIENT_SECRET_OKTA, _GITHUB, …)
  derived from each provider's Name field, instead of only reading
  UpstreamProviders[0] into a single unindexed var. Name-keyed bindings
  are position-independent, so reordering providers in the CRD does not
  change the secret-to-provider mapping.
- buildEmbeddedAuthServerRunnerConfig now builds Upstreams from all
  providers instead of only the first; buildUpstreamRunConfig gains an
  envVarName parameter to match the env var naming

Middleware fix (middleware.go):
- Restore ProviderName auto-derivation in addUpstreamSwapMiddleware,
  defaulting to the first upstream's name for single-upstream configs.
  Multi-upstream ProviderName selection will be addressed when a CRD
  field is added in a follow-up.

Observability (upstreamswap/middleware.go):
- Add provider field to all upstreamswap log lines (injection success,
  empty-token warning, get-tokens error) to make confused-deputy
  debugging tractable when multiple providers are in play.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract validateUpstreamName and validateUpstreamType from
validateUpstreams to bring cyclomatic complexity under the limit.
Break long fmt.Errorf line that exceeded 130 characters.

Update stale godoc comments in controllerutil/authserver.go that
still referenced the old index-based env var naming scheme.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Client-attributable errors (session not found, no refresh token, refresh
failed, invalid binding) are expected conditions that trigger re-auth,
not server issues an operator needs to investigate. Log these at Debug
instead of Warn; keep Warn for unexpected server-side failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jhrozek
Copy link
Copy Markdown
Contributor Author

jhrozek commented Mar 24, 2026

Question: Should there be some validation at the proxy runner and remote proxy controllers that restricts upstream providers to length = 1?

Don't we want to extend to 1 or 2? I'd like to enable the multi-hop through corporate IDPs.

You are right though that we should add that as a separate commit with separate tests and docs. I'm adding a new patch that re-adds the guards.

@jhrozek jhrozek force-pushed the authserver-multi-upstream/3-enable branch from f86c222 to 0c9ef62 Compare March 24, 2026 17:14
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed size/L Large PR: 600-999 lines changed labels Mar 24, 2026
The upstream swap middleware only injects one provider's token, and
there is no CRD field to control which provider. Until tokenInjection-
Provider is added and Cedar upstream claims extraction is implemented,
multi-upstream configs would silently inject only the first upstream's
token while the second is stored but unused.

Re-add controller-level guards (MCPServer, MCPRemoteProxy, proxy
runner) that reject len > 1 with actionable errors directing users to
VirtualMCPServer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jhrozek jhrozek force-pushed the authserver-multi-upstream/3-enable branch from 0c9ef62 to cdf5377 Compare March 24, 2026 17:47
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Mar 24, 2026
tgrunnagle
tgrunnagle previously approved these changes Mar 24, 2026
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Mar 24, 2026
@jhrozek jhrozek merged commit 074326e into main Mar 24, 2026
72 of 77 checks passed
@jhrozek jhrozek deleted the authserver-multi-upstream/3-enable branch March 24, 2026 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move multi-upstream restriction from authserver library to consumer layers (RFC-0052 Phase 3)

2 participants