-
Notifications
You must be signed in to change notification settings - Fork 198
Simplify VMCP Configuration #3125
Description
Summary
VirtualMCPServer configuration is defined at least thrice: once in the CRD spec (cmd/thv-operator/api/v1alpha1/), again in the server config (pkg/vmcp/config/), and once more in pkg/vmcp/config/yaml_loader.go. Adding or modifying any configurable feature requires touching multiple layers, with bugs at any step going undetected until production.
Motivation
Developer Velocity
Every new configurable feature must traverse the entire configuration pipeline: CRD type → Converter → Server Config → Execution. This multiplies the effort required for even simple additions and creates friction for contributors.
Quality / Bug Prevention
Missing or incorrect transformations at any step result in silent configuration bugs. Real examples:
- Telemetry configuration: Worked in integration tests but failed with CRD due to broken rawConfig conversion
- PR Composite Tools: Fix plumbing for on-error-continue config #3118:
on_error.action: continuewas silently ignored due to broken config-to-execution plumbing
Integration tests pass because they configure execution-layer data structures directly, bypassing transformation layers entirely.
Confusing Documentation
The current architecture produces two different configuration formats—YAML tags for CLI (snake_case) vs CRD spec (camelCase). Documenting the same configuration twice with subtle differences creates user confusion and maintenance burden. See PR #3070 which fixed documentation that incorrectly mixed these conventions.
Higher Value, Lower Cost Testing
Issue #2852 proposes moving e2e tests to integration tests. This is blocked by the current architecture—integration tests that configure data structures directly miss config plumbing bugs. Simplifying the types makes this transition safe.
API Extensibility
Fewer, more powerful data structures describing vmcp configuration make it easier to eventually support mutating and querying this data via API.
Current Architecture (Problem)
Configuration flows through multiple transformations:
mcpv1alpha1.VirtualMCPServer → vmcpconfig.Config -> rawConfig during unmarshall -> vmcpconfig.Config again → Data Structures Used in Execution
Key problematic areas:
- Duplicative type definitions:
VirtualMCPServerSpec(CRD) andConfig(server) define the same concepts with different types - Manual conversion:
cmd/thv-operator/pkg/vmcpconfig/converter.go(900+ lines) manually maps CRD types to config types - YAML loader duplication:
pkg/vmcp/config/yaml_loader.gohas its ownrawConfigtypes that mirror the CRD types - Inconsistent validation: CRD webhook validation differs from server startup validation
Proposed Solution
Make the CRD Spec embed the server config type as a RawExtension.
The approach:
- Define a canonical
pkg/vmcp/config.Configtype as the single source of truth for the CRD-to-ConfigMap path - The CRD
VirtualMCPServerSpec.Configfield is aruntime.RawExtensioncontaining the JSON-serializedvmcpconfig.Config - Webhook validation unmarshals the JSON and applies the same validation as server startup
- The k8s operator continues to write its config to the same exact location, we just change the type.
Example CRD structure after change:
apiVersion: mcp.toolhive.io/v1alpha1
kind: VirtualMCPServer
metadata:
name: my-vmcp
spec:
# k8s config is not included in the config object.
podTemplateSpec: ...
serviceType: ClusterIP
# Server configuration uses the canonical config type
config:
groupRef:
name: my-group
incoming_auth:
type: oidc
oidc:
issuer: https://...
composite_tools:
- name: my-tool
steps: [...]Acceptance Criteria
- The
rawConfigtypes inpkg/vmcp/config/yaml_loader.goare deleted - The duplicative CRD spec fields are consolidated into a single
configfield usingRawExtension. - CRD webhook validation uses the exact same validation as server startup (
pkg/vmcp/config.Validator) - The 900+ line converter (
cmd/thv-operator/pkg/vmcpconfig/converter.go) is significantly reduced or eliminated - Documentation references a single configuration schema, not platform-specific variants
- E2e tests validating config transformations can be safely converted to integration tests with no loss of coverage
Out of Scope
- Changes to K8s-specific fields (serviceType, podTemplateSpec)
- Changes to the vmcp server runtime behavior
- MCPServer CRD (focus is on VirtualMCPServer)
- Reducing duplicative data types within the server itself (e.g.,
config.Config→ execution-layer types). This is a separate problem that should be addressed independently. PR Composite Tools: Fix plumbing for on-error-continue config #3118's bug was actually in this layer, not the CRD conversion layer.
Tasks
- Delete the rawConfig.
- Modify the shape of the config.Config to be k8s friendly, including modifying the yaml and json tags to use camelcase instead of snake case.
- Pull out the unique functionality from VirtualMCPServerSpec (i.e. discovery/resolution bits).
- Replace the remainder of the config with config.Config. Make minor kubebuilder annotations and tweaks of config.Config to satisfy generation.
- Update docs.
Related
- PR Composite Tools: Fix plumbing for on-error-continue config #3118: Example of config plumbing bug (in the config→execution layer)
- PR Fix YAML docs #3070: Documentation fix for mixed YAML field naming conventions
- Issue Split vMCP e2e tests into e2e and integration test layers #2852: Blocked on this work for safe test migration
- Issue Split vMCP e2e tests into e2e and integration test layers #2852 comment: Split vMCP e2e tests into e2e and integration test layers #2852 (comment)