Persist vMCP Session Metadata to Redis on Creation (RC-8) #4211

@yrobla

Description

Update defaultMultiSessionFactory.makeSession() in pkg/vmcp/session/factory.go to persist per-backend session IDs into the transport-layer session metadata when a new MultiSession is created. Specifically, this adds the new constant MetadataKeyBackendSessionPrefix and extends populateBackendMetadata (or an equivalent helper) to write MetadataKeyBackendSessionPrefix+workloadID → backend_session_id for each successfully connected backend, alongside the already-written MetadataKeyBackendIDs.

This task provides the data that downstream tasks (RC-9 / TASK-005 and RC-16 / TASK-007) depend on to reconstruct and clean up backend sessions across replicas: without per-backend session IDs in Redis, RestoreSession cannot pass meaningful backend_session_id hints to the connector when rebuilding a session on a cache miss.

Context

The vMCP horizontal scaling epic (THV-0047) externalizes session state to Redis so that any replica can serve any client request. makeSession already writes MetadataKeyBackendIDs (comma-separated workload IDs of connected backends) via populateBackendMetadata, and it already collects per-backend session IDs into the backendSessions map (local runtime use). However, per-backend session IDs are never written to the serializable transport-session metadata, meaning they are lost on LRU eviction.

RC-8 closes this gap: by writing vmcp.backend.session.{workloadID} into transport-session metadata during makeSession, the per-backend session IDs flow through to Redis (once RC-7 wires RedisStorage into the manager) and can later be retrieved by RestoreSession (RC-9) to reconnect backends.
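To make the key layout concrete, here is a minimal sketch of how a consumer such as RestoreSession might recover per-backend session IDs from a metadata map by prefix. The helper name backendSessionHints and the sample workload IDs are hypothetical; only the two key values come from this issue.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Key values as specified in this issue.
const (
	MetadataKeyBackendIDs           = "vmcp.backend.ids"
	MetadataKeyBackendSessionPrefix = "vmcp.backend.session."
)

// backendSessionHints is a hypothetical helper (not part of the codebase).
// It returns workloadID -> backend session ID for every metadata entry whose
// key starts with MetadataKeyBackendSessionPrefix.
func backendSessionHints(meta map[string]string) map[string]string {
	hints := make(map[string]string)
	for k, v := range meta {
		if strings.HasPrefix(k, MetadataKeyBackendSessionPrefix) {
			hints[strings.TrimPrefix(k, MetadataKeyBackendSessionPrefix)] = v
		}
	}
	return hints
}

func main() {
	// Sample metadata as it might look after makeSession; values are made up.
	meta := map[string]string{
		MetadataKeyBackendIDs:                      "github,slack",
		MetadataKeyBackendSessionPrefix + "github": "sess-a",
		MetadataKeyBackendSessionPrefix + "slack":  "sess-b",
	}

	hints := backendSessionHints(meta)

	// Sort workload IDs so the printed order is deterministic.
	ids := make([]string, 0, len(hints))
	for id := range hints {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		fmt.Printf("%s -> %s\n", id, hints[id])
	}
}
```

Note that "vmcp.backend.ids" does not share the "vmcp.backend.session." prefix, so a prefix scan of this kind does not pick up the existing MetadataKeyBackendIDs entry.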

Parent epic: stacklok/stacklok-epics#262

Dependencies: #266 (RC-6: Redis Storage Backend), #270 (RC-7: Wire Redis Backend Selection into session.Manager)
Blocks: TASK-004 (RC-15: Persist Hijack-Prevention State), TASK-005 (RC-9: Reconstruct Sessions on Cache Miss), TASK-007 (RC-16: Update Redis Metadata on Backend Session Expiry)

Acceptance Criteria

  • A new exported constant MetadataKeyBackendSessionPrefix = "vmcp.backend.session." is defined in pkg/vmcp/session/factory.go alongside the existing MetadataKeyBackendIDs
  • populateBackendMetadata (or an equivalent helper) writes MetadataKeyBackendSessionPrefix+workloadID → r.conn.SessionID() for each successfully initialized backend in the results slice
  • The per-backend session ID metadata entries are written as part of makeSession, before security.PreventSessionHijacking wraps the session
  • When no backends connect successfully, no MetadataKeyBackendSessionPrefix+* keys are written (consistent with the existing behavior for MetadataKeyBackendIDs)
  • When some backends fail to connect and others succeed, only entries for successful backends are written (partial initialization tolerance)
  • GetMetadata() on the returned MultiSession contains one entry per successful backend with the key "vmcp.backend.session.{workloadID}" and a non-empty value equal to the backend's reported session ID
  • MetadataKeyBackendIDs behavior is unchanged: still written as a comma-separated, sorted list of workload IDs
  • Unit tests added to pkg/vmcp/session/ verify that makeSession writes the correct per-backend session ID metadata entries (using a mock connector with a known SessionID())
  • Existing tests in pkg/vmcp/session/ continue to pass without modification
  • All tests pass (go test ./pkg/vmcp/session/...)
  • Code reviewed and approved

Technical Approach

Recommended Implementation

The change is localized to pkg/vmcp/session/factory.go. Two additions are required:

  1. Add the new constant in the const block alongside MetadataKeyBackendIDs:

    // MetadataKeyBackendSessionPrefix is the key prefix for per-backend session IDs.
    // Full key: MetadataKeyBackendSessionPrefix + workloadID → backend_session_id.
    MetadataKeyBackendSessionPrefix = "vmcp.backend.session."

  2. Extend populateBackendMetadata to also write the per-backend session IDs. The results slice is already sorted and each entry's r.conn.SessionID() returns the opaque backend session ID (set by mcpSession.backendSessionID):

    func populateBackendMetadata(transportSess transportsession.Session, results []initResult) {
        if len(results) > 0 {
            ids := make([]string, len(results))
            for i, r := range results {
                ids[i] = r.target.WorkloadID
                // Persist per-backend session ID so RestoreSession can reconnect with
                // the correct backend_session_id hint (RC-9).
                transportSess.SetMetadata(
                    MetadataKeyBackendSessionPrefix+r.target.WorkloadID,
                    r.conn.SessionID(),
                )
            }
            transportSess.SetMetadata(MetadataKeyBackendIDs, strings.Join(ids, ","))
        }
    }

    No other changes to makeSession are needed — populateBackendMetadata is already called after the results slice is built and sorted, and transportSess is already the transportsession.Session that gets embedded in defaultMultiSession and persisted by the transport-layer storage.

The backendSessions map on defaultMultiSession (runtime use) is unaffected by this change and does not need to be modified.
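The extended helper can be exercised in isolation with stand-in types. The sketch below is an assumption-laden illustration, not the real code: metadataSetter, mapSession, stubConn, and the trimmed-down target and initResult types are hypothetical substitutes for transportsession.Session, the backend connection, and the real initResult, which carry more than is shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// Key values as specified in this issue.
const (
	MetadataKeyBackendIDs           = "vmcp.backend.ids"
	MetadataKeyBackendSessionPrefix = "vmcp.backend.session."
)

// metadataSetter stands in for the part of transportsession.Session used here.
type metadataSetter interface {
	SetMetadata(key, value string)
}

// mapSession is a hypothetical in-memory session backed by a plain map.
type mapSession struct{ meta map[string]string }

func (s *mapSession) SetMetadata(k, v string) { s.meta[k] = v }

// stubConn stands in for a connected backend with a known session ID.
type stubConn struct{ id string }

func (c stubConn) SessionID() string { return c.id }

// target and initResult mirror only the fields the helper reads.
type target struct{ WorkloadID string }

type initResult struct {
	target target
	conn   stubConn
}

// populateBackendMetadata writes one per-backend session ID entry per result,
// then the comma-separated list of workload IDs. It writes nothing when
// results is empty.
func populateBackendMetadata(sess metadataSetter, results []initResult) {
	if len(results) == 0 {
		return
	}
	ids := make([]string, len(results))
	for i, r := range results {
		ids[i] = r.target.WorkloadID
		sess.SetMetadata(MetadataKeyBackendSessionPrefix+r.target.WorkloadID, r.conn.SessionID())
	}
	sess.SetMetadata(MetadataKeyBackendIDs, strings.Join(ids, ","))
}

func main() {
	sess := &mapSession{meta: map[string]string{}}
	populateBackendMetadata(sess, []initResult{
		{target: target{WorkloadID: "github"}, conn: stubConn{id: "sess-a"}},
		{target: target{WorkloadID: "slack"}, conn: stubConn{id: "sess-b"}},
	})
	fmt.Println(sess.meta[MetadataKeyBackendIDs])
	fmt.Println(sess.meta[MetadataKeyBackendSessionPrefix+"github"])
	fmt.Println(sess.meta[MetadataKeyBackendSessionPrefix+"slack"])
}
```

The early return is equivalent to the `if len(results) > 0` guard in the snippet above; it only reduces nesting.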

Patterns & Frameworks

  • Follow the existing pattern in populateBackendMetadata: iterate over results, use SetMetadata on the transport session; no new dependencies required
  • Keep metadata keys consistent with the vmcp.* namespace already used by MetadataKeyBackendIDs and MetadataKeyIdentitySubject
  • Use testify/assert and testify/require with t.Parallel() consistent with pkg/vmcp/session/token_binding_test.go
  • Use the nilBackendConnector / mock connector pattern established in token_binding_test.go and default_session_test.go — inject a mock backendConnector with a known SessionID() return value via newSessionFactoryWithConnector

Code Pointers

  • pkg/vmcp/session/factory.go — primary file to modify; add constant and update populateBackendMetadata; makeSession at line 367; populateBackendMetadata at line 354; constants block at line 26
  • pkg/vmcp/session/internal/backend/session.go: Session interface; SessionID() string is the method that returns the opaque backend session ID (line 54)
  • pkg/vmcp/session/internal/backend/mcp_session.go: mcpSession.SessionID() returns c.backendSessionID (line 92); this is the value that gets stored
  • pkg/vmcp/session/token_binding_test.go — existing test file in the same package; follow its structure (nilBackendConnector, newSessionFactoryWithConnector, table-driven sub-tests with t.Parallel()) for new unit tests
  • pkg/vmcp/session/default_session_test.go — additional test helpers (mockConnectedBackend with sessID field) that can be reused to provide a mock backend connector with a known SessionID() return value
  • pkg/transport/session/manager.go — shows how UpsertSession / AddSession calls storage.Store, which will persist the metadata map to Redis once RC-7 wiring is in place; no changes needed here

Component Interfaces

// pkg/vmcp/session/factory.go

const (
    // MetadataKeyBackendIDs — existing constant, unchanged
    MetadataKeyBackendIDs = "vmcp.backend.ids"

    // MetadataKeyBackendSessionPrefix is the key prefix for per-backend session IDs.
    // Full key: MetadataKeyBackendSessionPrefix + workloadID → backend_session_id.
    // Used by RestoreSession (RC-9) to reconnect backends with the correct session hint.
    MetadataKeyBackendSessionPrefix = "vmcp.backend.session."
)

// populateBackendMetadata: signature unchanged; implementation extended.
// Writes MetadataKeyBackendIDs and, for each result, MetadataKeyBackendSessionPrefix+workloadID.
func populateBackendMetadata(transportSess transportsession.Session, results []initResult)

No interface changes are required for this task. MultiSessionFactory and MultiSession interfaces are untouched.

Testing Strategy

Add a new test file pkg/vmcp/session/factory_metadata_test.go (or add to an existing file in the package), following the token_binding_test.go pattern.

Unit Tests

  • makeSession with two successful backends: verify GetMetadata() contains "vmcp.backend.session.{workloadID-1}" and "vmcp.backend.session.{workloadID-2}" with the expected session IDs (use mockConnectedBackend.sessID)
  • makeSession with zero successful backends (all return nil, nil, nil): verify no "vmcp.backend.session.*" keys are present in metadata
  • makeSession with partial backend failure (one succeeds, one fails): verify only the successful backend's per-backend session ID key is written; the failed backend's key is absent
  • makeSession still writes MetadataKeyBackendIDs correctly (sorted, comma-separated) alongside the new per-backend keys (regression check)
  • MetadataKeyBackendSessionPrefix constant has the expected value "vmcp.backend.session." (constant value guard)
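
The first three cases above share one assertion shape: the metadata must contain exactly the expected per-backend entries and no others under the prefix. A hedged sketch of a reusable check follows; checkBackendSessionMetadata is a hypothetical helper, and in the real tests the metadata map would come from GetMetadata() on the session returned by makeSession rather than from a literal.

```go
package main

import (
	"fmt"
	"strings"
)

// Key prefix as specified in this issue.
const MetadataKeyBackendSessionPrefix = "vmcp.backend.session."

// checkBackendSessionMetadata is a hypothetical test helper. It returns nil
// when meta holds exactly the per-backend entries in want (workloadID ->
// session ID) and no other keys under MetadataKeyBackendSessionPrefix.
func checkBackendSessionMetadata(meta, want map[string]string) error {
	// Every expected backend must have its key with the expected value.
	for workloadID, wantID := range want {
		key := MetadataKeyBackendSessionPrefix + workloadID
		got, ok := meta[key]
		if !ok {
			return fmt.Errorf("missing key %q", key)
		}
		if got != wantID {
			return fmt.Errorf("key %q: got %q, want %q", key, got, wantID)
		}
	}
	// No prefixed key may exist for a backend that was not expected.
	for key := range meta {
		if !strings.HasPrefix(key, MetadataKeyBackendSessionPrefix) {
			continue
		}
		if _, ok := want[strings.TrimPrefix(key, MetadataKeyBackendSessionPrefix)]; !ok {
			return fmt.Errorf("unexpected key %q", key)
		}
	}
	return nil
}

func main() {
	// Partial failure: only "github" connected, so "slack" must have no entry.
	partial := map[string]string{
		"vmcp.backend.ids":            "github",
		"vmcp.backend.session.github": "sess-a",
	}
	fmt.Println(checkBackendSessionMetadata(partial, map[string]string{"github": "sess-a"}))

	// Zero backends: no prefixed keys are expected at all.
	fmt.Println(checkBackendSessionMetadata(map[string]string{}, nil))
}
```

In a table-driven test, each case could supply its own want map and wrap the call in require.NoError, in keeping with the testify usage noted under Patterns & Frameworks.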

Integration Tests

  • None required at this layer; unit tests with mock connectors provide sufficient coverage

Edge Cases

  • Backend whose SessionID() returns an empty string: the key is still written with an empty value (no special casing; downstream RestoreSession handles empty hints gracefully)
  • Two backends with the same workload ID: the second write overwrites the first (last writer wins, consistent with SetMetadata behavior); this is a degenerate case that should not occur in production but must not panic
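
Both edge cases can be illustrated against a plain map, assuming SetMetadata behaves like a map assignment (which is what the last-writer-wins note above implies). The backendEntry type and writeBackendSessions function are hypothetical and exist only for this sketch.

```go
package main

import "fmt"

// Key prefix as specified in this issue.
const MetadataKeyBackendSessionPrefix = "vmcp.backend.session."

// backendEntry is a hypothetical pair of workload ID and reported session ID.
type backendEntry struct {
	workloadID string
	sessionID  string
}

// writeBackendSessions mirrors the per-backend write loop against a plain map.
// It applies no special casing: empty values are stored as-is and a repeated
// workload ID overwrites the earlier entry.
func writeBackendSessions(meta map[string]string, entries []backendEntry) {
	for _, e := range entries {
		meta[MetadataKeyBackendSessionPrefix+e.workloadID] = e.sessionID
	}
}

func main() {
	meta := map[string]string{}
	writeBackendSessions(meta, []backendEntry{
		{"github", ""},      // backend reports an empty session ID
		{"slack", "first"},  // duplicate workload ID, first write
		{"slack", "second"}, // duplicate workload ID, second write wins
	})

	v, ok := meta[MetadataKeyBackendSessionPrefix+"github"]
	fmt.Printf("github: present=%t value=%q\n", ok, v)
	fmt.Printf("slack: %q\n", meta[MetadataKeyBackendSessionPrefix+"slack"])
}
```

A test for the empty-value case would need to check key presence with the two-value map lookup, since comparing the value alone cannot distinguish an empty entry from a missing one.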

Out of Scope

  • Wiring RedisStorage into the vMCP server or calling NewManagerWithRedis (deferred)
  • Implementing RestoreSession (TASK-005 / RC-9)
  • Updating Redis metadata when backend sessions expire (TASK-007 / RC-16)
  • Re-applying HijackPreventionDecorator during RestoreSession (TASK-004 / RC-15)
  • Changes to pkg/transport/session or pkg/vmcp/server/sessionmanager
  • Aggregator state persistence

References

  • RFC THV-0047: Horizontal Scaling for vMCP and Proxy Runner (toolhive-rfcs#47)
  • Parent epic: stacklok/stacklok-epics#262
  • Upstream RC-6 (Redis Storage Backend): stacklok/stacklok-epics#266
  • Upstream RC-7 (Wire Redis Backend Selection): stacklok/stacklok-epics#270
  • makeSession and populateBackendMetadata: pkg/vmcp/session/factory.go
  • Backend SessionID() interface: pkg/vmcp/session/internal/backend/session.go
  • Test pattern reference: pkg/vmcp/session/token_binding_test.go

Labels: enhancement, scalability, vmcp
