[FEATURE]: Session affinity for stateful MCP workflows (REQ-005) #1986
Description
🔗 Feature: Session Affinity for Stateful MCP Workflows (REQ-005)
Goal
Bind all user interactions within a session to a single upstream MCP server instance. This enables stateful agentic workflows and elicitation flows where upstream servers maintain session state across multiple tool calls.
Why Now?
- Stateful Agents: AI agents increasingly need to maintain conversation state across tool calls
- Elicitation Routing: Elicitation requests must reach the correct user session
- Multi-Worker Deployments: Horizontal scaling breaks session locality without affinity
- MCP Server State: Some MCP servers store context that must persist across calls
📖 User Stories
US-1: AI Agent - Maintain Context Across Tool Calls
As an AI Agent executing multi-step workflows
I want all my tool calls routed to the same upstream MCP server
So that the server can maintain state between calls
Acceptance Criteria:
Scenario: Sequential tool calls maintain session affinity
Given a user starts a session via SSE transport
And the session is assigned to upstream MCP server "server-A"
When the user calls tool "start_task"
Then the call should route to "server-A"
When the user calls tool "check_status"
Then the call should also route to "server-A"
And both calls should see the same server-side state
Scenario: New session gets fresh assignment
Given user A has affinity to "server-A"
When user B starts a new session
Then user B may be assigned to "server-A" or "server-B"
And user B's assignment is independent of user A
Technical Requirements:
- Create `SessionAffinityService` mapping `downstream_session_id -> upstream_pool_key`
- Pin upstream session on first call
- Reuse pinned session for all subsequent calls
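The pin-on-first-call requirement can be illustrated with a minimal in-memory sketch. The class name `InMemoryAffinity`, the `choose_upstream` hook, and the round-robin selection are assumptions made for illustration; they are not taken from the gateway code.

```python
import itertools
from typing import Callable, Dict


class InMemoryAffinity:
    """Maps downstream_session_id -> upstream_pool_key (illustrative only)."""

    def __init__(self, choose_upstream: Callable[[], str]) -> None:
        # choose_upstream is a hypothetical hook standing in for whatever
        # upstream selection logic the gateway actually uses.
        self._choose_upstream = choose_upstream
        self._pins: Dict[str, str] = {}

    def resolve(self, downstream_session_id: str) -> str:
        """Return the pinned upstream, pinning one on the first call."""
        if downstream_session_id not in self._pins:
            self._pins[downstream_session_id] = self._choose_upstream()
        return self._pins[downstream_session_id]


# Usage: round-robin selection across two hypothetical upstreams.
upstreams = itertools.cycle(["server-A", "server-B"])
affinity = InMemoryAffinity(lambda: next(upstreams))
first = affinity.resolve("S1")   # first call pins S1
second = affinity.resolve("S1")  # later calls reuse the pin
other = affinity.resolve("S2")   # S2 is assigned independently of S1
```

A second session never inherits the first session's pin, which matches the "New session gets fresh assignment" scenario.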
US-2: Gateway - Route Elicitation to Correct Session
As a Gateway administrator
I want elicitation requests routed to the originating user session
So that the correct user receives interactive prompts
Acceptance Criteria:
Scenario: Elicitation reaches correct user
Given user A is executing tool "confirm_delete" via session S1
And user B has active session S2
When the MCP server sends elicitation/create
Then the request should route ONLY to session S1
And user B (S2) should NOT see the elicitation
Scenario: Elicitation with session affinity
Given session S1 has affinity to upstream server US1
When US1 sends elicitation/create during tool execution
Then the response from S1 should return to US1
And the affinity should be maintained
Technical Requirements:
- Integrate affinity service with elicitation routing
- Use session mapping to route elicitation responses
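One way to route elicitation requests is to keep a reverse mapping from the upstream to the session that owns it. The sketch below assumes each upstream pool key is bound to at most one downstream session at a time; `ElicitationRouter` and its `outbox` are hypothetical names used only to make the example runnable.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ElicitationRouter:
    """Delivers upstream elicitation requests to the originating session only."""

    def __init__(self) -> None:
        # upstream_pool_key -> downstream_session_id that owns it
        self._owner: Dict[str, str] = {}
        # downstream_session_id -> messages delivered to that session
        # (a stand-in for the real per-session transport)
        self.outbox: Dict[str, List[dict]] = defaultdict(list)

    def bind(self, downstream_session_id: str, upstream_pool_key: str) -> None:
        self._owner[upstream_pool_key] = downstream_session_id

    def route(self, upstream_pool_key: str, message: dict) -> Optional[str]:
        """Deliver message to the owning session; return its ID, or None."""
        session_id = self._owner.get(upstream_pool_key)
        if session_id is None:
            return None  # no owner known: drop rather than broadcast
        self.outbox[session_id].append(message)
        return session_id


router = ElicitationRouter()
router.bind("S1", "US1")
router.bind("S2", "US2")
target = router.route("US1", {"method": "elicitation/create"})
```

Here `target` is `"S1"` and nothing is delivered to `S2`, mirroring the first scenario above.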
US-3: Operator - Multi-Worker Session Affinity
As a Platform Operator running multiple gateway workers
I want session affinity to work across workers
So that load balancing doesn't break session state
Acceptance Criteria:
Scenario: Cross-worker session routing
Given 3 gateway workers behind a load balancer
And user session S1 was established on worker W1
When a subsequent request for S1 hits worker W2
Then W2 should look up affinity in Redis
And route the request to the correct upstream server
And optionally redirect to W1 for optimal performance
Scenario: Worker failure recovery
Given session S1 has affinity to upstream US1 via worker W1
When W1 fails
And a request arrives at W2
Then W2 should re-establish affinity to US1
And log the rebind event
And continue processing requests
Technical Requirements:
- Store affinity mappings in Redis for cross-worker visibility
- Track `downstream_session_id -> worker_id` for worker affinity
- Implement graceful rebind on worker failure
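The worker-failure scenario can be sketched as follows. A plain dict stands in for the shared Redis store, and `Binding`, `resolve_worker`, and the `live_workers` set are hypothetical; how worker liveness is actually determined is left open by this issue.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Set

logger = logging.getLogger("session_affinity")


@dataclass(frozen=True)
class Binding:
    upstream_pool_key: str
    worker_id: str


def resolve_worker(
    store: Dict[str, Binding],
    session_id: str,
    current_worker: str,
    live_workers: Set[str],
) -> Binding:
    """Return the binding for session_id, rebinding if its worker is gone."""
    binding = store[session_id]
    if binding.worker_id not in live_workers:
        # The owning worker failed: keep the upstream, move worker ownership.
        logger.warning(
            "rebind session=%s upstream=%s worker %s -> %s",
            session_id, binding.upstream_pool_key,
            binding.worker_id, current_worker,
        )
        binding = Binding(binding.upstream_pool_key, current_worker)
        store[session_id] = binding
    return binding


store = {"S1": Binding("US1", "W1")}
rebound = resolve_worker(store, "S1", current_worker="W2", live_workers={"W2", "W3"})
```

After the call, `S1` is still bound to `US1` but is now owned by `W2`, and the rebind has been logged.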
US-4: Developer - Enable/Disable Affinity Per Gateway
As a Developer configuring gateways
I want to enable session affinity for specific gateways
So that I can use it only where needed
Acceptance Criteria:
Scenario: Enable session affinity globally
Given MCPGATEWAY_SESSION_AFFINITY_ENABLED=true
When a new session is created
Then session affinity should be tracked
And all tool calls should maintain affinity
Scenario: Affinity disabled by default
Given MCPGATEWAY_SESSION_AFFINITY_ENABLED=false (default)
When tool calls are made
Then requests may route to any available upstream session
And no affinity tracking overhead occurs
Scenario: Configure affinity TTL
Given MCPGATEWAY_SESSION_AFFINITY_TTL=3600
When a session has no activity for 1 hour
Then the affinity mapping should expire
And the next request gets a fresh assignment
Technical Requirements:
- Add `MCPGATEWAY_SESSION_AFFINITY_ENABLED` config (default: false)
- Add `MCPGATEWAY_SESSION_AFFINITY_TTL` config (default: 3600s)
- Opt-in behavior to avoid overhead for stateless use cases
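As a rough illustration of the two settings and their defaults, the sketch below reads them from the environment using only the standard library. `AffinitySettings` and `load_affinity_settings` are hypothetical names, and the set of accepted truthy strings is an assumption; the gateway's real settings layer may parse these differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AffinitySettings:
    enabled: bool = False    # opt-in: disabled unless explicitly enabled
    ttl_seconds: int = 3600  # idle time before a mapping expires


def load_affinity_settings(env: Mapping[str, str] = os.environ) -> AffinitySettings:
    """Build settings from environment-style key/value pairs."""
    raw_enabled = env.get("MCPGATEWAY_SESSION_AFFINITY_ENABLED", "false")
    raw_ttl = env.get("MCPGATEWAY_SESSION_AFFINITY_TTL", "3600")
    return AffinitySettings(
        # Accepted truthy spellings are an assumption of this sketch.
        enabled=raw_enabled.strip().lower() in {"1", "true", "yes", "on"},
        ttl_seconds=int(raw_ttl),
    )


defaults = load_affinity_settings({})
stateful = load_affinity_settings({
    "MCPGATEWAY_SESSION_AFFINITY_ENABLED": "true",
    "MCPGATEWAY_SESSION_AFFINITY_TTL": "7200",
})
```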
🏗 Architecture
Session Affinity Flow
sequenceDiagram
participant Client
participant Gateway
participant AffinityService
participant Redis
participant MCP Server
Client->>Gateway: Connect (session: S1)
Gateway->>AffinityService: Check affinity for S1
AffinityService->>Redis: GET affinity:S1
Redis-->>AffinityService: null (no affinity)
Client->>Gateway: tools/call "start_task"
Gateway->>AffinityService: Get/Create affinity for S1
AffinityService->>MCP Server: Execute on US1
AffinityService->>Redis: SET affinity:S1 = US1 (TTL: 3600)
MCP Server-->>Gateway: Result
Client->>Gateway: tools/call "check_status"
Gateway->>AffinityService: Get affinity for S1
AffinityService->>Redis: GET affinity:S1
Redis-->>AffinityService: US1
AffinityService->>MCP Server: Execute on US1 (same server!)
MCP Server-->>Gateway: Result (with state from first call)
Affinity Service Design
classDiagram
class SessionAffinityService {
-redis_client: Redis
-local_cache: Dict
-ttl: int
+get_affinity(session_id) Optional~str~
+set_affinity(session_id, upstream_key)
+remove_affinity(session_id)
+rebind_affinity(session_id, new_upstream)
}
class AffinityMapping {
+downstream_session_id: str
+upstream_pool_key: str
+worker_id: Optional~str~
+created_at: datetime
+last_used: datetime
}
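The `AffinityMapping` record in the diagram could be expressed as a Python dataclass along these lines. The `touch` and `is_expired` helpers are additions for illustration, and expiry is assumed to be measured from `last_used` (idle time), in line with the TTL scenario in US-4.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


def _utcnow() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class AffinityMapping:
    downstream_session_id: str
    upstream_pool_key: str
    worker_id: Optional[str] = None
    created_at: datetime = field(default_factory=_utcnow)
    last_used: datetime = field(default_factory=_utcnow)

    def touch(self, now: Optional[datetime] = None) -> None:
        """Record activity so the idle timer restarts."""
        self.last_used = now or _utcnow()

    def is_expired(self, ttl_seconds: int, now: Optional[datetime] = None) -> bool:
        """True once the mapping has been idle for at least ttl_seconds."""
        idle = (now or _utcnow()) - self.last_used
        return idle >= timedelta(seconds=ttl_seconds)


t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
mapping = AffinityMapping("S1", "US1", created_at=t0, last_used=t0)
```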
📋 Implementation Tasks
Phase 1: Session ID Propagation
- Add `x-mcp-session-id` to `DEFAULT_IDENTITY_HEADERS`
- Inject session ID header in `generate_response()` for SSE
- Inject session ID in streamable HTTP context vars
- Pass session ID through tool invocation path
Phase 2: Affinity Service (In-Memory)
- Create `mcpgateway/services/session_affinity_service.py`
- Implement `SessionAffinityService` class
- Add in-memory storage for single-worker deployments
- Implement get/set/remove affinity methods
- Add TTL-based expiration
Phase 3: Redis Backend
- Add Redis storage backend for affinity mappings
- Use Redis for cross-worker visibility
- Implement atomic set-if-not-exists for initial binding
- Add TTL support via Redis EXPIRE
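The atomic initial binding can be sketched with Redis `SET ... NX EX`, which redis-py exposes as `set(name, value, nx=True, ex=ttl)`. The functions below take the client as a parameter so the sketch does not depend on a running Redis; `bind_once` and `refresh_ttl` are hypothetical helper names, and the `affinity:<session>` key layout follows the sequence diagram above.

```python
from typing import Any


def bind_once(client: Any, session_id: str, upstream_key: str, ttl: int) -> str:
    """Bind session_id to upstream_key unless a binding already exists.

    Returns the upstream that is bound after the call, which may be one
    chosen earlier by a different worker.
    """
    key = f"affinity:{session_id}"
    for _ in range(3):
        # SET key value NX EX ttl: only one concurrent caller can succeed.
        if client.set(key, upstream_key, nx=True, ex=ttl):
            return upstream_key
        existing = client.get(key)
        if existing is not None:
            # redis-py returns bytes unless decode_responses=True is set.
            return existing.decode() if isinstance(existing, bytes) else existing
        # The key expired between SET and GET; try to bind again.
    raise RuntimeError(f"could not bind affinity for session {session_id}")


def refresh_ttl(client: Any, session_id: str, ttl: int) -> None:
    """Extend the mapping's lifetime on activity (Redis EXPIRE)."""
    client.expire(f"affinity:{session_id}", ttl)
```

With a real deployment the client would typically be created from `REDIS_URL`, for example via `redis.Redis.from_url(...)`.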
Phase 4: Tool Service Integration
- Modify tool invocation to check affinity first
- Create affinity on first tool call
- Reuse affinity for subsequent calls
- Handle affinity miss (upstream unavailable)
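The Phase 4 steps might fit together as shown below: try the pinned upstream first, and if it is unavailable, drop the stale pin and bind to another candidate. Everything here is a sketch under stated assumptions: `UpstreamUnavailable`, `invoke_with_affinity`, and the `call` callback are hypothetical, and a dict stands in for the affinity store.

```python
from typing import Callable, Dict, Iterable, Tuple


class UpstreamUnavailable(Exception):
    """Raised by the (hypothetical) call hook when an upstream cannot serve."""


def invoke_with_affinity(
    pins: Dict[str, str],
    session_id: str,
    candidates: Iterable[str],
    call: Callable[[str], str],
) -> Tuple[str, str]:
    """Invoke a tool honouring affinity; return (upstream_key, result)."""
    pinned = pins.get(session_id)
    if pinned is not None:
        try:
            return pinned, call(pinned)
        except UpstreamUnavailable:
            # Affinity miss: the pinned upstream is gone, so drop the pin.
            pins.pop(session_id, None)
    for upstream in candidates:
        if upstream == pinned:
            continue  # already known to be unavailable
        try:
            result = call(upstream)
        except UpstreamUnavailable:
            continue
        pins[session_id] = upstream  # create (or re-create) the affinity
        return upstream, result
    raise UpstreamUnavailable(f"no upstream available for session {session_id}")
```

Rebinding this way does not transfer any state held by the failed upstream; carrying context over, as the design decisions call for, would need an additional step.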
Phase 5: Elicitation Integration
- Use affinity service in elicitation routing
- Ensure elicitation responses maintain affinity
- Add cleanup hooks in `SessionRegistry.remove_session()`
Phase 6: Worker Affinity
- Track `session_id -> worker_id` mapping
- Add worker health checking
- Implement graceful rebind on worker failure
- Add rebind logging and metrics
Phase 7: Metrics
- Add `session_affinity_bindings_active` gauge
- Add `session_affinity_hits_total` counter
- Add `session_affinity_misses_total` counter
- Add `session_affinity_rebinds_total` counter
- Add `session_affinity_failures_total` counter
Phase 8: Testing
- Unit tests for affinity service
- Unit tests for Redis backend
- Integration tests for multi-call affinity
- Integration tests for elicitation routing
- Integration tests for worker failover
⚙️ Configuration Example
# Enable session affinity (opt-in)
MCPGATEWAY_SESSION_AFFINITY_ENABLED=false
# TTL for affinity mappings (seconds)
MCPGATEWAY_SESSION_AFFINITY_TTL=3600
# Redis required for multi-worker affinity
REDIS_URL=redis://localhost:6379
# Example: Enable for stateful workflows
# MCPGATEWAY_SESSION_AFFINITY_ENABLED=true
# MCPGATEWAY_SESSION_AFFINITY_TTL=7200
✅ Success Criteria
- Downstream SSE session maintains affinity across multiple tool calls
- Downstream streamable HTTP session maintains affinity
- Elicitation requests route to originating user session
- Multi-worker deployments maintain affinity via Redis
- Graceful rebind on upstream session failure
- Metrics exposed for monitoring affinity behavior
- Configuration toggles work correctly
- No performance regression when affinity disabled
- All integration tests pass
🏁 Definition of Done
- Session ID propagation implemented
- Affinity service with in-memory backend working
- Redis backend implemented
- Tool service integration complete
- Elicitation integration complete
- Worker affinity implemented
- Metrics added and exposed
- Unit tests written and passing
- Integration tests written and passing
- Code passes `make verify`
- Configuration documented in `.env.example`
- PR reviewed and approved
📝 Additional Notes
Design Decisions
| Decision | Resolution | Rationale |
|---|---|---|
| Affinity scope | Per session (not per user) | Users may have multiple sessions |
| Concurrency | Serialize by default | Prevents race conditions in stateful servers |
| Rebind strategy | Log, carry over context | Graceful degradation preferred |
| Default state | Opt-in via config | Avoid overhead for stateless use cases |
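The "serialize by default" decision could be realised with one lock per session, so that calls within a session run one at a time while different sessions proceed concurrently. `SessionSerializer` is a hypothetical name; this sketch covers a single worker process only and never evicts locks for closed sessions.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, Dict, TypeVar

T = TypeVar("T")


class SessionSerializer:
    """Runs coroutines for the same session one at a time."""

    def __init__(self) -> None:
        # One lock per session ID, created lazily on first use.
        self._locks: Dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def run(self, session_id: str, fn: Callable[[], Awaitable[T]]) -> T:
        async with self._locks[session_id]:
            return await fn()


async def demo() -> list:
    serializer = SessionSerializer()
    order = []

    async def step(name: str) -> None:
        order.append(f"{name}:start")
        await asyncio.sleep(0)  # yield control, as a real tool call would
        order.append(f"{name}:end")

    # Both calls belong to S1, so the second waits for the first to finish.
    await asyncio.gather(
        serializer.run("S1", lambda: step("a")),
        serializer.run("S1", lambda: step("b")),
    )
    return order
```

Running `asyncio.run(demo())` shows the two steps completing back to back rather than interleaving.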
Performance Considerations
- In-memory cache for hot path (< 1ms lookup)
- Redis fallback for cross-worker (< 5ms)
- TTL prevents unbounded memory growth
- No affinity tracking when disabled (zero overhead)
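The hot-path lookup described above might look like the following two-tier sketch: a process-local cache is consulted first, and Redis is queried only on a local miss. `TwoTierAffinityLookup`, the `local_ttl` parameter, and the injectable clock are assumptions for illustration; the local TTL would normally be kept well below the Redis TTL so that rebinds made by other workers become visible quickly.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TwoTierAffinityLookup:
    """Local cache in front of a Redis-like client (get only)."""

    def __init__(
        self,
        client: Any,
        local_ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._client = client
        self._local_ttl = local_ttl
        self._clock = clock
        # session_id -> (upstream_pool_key, local expiry timestamp)
        self._local: Dict[str, Tuple[str, float]] = {}

    def get_affinity(self, session_id: str) -> Optional[str]:
        now = self._clock()
        cached = self._local.get(session_id)
        if cached is not None and cached[1] > now:
            return cached[0]  # hot path: no network round trip
        self._local.pop(session_id, None)  # drop an expired entry, if any
        value = self._client.get(f"affinity:{session_id}")
        if value is None:
            return None
        if isinstance(value, bytes):
            value = value.decode()
        self._local[session_id] = (value, now + self._local_ttl)
        return value
```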
🔗 Related Issues
- Design document: `todo/session-affinity.md`
- `mcpgateway/services/mcp_session_pool.py` - Session pool
- `mcpgateway/cache/session_registry.py` - Session registry
- `llmchat_router` Redis worker affinity pattern