Authors: Gabriel Zimmerman (gjz22)
Status: Draft
Type: Standards Track
Abstract
This SEP proposes adding an optional state meta field (modelcontextprotocol.io/state/request/) to both ServerRequest and ServerResponse messages in the Model Context Protocol (MCP). The client MUST echo back the exact state value received in the request when sending the response. This change enables servers to implement stateless request flows, eliminating the requirement for durable state storage and significantly reducing operational complexity and cost. While this applies to all ServerRequests, the primary motivation is elicitation requests, where response times are unbounded and can range from seconds to hours, making stateful implementations particularly challenging for remote MCP servers.
Motivation
Primary Use Case: Elicitation Requests in Remote Servers
While this SEP proposes adding state to all ServerRequest and ServerResponse messages, the primary driver is elicitation requests in remote MCP servers. Unlike other server requests such as sampling (which typically receive responses in seconds), elicitation requests have unbounded response times. A user may take minutes, hours, or even longer to respond to an elicitation prompt, which makes stateful implementations burdensome for remote MCP servers, to the point of being impractical for many architectures.
Other server request flows can benefit as well. When the state travels with the message, the response does not need to be routed back to the server instance that issued the request, and no durable store is needed to survive disconnects.
Elicitation Protocol Flow
When a Remote MCP Server needs additional information from a user, it initiates an elicitation request through the following flow:
- Server sends elicitation request: The server sends a ServerRequest using the elicitation/create method over an SSE (Server-Sent Events) stream, specifying what information is needed and the expected schema.
- Client presents UI: The client displays an interface to the user (dialog, form, etc.) and waits for the user to respond. This may take seconds, minutes, or even longer depending on user availability.
- Client sends response: Once the user provides input, the client sends a ServerResponse back to the server with the collected data.
- Server processes response: The server uses the elicited information to continue its operation.
During this flow, if the SSE connection breaks (due to network issues, load balancer timeouts, or other failures), the Streamable HTTP transport supports reconnection using the Last-Event-ID mechanism. The client can reconnect and indicate the last event it successfully received, allowing the server to resume the stream.
The Problem with Stateful Server Requests
The current MCP protocol requires servers to maintain state for server-initiated requests (ServerRequests) that await client responses (ServerResponses). This is particularly problematic for elicitation requests in Remote MCP Servers for two key reasons:
First, the SSE connection must remain open for the duration of the request. The server must maintain this connection from when it sends the request until it receives the response. For elicitation requests, this could be an arbitrary amount of time.
Second, to support reconnection via the Last-Event-ID mechanism, the server must retain enough context to process the eventual ServerResponse even if the connection breaks and reconnects.
This requirement for durable state storage imposes significant operational burdens on server implementations:
Architectural Complexity and Costs of Durable State
Implementing durable state storage requires substantial infrastructure and operational overhead:
- Infrastructure Requirements: Servers must deploy and manage a persistent data store (e.g., PostgreSQL, Redis, DynamoDB) with high availability, replication, and backup mechanisms. This is not a simple in-memory cache but must survive server restarts and failures.
- Operational Complexity: State synchronization in distributed deployments requires distributed locking or consensus protocols. Garbage collection logic is needed to clean up orphaned state, typically using TTL (time-to-live) mechanisms. However, TTLs create a fundamental tradeoff: short TTLs reduce storage costs but limit how long users have to respond to elicitation requests, while long TTLs accommodate slow users but increase storage requirements.
- Scalability Limitations: The state store becomes a bottleneck, limiting horizontal scaling. Geographic distribution requires either expensive global replication or sticky routing with poor user experience.
- Reliability Concerns: The state store becomes a critical dependency and single point of failure. State corruption or catastrophic failures require complex recovery procedures.
- Development Burden: Developers must write and maintain state storage abstraction layers, connection pooling, transaction handling, and migration scripts. Testing requires integration tests with real databases and complex fixtures.
The Stateless Alternative
With a state meta field, servers can encode all necessary context directly in any ServerRequest. This would allow the server to close the SSE connection while the request is pending (discussed in other SEPs). When the client returns this opaque state in the ServerResponse, the server can immediately process the response without consulting any external storage. This transforms server-initiated requests from stateful, distributed transactions into simple, self-contained request-response pairs.
Benefits include:
- Trivial horizontal scaling - any server can handle any response
- Simplified deployments - no state migration concerns
- Improved reliability - no state store to fail
- Reduced latency - no database queries
- Lower development costs - no state management code to write or maintain
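As a rough illustration of the idea above, the following Python sketch packs request context into an opaque string and recovers it when the value comes back. The helper names (`encode_state`, `decode_state`) and the context fields are hypothetical and not part of this SEP. Unsigned base64url JSON is only one possible encoding, and it offers no tamper protection.

```python
import base64
import json


def encode_state(context: dict) -> str:
    """Serialize server-side context into an opaque, URL-safe string."""
    raw = json.dumps(context, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_state(state: str) -> dict:
    """Recover the context from a state string echoed back by the client."""
    raw = base64.urlsafe_b64decode(state.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


# Hypothetical context a server might need to finish a pending operation.
context = {"operation": "delete-file", "file": "/tmp/data.txt"}
state = encode_state(context)

# Any server instance can rebuild the context from the echoed value alone.
assert decode_state(state) == context
```

Because the context is carried in the message itself, the instance that receives the response does not need to be the one that sent the request.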
Specification
Protocol Requirements
- Server Behavior:
  - Servers MAY include a state meta field (modelcontextprotocol.io/state/request/) in any ServerRequest (including elicitation/create, sampling/createMessage, etc.).
  - The state value MUST be treated as an opaque string by clients.
  - Servers SHOULD encode all context needed to process the response in the state meta field when using stateless mode.
- Client Behavior:
  - Clients MUST echo back the exact state value received in any ServerRequest when sending the corresponding ServerResponse.
  - Clients MUST NOT inspect, parse, modify, or make any assumptions about the state contents.
  - If a ServerRequest does not contain a state meta field, the client MUST NOT include one in the ServerResponse.
- State Content:
  - The state meta field contains an opaque string that is meaningful only to the server.
  - Servers are free to encode the state in any format (e.g., base64-encoded JSON, encrypted JWT, serialized binary, etc.).
  - The state MAY contain sensitive information, so servers SHOULD encrypt or sign the state to prevent tampering or information disclosure.
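The client-side rules above could be implemented roughly as follows. This is a minimal sketch under assumptions: `build_response` is a hypothetical helper, and requests and results are modeled as plain dictionaries rather than any particular SDK's types.

```python
from typing import Any, Dict

STATE_KEY = "modelcontextprotocol.io/state/request/"


def build_response(request: Dict[str, Any], result: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC response that echoes the request's state unchanged."""
    response_result = dict(result)
    meta = request.get("params", {}).get("_meta", {})
    if STATE_KEY in meta:
        # Copy the value verbatim; the client never parses or alters it.
        response_meta = dict(response_result.get("_meta", {}))
        response_meta[STATE_KEY] = meta[STATE_KEY]
        response_result["_meta"] = response_meta
    # When the request carries no state, no state key is added to the response.
    return {"jsonrpc": "2.0", "id": request["id"], "result": response_result}
```

The only decision the client makes is whether the key is present; the value itself passes through untouched.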
Note on Meta Field Format: This SEP uses the modelcontextprotocol.io/state/request/ meta field format to align with the pattern established in #1655 for standardized metadata keys in MCP.
Capabilities
Similar to #1655, clients that support the state meta field MUST declare this capability during initialization. The state capability includes a scopes array that indicates which types of messages support state.
Client Capability Declaration:
{
  "capabilities": {
    "state": {
      "scopes": ["request"]
    }
  }
}
The "request" scope indicates that the client supports state in ServerRequest and ServerResponse messages. By declaring this capability, the client indicates that it:
- Can echo back the state meta field in ServerResponses
- Supports stateless operation for ServerRequests
Servers SHOULD check client capabilities before including state in ServerRequests. If a client does not declare state capability support, the server MUST fall back to traditional state management approaches.
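A server-side capability check might look like the sketch below. The function name is hypothetical, and the capabilities object is assumed to be the plain dictionary received during initialization.

```python
from typing import Any, Dict


def client_supports_request_state(client_capabilities: Dict[str, Any]) -> bool:
    """Return True if the client declared the state capability with the "request" scope."""
    state_cap = client_capabilities.get("state")
    if not isinstance(state_cap, dict):
        return False
    scopes = state_cap.get("scopes", [])
    return isinstance(scopes, list) and "request" in scopes


# A client that declared the capability as shown in the declaration above.
assert client_supports_request_state({"state": {"scopes": ["request"]}})
# A client that declared nothing: the server keeps its traditional state handling.
assert not client_supports_request_state({})
```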
Example Usage
Stateless Elicitation Flow
// Server sends request with encoded state
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "elicitation/create",
  "params": {
    "_meta": {
      "modelcontextprotocol.io/state/request/": "eyJjb250ZXh0IjoiZGVsZXRlLWZpbGUiLCJmaWxlIjoiL3RtcC9kYXRhLnR4dCIsInRpbWVzdGFtcCI6MTcwOTU2ODAwMH0="
    },
    "message": "Are you sure you want to delete /tmp/data.txt?",
    "requestedSchema": {
      "type": "object",
      "properties": {
        "confirm": {
          "type": "boolean"
        }
      }
    }
  }
}

// Client responds with exact state
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "_meta": {
      "modelcontextprotocol.io/state/request/": "eyJjb250ZXh0IjoiZGVsZXRlLWZpbGUiLCJmaWxlIjoiL3RtcC9kYXRhLnR4dCIsInRpbWVzdGFtcCI6MTcwOTU2ODAwMH0="
    },
    "action": "accept",
    "content": {
      "confirm": true
    }
  }
}

// Server decodes state and processes response without database lookup
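The final step of the flow, where the server acts on the response using only the echoed state, might be sketched as follows. The function name, the context fields (`operation`, `file`), and the returned strings are hypothetical. The state here is plain base64 JSON for brevity; a real server would also verify a signature before trusting it.

```python
import base64
import json

STATE_KEY = "modelcontextprotocol.io/state/request/"


def plan_from_result(result: dict) -> str:
    """Return the action the server would take, using only the echoed state."""
    # Decode the context that the server itself embedded in the request.
    context = json.loads(base64.b64decode(result["_meta"][STATE_KEY]))
    answer = result.get("content") or {}
    if result.get("action") == "accept" and answer.get("confirm") is True:
        return f"{context['operation']}:{context['file']}"
    # A declined, cancelled, or unconfirmed answer leaves everything untouched.
    return "none"


# Hypothetical context, encoded the same way the server would when sending the request.
state = base64.b64encode(
    json.dumps({"operation": "delete-file", "file": "/tmp/data.txt"}).encode("utf-8")
).decode("ascii")
result = {"_meta": {STATE_KEY: state}, "action": "accept", "content": {"confirm": True}}
print(plan_from_result(result))
```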
Rationale
Design Decisions
1. Meta Field Instead of Direct Property
The state is placed in the _meta field rather than as a direct property of params to separate protocol-level concerns from domain-specific elicitation parameters. The _meta field is already established in MCP for protocol-level metadata, making it the natural location for state management information.
2. Namespaced Key
The key modelcontextprotocol.io/state/request/ uses domain namespacing to avoid collisions with other meta fields and clearly indicates this is an official MCP protocol feature for request state management. This format aligns with the metadata standardization pattern established in #1655.
3. Optional Field
The state meta field is optional because servers can choose not to use it. Existing servers that use traditional state management can continue to operate without changes. New servers can opt into stateless mode by including the state meta field. Servers may also choose to use state selectively—for example, only for elicitation requests where response times are unbounded, while continuing to use traditional state management for faster ServerRequests like sampling.
4. Opaque String
The state is defined as an opaque string rather than a structured object to give servers maximum flexibility in how they encode context. Servers can choose:
- Simple JSON where tampering of the state is irrelevant or visibility of state is useful
- Encrypted tokens for security
- Signed JWTs to prevent tampering
- Binary serialization for compactness
5. Strict Echo Requirement
Clients MUST return the exact state without modification. This is critical because:
- Any modification could corrupt the encoded context
- Servers may include checksums or signatures for tamper detection
- It keeps the client implementation simple—no parsing or logic required
6. Server-Controlled
Only servers can initiate stateless mode (by including state in the request). Clients cannot force a server to be stateless. This gives servers full control over their architecture.
Backward Compatibility
This proposal is fully backward compatible:
- Existing Servers: Can continue to ignore the state meta field and use traditional state management for all ServerRequests. No changes required.
- Existing Clients: Will pass through any state meta field they receive (as required by the spec's allowance for additional fields in _meta). Client implementations should be updated to explicitly handle the state meta field in ServerResponses, but the protocol itself doesn't break when they don't have the request state capability.
Security Implications
1. State Tampering
Since the state field is controlled by the client (who echoes it back), malicious clients could attempt to modify it to:
- Gain unauthorized access
- Bypass authorization checks
- Corrupt server logic
Mitigation: Servers SHOULD use cryptographic signatures or encryption to prevent tampering.
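One way to apply this mitigation is to attach an HMAC to the encoded context and verify it on return. The sketch below uses only the Python standard library. The function names, the `payload.tag` layout, and the placeholder key are assumptions made for illustration, not a format defined by this SEP. Note that signing detects modification but does not hide the contents.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical placeholder; a real server would load its key from secure configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def sign_state(context: dict, key: bytes = SECRET_KEY) -> str:
    """Encode the context and append an HMAC-SHA256 tag, separated by a dot."""
    payload = base64.urlsafe_b64encode(json.dumps(context, sort_keys=True).encode("utf-8"))
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return (payload + b"." + tag).decode("ascii")


def verify_state(state: str, key: bytes = SECRET_KEY) -> dict:
    """Return the context if the tag matches; raise ValueError otherwise."""
    payload, sep, tag = state.encode("ascii").rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest avoids leaking how many leading characters matched.
    if not sep or not hmac.compare_digest(tag, expected):
        raise ValueError("invalid or tampered state")
    return json.loads(base64.urlsafe_b64decode(payload))
```

All server instances that may receive the response need access to the same key for this to work.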
2. Information Disclosure
The state field may contain sensitive information (file paths, user IDs, etc.) that could be exposed to malicious clients.
Mitigation: Servers SHOULD encrypt the state field if it contains sensitive data. Alternatively, use opaque identifiers that have no inherent meaning.
3. Replay Attacks
A malicious actor could capture a valid state value and replay it to trigger unintended operations.
Mitigation: Include timestamps or nonces in the state and reject expired or duplicate states. Implement a short time-to-live (TTL) for state values.
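A freshness check along these lines could be sketched as below. The field names, the one-hour window, and the in-memory nonce set are all assumptions for illustration. These fields would be embedded in a signed or encrypted state so that clients cannot alter them.

```python
import secrets
import time
from typing import Optional

MAX_AGE_SECONDS = 3600  # hypothetical validity window
_seen_nonces = set()  # illustrative only; multiple server instances would need a shared record


def new_state_fields(now: Optional[float] = None) -> dict:
    """Fields to embed in the state when it is issued."""
    issued_at = time.time() if now is None else now
    return {"issued_at": issued_at, "nonce": secrets.token_urlsafe(16)}


def check_freshness(fields: dict, now: Optional[float] = None) -> None:
    """Raise ValueError if the state has expired or was already used."""
    current = time.time() if now is None else now
    if current - fields["issued_at"] > MAX_AGE_SECONDS:
        raise ValueError("state expired")
    if fields["nonce"] in _seen_nonces:
        raise ValueError("state already used")
    _seen_nonces.add(fields["nonce"])
```

Expiry can be checked without any storage, whereas rejecting duplicates requires remembering used nonces, which reintroduces a small amount of server-side state. A server may decide that expiry alone is sufficient for idempotent operations.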
4. Size Limits
An unbounded state value could break a client, for example by exceeding its message size limits or exhausting its memory.
Mitigation: The specification SHOULD define a maximum state size (e.g., 8KB). Clients SHOULD reject requests with excessively large state fields. Servers SHOULD design state encoding to be compact.
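A size check of this kind is small enough to sketch directly. The 8KB figure is only the example value from the text above, not a limit fixed by this SEP, and the function name is hypothetical.

```python
MAX_STATE_BYTES = 8 * 1024  # example value from the text; not a normative limit


def state_within_limit(state: str, limit: int = MAX_STATE_BYTES) -> bool:
    """Return True if the UTF-8 encoded state fits within the limit."""
    return len(state.encode("utf-8")) <= limit


assert state_within_limit("a" * 8192)
assert not state_within_limit("a" * 8193)
```

Measuring encoded bytes rather than characters keeps the check consistent with what is actually sent on the wire.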
Migration Path
For servers transitioning from stateful to stateless server requests:
- Phase 1 - Dual Mode: Implement state encoding/decoding but continue writing to the database as well. This allows validation that the stateless approach works correctly.
- Phase 2 - Stateless Primary: Use the state meta field as the primary mechanism, falling back to database lookup only if state is missing or invalid.
- Phase 3 - Stateless Only: Remove database dependencies entirely once confident in the stateless implementation.
- Phase 4 - Cleanup: Remove state management infrastructure, reduce operational costs.
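The Phase 2 behavior, where the state is preferred and the database is only a fallback, might look like the following sketch. The function name and the `db_lookup` callable are hypothetical stand-ins for a server's existing storage layer, and the state is assumed to be base64url-encoded JSON.

```python
import base64
import json
from typing import Callable, Optional


def resolve_context(
    state: Optional[str],
    request_id: str,
    db_lookup: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Prefer the echoed state; fall back to the legacy store if it is missing or invalid."""
    if state is not None:
        try:
            context = json.loads(base64.urlsafe_b64decode(state.encode("ascii")))
            if isinstance(context, dict):
                return context
        except ValueError:
            # Malformed base64 or JSON: fall through to the legacy lookup.
            pass
    return db_lookup(request_id)
```

Logging how often the fallback path is taken during this phase gives a simple signal for when Phase 3 is safe to begin.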
Note: Servers may choose to implement stateless mode only for elicitation requests initially, as these have unbounded response times and benefit most from stateless architecture. Other ServerRequests like sampling can continue to use traditional state management if response times are predictably short.
Open Questions
- Maximum State Size: Should the specification mandate a maximum state size?
- State Format Recommendations: Should the spec recommend specific encoding schemes (JWT, encrypted JSON, etc.)? Recommendation: Provide non-normative examples but don't mandate a format.
- Error Codes: Should we define specific error codes for state validation failures? Recommendation: Yes, add a new error code like -32010 for "Invalid or tampered state".
Conclusion
Adding an optional state meta field to ServerRequest and ServerResponse messages is a simple protocol change that enables significant architectural simplification. While this applies to all server-initiated requests, the primary benefit is for elicitation requests where response times are unbounded. Servers can eliminate the need for durable state storage, which removes the associated infrastructure cost while improving scalability, reliability, and operational simplicity. The change is fully backward compatible and gives servers the flexibility to choose between stateful and stateless implementations based on their requirements.