[EPIC][TESTING][PROTOCOL]: MCP 2025-11-25 Protocol Compliance Test Suite #2525

@crivetimihai

[EPIC][TESTING][PROTOCOL]: MCP 2025-11-25 Protocol Compliance Test Suite (Schema-Aligned)

Goal

Implement an automated compliance suite that validates MCP Gateway against MCP 2025-11-25 using the authoritative TypeScript schema and normative spec text.

  • Target protocol version string: "2025-11-25"
  • Priority: prevent protocol regressions and provide compliance evidence for manual release gating
  • Version scope for this epic: latest spec only (2025-11-25), no previous-version compatibility lane yet

Authoritative Sources (Checked 2026-02-15)

Source-of-Truth Policy

  • For protocol message structure and method inventory, schema.ts is authoritative.
  • Spec markdown pages are normative for behavioral requirements not fully expressible in types (transport semantics, security constraints, lifecycle timing, task behavior details).
  • The suite must include a schema-drift guard that fails the manual compliance run when schema method set changes are not reflected in test manifest updates.
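A minimal sketch of the comparison at the heart of such a drift guard, assuming both the schema-derived and the committed method sets have already been loaded into Python sets. The function name and the result keys are hypothetical, not part of the gateway code base.

```python
def method_drift(schema_methods: set[str], manifest_methods: set[str]) -> dict[str, list[str]]:
    """Report methods that appear on only one side.

    Both lists empty means the committed manifest matches the schema.
    """
    return {
        # In schema.ts but not yet mapped in the test manifest.
        "missing_from_manifest": sorted(schema_methods - manifest_methods),
        # Still in the manifest but no longer present in schema.ts.
        "stale_in_manifest": sorted(manifest_methods - schema_methods),
    }


def has_drift(schema_methods: set[str], manifest_methods: set[str]) -> bool:
    """True when the guard should fail the compliance run."""
    drift = method_drift(schema_methods, manifest_methods)
    return bool(drift["missing_from_manifest"] or drift["stale_in_manifest"])
```

A guard test could call has_drift and fail with the two lists in its message, so the person running the manual target sees exactly which methods need manifest updates.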

Scope

In Scope (Core 2025-11-25 Compliance)

  • Base MCP message envelope and lifecycle
  • Core transports: stdio and Streamable HTTP
  • Server features: tools, resources, prompts, completion, logging
  • Client features: roots, sampling, elicitation
  • Utilities: ping, cancellation, progress, pagination
  • Tasks (introduced in 2025-11-25; experimental but in-schema)
  • Error semantics and capability-gated behavior

Conditional Scope

  • Authorization (optional in protocol): tested when auth is enabled/configured

Out of Scope (This Epic)

  • Previous protocol versions (including 2024-11-05 compatibility behavior)
  • Deprecated transport compatibility paths from older revisions
  • WebSocket/custom transport compliance lanes

Explicitly Out of Core Compliance Gating

  • JSON-RPC batch request support as an MCP 2025-11-25 requirement
    • Rationale: the MCP 2025-11-25 message model and Streamable HTTP rules operate on single JSON-RPC message objects, not batch arrays.

Why This Update

The previous epic draft had strong coverage intent but diverged from the 2025-11-25 authoritative schema in key places:

  • Missing Tasks method/capability coverage (tasks/get, tasks/result, tasks/list, tasks/cancel, notifications/tasks/status)
  • Missing list-changed notifications coverage for tools/prompts/resources
  • JSON-RPC request ID semantics not aligned with the MCP envelope (RequestId is string | number; request IDs must not be null)
  • Batch tests framed as required MCP compliance
  • Core transport framing mixed with deprecated/extension transport expectations
  • Several field cardinality mismatches (tools.description, tools/call.arguments, initialized.params, elicitation response content conditions)

Test Organization

tests/
├── compliance/
│   └── mcp_2025_11_25/
│       ├── conftest.py
│       ├── manifest/
│       │   ├── schema_methods.yaml              # generated/verified from schema.ts
│       │   └── requirements_map.yaml            # MUST/SHOULD/MAY traceability
│       ├── base/
│       │   ├── test_jsonrpc_envelope.py
│       │   ├── test_error_shapes.py
│       │   ├── test_method_inventory_guard.py
│       │   └── test_no_batch_payloads.py
│       ├── lifecycle/
│       │   ├── test_initialize.py
│       │   ├── test_version_negotiation.py
│       │   ├── test_capabilities.py
│       │   └── test_initialized_notification.py
│       ├── transport_core/
│       │   ├── test_stdio.py
│       │   ├── test_streamable_http_post.py
│       │   ├── test_streamable_http_get_sse.py
│       │   ├── test_streamable_http_session.py
│       │   └── test_streamable_http_protocol_header.py
│       ├── server_features/
│       │   ├── tools/
│       │   │   ├── test_tools_list.py
│       │   │   ├── test_tools_call.py
│       │   │   ├── test_tools_output_schema.py
│       │   │   └── test_tools_list_changed_notification.py
│       │   ├── resources/
│       │   │   ├── test_resources_list.py
│       │   │   ├── test_resources_read.py
│       │   │   ├── test_resources_templates.py
│       │   │   ├── test_resources_subscribe.py
│       │   │   └── test_resources_list_changed_notification.py
│       │   ├── prompts/
│       │   │   ├── test_prompts_list.py
│       │   │   ├── test_prompts_get.py
│       │   │   └── test_prompts_list_changed_notification.py
│       │   ├── test_completion.py
│       │   └── test_logging.py
│       ├── client_features/
│       │   ├── test_roots.py
│       │   ├── test_sampling.py
│       │   └── test_elicitation.py
│       ├── utilities/
│       │   ├── test_ping.py
│       │   ├── test_cancellation.py
│       │   ├── test_progress.py
│       │   └── test_pagination.py
│       ├── tasks/
│       │   ├── test_capability_negotiation.py
│       │   ├── test_task_augmented_requests.py
│       │   ├── test_tasks_get_result_list_cancel.py
│       │   ├── test_tasks_status_notification.py
│       │   ├── test_related_task_metadata.py
│       │   └── test_task_error_semantics.py
│       ├── authorization/
│       │   ├── test_protected_resource_metadata.py
│       │   ├── test_www_authenticate_discovery.py
│       │   ├── test_token_transport_rules.py
│       │   └── test_pkce_requirements.py
│       └── security/
│           ├── test_input_validation.py
│           ├── test_root_boundary_controls.py
│           ├── test_sensitive_data_handling.py
│           └── test_session_security.py

Pytest Markers

markers = [
    "mcp20251125: MCP 2025-11-25 compliance",
    "mcp_core: core protocol requirements",
    "mcp_required: MUST requirement",
    "mcp_recommended: SHOULD requirement",
    "mcp_optional: MAY requirement",
    "mcp_base: JSON-RPC/MCP envelope",
    "mcp_lifecycle: initialize/initialized/version/capabilities",
    "mcp_transport_core: stdio + streamable-http",
    "mcp_server_features: tools/resources/prompts/completion/logging",
    "mcp_client_features: roots/sampling/elicitation",
    "mcp_utilities: ping/cancel/progress/pagination",
    "mcp_tasks: task-augmented execution and task APIs",
    "mcp_auth: authorization rules (conditional)",
    "mcp_security: security best-practice requirements",
]

Schema Method Coverage Matrix (Must Be 100% Mapped)

Requests

| Method | Direction | Capability Gate |
| --- | --- | --- |
| initialize | client -> server | none |
| ping | both | none |
| completion/complete | client -> server | server.capabilities.completions |
| logging/setLevel | client -> server | server.capabilities.logging |
| prompts/list | client -> server | server.capabilities.prompts |
| prompts/get | client -> server | server.capabilities.prompts |
| resources/list | client -> server | server.capabilities.resources |
| resources/templates/list | client -> server | server.capabilities.resources |
| resources/read | client -> server | server.capabilities.resources |
| resources/subscribe | client -> server | server.capabilities.resources.subscribe |
| resources/unsubscribe | client -> server | server.capabilities.resources.subscribe |
| tools/list | client -> server | server.capabilities.tools |
| tools/call | client -> server | server.capabilities.tools |
| roots/list | server -> client | client.capabilities.roots |
| sampling/createMessage | server -> client | client.capabilities.sampling |
| elicitation/create | server -> client | client.capabilities.elicitation |
| tasks/get | both | *.capabilities.tasks |
| tasks/result | both | *.capabilities.tasks |
| tasks/list | both | *.capabilities.tasks.list |
| tasks/cancel | both | *.capabilities.tasks.cancel |

Notifications

| Method | Direction |
| --- | --- |
| notifications/initialized | client -> server |
| notifications/cancelled | both |
| notifications/progress | both |
| notifications/message | server -> client |
| notifications/resources/updated | server -> client |
| notifications/resources/list_changed | server -> client |
| notifications/prompts/list_changed | server -> client |
| notifications/tools/list_changed | server -> client |
| notifications/roots/list_changed | client -> server |
| notifications/elicitation/complete | server -> client |
| notifications/tasks/status | both (optional) |

Implementation Phases and Requirements

Phase 1: Base Envelope and Error Semantics

  • Validate MCP JSON-RPC envelope:
    • Requests require jsonrpc: "2.0", method, id: string | number
    • Notifications require jsonrpc: "2.0", method, and must not include id
    • Responses require either result or error, never both
  • Enforce request ID constraints:
    • id must not be null for requests
    • requestor must not reuse request IDs within the same session
  • Validate standard JSON-RPC/MCP error codes:
    • -32700, -32600, -32601, -32602, -32603
    • MCP URL elicitation required error -32042 (URLElicitationRequiredError) with schema-conformant data.elicitations
  • Add explicit non-batch tests for MCP message handling (core lane)
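The envelope rules above could be exercised with a small classifier along these lines. This is a sketch only: classify_message is a hypothetical helper, not an existing gateway function, and it checks just the Phase 1 envelope rules listed here (no per-method params validation).

```python
from typing import Any


def classify_message(msg: Any) -> str:
    """Return "request", "notification", or "response"; raise ValueError otherwise."""
    if not isinstance(msg, dict):
        # A JSON array here would be a batch, which is out of the core lane.
        raise ValueError("expected a single JSON-RPC object, not a batch array")
    if msg.get("jsonrpc") != "2.0":
        raise ValueError('jsonrpc must be "2.0"')

    if "method" in msg:
        if not isinstance(msg["method"], str):
            raise ValueError("method must be a string")
        if "id" not in msg:
            return "notification"
        request_id = msg["id"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(request_id, bool) or not isinstance(request_id, (str, int, float)):
            raise ValueError("request id must be a string or number, never null")
        return "request"

    # No method: must be a response carrying exactly one of result / error.
    if ("result" in msg) == ("error" in msg):
        raise ValueError("response must carry exactly one of result or error")
    return "response"
```

Request ID reuse within a session is deliberately not covered; that check needs per-session state and would live in a fixture rather than a pure function.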

Phase 2: Lifecycle and Capability Negotiation

  • initialize request/response contract:
    • request params must include protocolVersion, capabilities, clientInfo
    • response must include protocolVersion, capabilities, serverInfo
  • Implementation object validation:
    • required: name, version
    • optional: title, description, icons, websiteUrl
  • Version negotiation behavior:
    • client sends supported version, should send latest supported
    • server returns requested version if supported, otherwise supported alternative
    • client should disconnect on unsupported returned version
  • notifications/initialized handling:
    • method required
    • params is optional (tests must not require it to be present or to be exactly an empty object)
  • Pre-initialization constraints:
    • client should not send non-ping requests before initialize response
    • server should not send non-ping/non-logging requests before initialized
  • Capability-gated behavior:
    • no method/notification usage outside negotiated capabilities
    • include full tasks capability subtree checks (list, cancel, per-request task support)
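The version negotiation rules above can be sketched as two small functions. Both names are hypothetical, and the choice of "latest supported" as the alternative is an assumption that relies on MCP's date-formatted version strings sorting chronologically.

```python
def negotiate_version(requested: str, server_supported: list[str]) -> str:
    """Server side: echo the requested version if supported, else offer an alternative."""
    if not server_supported:
        raise ValueError("server must support at least one protocol version")
    if requested in server_supported:
        return requested
    # Assumption: YYYY-MM-DD strings compare chronologically, so max() is the latest.
    return max(server_supported)


def client_should_disconnect(returned: str, client_supported: list[str]) -> bool:
    """Client side: disconnect when the server answers with a version the client lacks."""
    return returned not in client_supported
```

A negotiation test would pair the two: send a version the server lacks, assert the server's alternative, then assert the client disconnects if that alternative is outside its own supported list.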

Phase 3: Core Transport Compliance (stdio, Streamable HTTP)

  • stdio:
    • UTF-8 JSON-RPC messages only
    • newline delimited, no embedded newlines
    • stdout must contain only valid MCP messages
    • stderr is non-protocol channel
  • Streamable HTTP:
    • single MCP endpoint supports POST and GET
    • POST body is a single JSON-RPC request/notification/response
    • client POST Accept must include application/json and text/event-stream
    • a POSTed response or notification is accepted with HTTP 202; a POSTed request is answered with a JSON body or an SSE stream
    • origin validation and 403 behavior for invalid origin
    • GET SSE behavior and optional 405 path
    • SSE resumability semantics (id, Last-Event-ID, replay constraints)
    • multiple simultaneous streams without duplicate message broadcast
    • session semantics via MCP-Session-Id
    • 404-with-session => client starts new session
  • Protocol version header:
    • MCP-Protocol-Version must be present on subsequent HTTP requests
    • invalid/unsupported header must return HTTP 400
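For the stdio rules above, a framing check might look like the following sketch. parse_stdio_stream is a hypothetical helper that takes the captured stdout bytes of a server process; it covers framing only and leaves envelope validation to the Phase 1 tests.

```python
import json


def parse_stdio_stream(raw: bytes) -> list[dict]:
    """Split captured stdout into JSON-RPC messages, one per line.

    Raises UnicodeDecodeError on non-UTF-8 input and ValueError on any line
    that is not a complete JSON object.
    """
    text = raw.decode("utf-8")
    messages = []
    for line in text.split("\n"):
        if not line.strip():
            # Skips the empty string after the final newline (and blank lines).
            continue
        # A message with an embedded newline is split across lines, so each
        # fragment fails to parse here (JSONDecodeError is a ValueError).
        msg = json.loads(line)
        if not isinstance(msg, dict):
            raise ValueError("each line must be a single JSON-RPC object")
        messages.append(msg)
    return messages
```

Because any non-protocol output on stdout (a stray print or log line) fails json.loads, the same helper also serves as the "stdout must contain only valid MCP messages" check.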

Phase 4: Server Features (Tools, Resources, Prompts, Completion, Logging)

  • Tools:
    • tools/list, tools/call, notifications/tools/list_changed
    • Tool fields:
      • required: name, inputSchema
      • optional: title, description, icons, annotations, execution, outputSchema, _meta
    • tools/call.params.arguments is optional and must be object-shaped when present
    • inputSchema must be non-null JSON Schema object
    • structuredContent + outputSchema conformance when output schema is declared
    • tool execution errors represented in CallToolResult with isError, not protocol error when tool was found/invoked
  • Resources:
    • resources/list, resources/templates/list, resources/read, resources/subscribe, resources/unsubscribe
    • notifications/resources/updated, notifications/resources/list_changed
    • list/read/template object schema including title, icons, annotations, and content variants (text or blob)
  • Prompts:
    • prompts/list, prompts/get, notifications/prompts/list_changed
    • prompt message content types include text/image/audio/embedded resources with correct shape
  • Completion:
    • completion/complete request with valid ref union and argument object
    • completion response values array limit (<= 100) and optional total/hasMore
  • Logging:
    • logging/setLevel, notifications/message
    • valid RFC5424 levels and required data field
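As an illustration of the Tools field-cardinality rules above, the checks below return a list of problems instead of raising, which keeps test diagnostics readable. The helper names are hypothetical, and only the required/optional rules stated in this phase are checked; full JSON Schema validation of inputSchema is out of scope for the sketch.

```python
from typing import Any


def tool_definition_problems(tool: dict[str, Any]) -> list[str]:
    """Check the required fields of a Tool entry from tools/list."""
    problems = []
    if not isinstance(tool.get("name"), str):
        problems.append("name is required and must be a string")
    # dict.get returns None for both a missing key and an explicit null.
    if not isinstance(tool.get("inputSchema"), dict):
        problems.append("inputSchema is required and must be a non-null object")
    return problems


def call_params_problems(params: dict[str, Any]) -> list[str]:
    """Check tools/call params: name required, arguments optional but object-shaped."""
    problems = []
    if not isinstance(params.get("name"), str):
        problems.append("name is required and must be a string")
    if "arguments" in params and not isinstance(params["arguments"], dict):
        problems.append("arguments must be an object when present")
    return problems
```

Note that a tool without description passes, matching the corrected cardinality (description is optional).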

Phase 5: Client Features (Roots, Sampling, Elicitation)

  • Roots:
    • roots/list request/response and notifications/roots/list_changed
    • root URI must currently use the file:// scheme
  • Sampling:
    • sampling/createMessage request/response contract
    • tools and toolChoice allowed only when client declares sampling.tools
    • includeContext behavior aligned with sampling.context and soft-deprecation rules
    • tool loop constraints:
      • tool-result-only user messages
      • tool use/result ID matching and sequencing
  • Elicitation:
    • elicitation/create form mode and URL mode contracts
    • supported schema subset validation for form mode
    • response action model: accept | decline | cancel
    • content required for accepted form mode; omitted for accepted URL mode
    • notifications/elicitation/complete semantics
    • URL elicitation required error (-32042) shape and gating
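The elicitation response conditions above (action model plus the content rules per mode) can be sketched as a single check. The function name is hypothetical, and the mode strings "form" and "url" are an assumption about how the harness labels the two modes.

```python
from typing import Any

VALID_ACTIONS = {"accept", "decline", "cancel"}


def elicitation_result_problems(mode: str, result: dict[str, Any]) -> list[str]:
    """Check an elicitation/create result against the action and content rules.

    mode is "form" or "url" (assumed labels for the two elicitation modes).
    """
    problems = []
    action = result.get("action")
    if action not in VALID_ACTIONS:
        return [f"action must be one of {sorted(VALID_ACTIONS)}, got {action!r}"]
    has_content = "content" in result
    if action == "accept" and mode == "form" and not has_content:
        problems.append("accepted form-mode result must include content")
    if action == "accept" and mode == "url" and has_content:
        problems.append("accepted URL-mode result must omit content")
    return problems
```

Declined and cancelled results are not constrained further here; whether content may appear on them is left to the normative spec text rather than guessed in this sketch.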

Phase 6: Utilities (Ping, Cancellation, Progress, Pagination)

  • ping request must promptly receive an empty result ({})
  • notifications/cancelled semantics:
    • in-flight request reference rules
    • requestId is required for non-task cancellation; task cancellation must use tasks/cancel
    • initialize cannot be cancelled
  • notifications/progress semantics:
    • token type/uniqueness
    • monotonic progress
    • optional total and message
    • stop progress after completion/terminal task status
  • Pagination semantics:
    • cursor treated as opaque
    • invalid cursor behavior (-32602 recommended)
    • no cursor persistence across sessions
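The monotonic-progress rule above could be checked over a recorded sequence of notifications/progress messages, as in this sketch. The helper name is hypothetical, and "monotonic" is interpreted here as strictly increasing per progress token, which is an assumption to confirm against the spec text.

```python
from typing import Any


def progress_problems(notifications: list[dict[str, Any]]) -> list[str]:
    """Return one problem per notification whose progress did not increase.

    Progress is tracked independently for each progressToken.
    """
    problems = []
    last_seen: dict[Any, float] = {}
    for note in notifications:
        params = note["params"]
        token = params["progressToken"]
        value = params["progress"]
        if token in last_seen and value <= last_seen[token]:
            problems.append(
                f"progress for token {token!r} did not increase: {last_seen[token]} -> {value}"
            )
        last_seen[token] = value
    return problems
```

The optional total and message fields are ignored on purpose, so the same helper works whether or not the sender includes them.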

Phase 7: Tasks (2025-11-25)

  • Capability negotiation:
    • tasks.list, tasks.cancel
    • per-request task capability trees
      • server: tasks.requests.tools.call
      • client: tasks.requests.sampling.createMessage, tasks.requests.elicitation.create
  • Tool-level task support:
    • Tool.execution.taskSupport semantics (forbidden|optional|required)
    • enforce invocation mode requirements and expected protocol errors
  • Task-augmented request flow:
    • request includes params.task
    • immediate CreateTaskResult return with Task metadata
  • Task APIs:
    • tasks/get, tasks/result, tasks/list, tasks/cancel
    • task status lifecycle and terminal-state rules
    • tasks/result blocking semantics for non-terminal status
    • tasks/result must return the exact underlying final outcome (success or JSON-RPC error)
  • Metadata rules:
    • io.modelcontextprotocol/related-task inclusion rules and exceptions
    • task ID source-of-truth precedence over metadata in task operations
  • Task notifications:
    • notifications/tasks/status optional behavior and reliability expectations
  • Task error semantics:
    • invalid/nonexistent task IDs => -32602
    • terminal task cancel attempts => -32602
    • required task augmentation not provided => -32600 (when receiver enforces)
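The task error semantics above can be illustrated with a toy in-memory handler. Everything here is a sketch under stated assumptions: cancel_task and the tasks dict are hypothetical, the result shape is simplified, and the terminal status names (completed, failed, cancelled) are assumed and should be verified against the TaskStatus type in schema.ts.

```python
INVALID_PARAMS = -32602

# Assumed terminal statuses; verify against schema.ts before relying on them.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def cancel_task(tasks: dict[str, str], task_id: str) -> dict:
    """Handle tasks/cancel against a task_id -> status mapping.

    Returns a simplified JSON-RPC-style dict with either "result" or "error".
    """
    status = tasks.get(task_id)
    if status is None:
        return {"error": {"code": INVALID_PARAMS, "message": f"unknown task: {task_id}"}}
    if status in TERMINAL_STATUSES:
        return {"error": {"code": INVALID_PARAMS, "message": f"task already {status}"}}
    tasks[task_id] = "cancelled"
    return {"result": {"taskId": task_id, "status": "cancelled"}}
```

A compliance test would run the same three cases (unknown ID, terminal task, live task) against the real gateway and assert the error codes, using this model only as the expected-behavior oracle.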

Phase 8: Authorization (Conditional Lane)

When authorization is enabled:

  • HTTP implementations should conform to MCP auth specification
  • Protected Resource Metadata (RFC9728) requirements:
    • MCP servers must implement protected resource metadata discovery
    • MCP clients must use protected resource metadata for authorization server discovery
    • WWW-Authenticate handling and fallback well-known behavior
  • Token transport rules:
    • access token must be in Authorization header
    • token must not appear in query string
  • Discovery requirements:
    • MCP authorization servers must provide at least one of: OAuth 2.0 Authorization Server Metadata or OpenID Connect Discovery
    • MCP clients must support both discovery mechanisms
  • PKCE and redirect-uri security requirements
    • clients must verify PKCE support from authorization server metadata and refuse auth flow if unsupported
  • Token passthrough anti-pattern checks (MCP server must not accept tokens not issued for itself)
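The token transport rules above lend themselves to a simple request-level check. This is a sketch: the function name is hypothetical, the Bearer scheme is assumed, and access_token is assumed to be the query parameter name to reject (the conventional OAuth name); a real test might scan for other token-like parameters too.

```python
from urllib.parse import parse_qs, urlsplit


def token_transport_problems(url: str, headers: dict[str, str]) -> list[str]:
    """Check that the access token travels in the Authorization header only."""
    problems = []
    # HTTP header names are case-insensitive.
    auth = next((v for k, v in headers.items() if k.lower() == "authorization"), "")
    scheme, _, token = auth.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        problems.append("access token must be sent in the Authorization header")
    # keep_blank_values=True so that "?access_token=" is still flagged.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "access_token" in query:
        problems.append("access token must not appear in the query string")
    return problems
```

Running this against every request captured by the HTTP fixture gives one assertion that covers both rules.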

Phase 9: Security and Abuse Resistance

  • Input validation for all protocol messages/method params
  • URI and root boundary validation (path traversal and boundary escape)
  • Sensitive data controls:
    • no credentials in logs/error payloads
    • URL elicitation safety requirements
  • Streamable HTTP session security:
    • session ID handling and hijacking mitigations
  • Optional icon-security checks where UI consumes icon metadata
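For the root boundary item above, a path-traversal check could be sketched as follows. is_within_root is a hypothetical helper that works purely on file:// URI strings; it normalizes ".." segments and percent-encoding but does not touch the filesystem, so symlink escapes are out of its reach and would need a separate test.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def is_within_root(root_uri: str, candidate_uri: str) -> bool:
    """True when candidate_uri resolves to the root path or something beneath it."""
    root = urlsplit(root_uri)
    candidate = urlsplit(candidate_uri)
    if root.scheme != "file" or candidate.scheme != "file":
        return False
    # Decode %2e%2e etc. first, then collapse "." and ".." segments.
    root_path = posixpath.normpath(unquote(root.path))
    candidate_path = posixpath.normpath(unquote(candidate.path))
    if candidate_path == root_path:
        return True
    # Compare with a trailing slash so /project-evil does not match /project.
    return candidate_path.startswith(root_path.rstrip("/") + "/")
```

The trailing-slash comparison matters: a plain prefix test would accept a sibling directory whose name merely starts with the root's name.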

Fixtures and Tooling

  • Build reusable fixtures for:
    • negotiated sessions (server role and client role)
    • capability matrices
    • transport matrix (stdio, streamable HTTP JSON response, streamable HTTP SSE response)
    • task lifecycle harness (polling + status + cancellation)
  • Add a schema manifest generator that extracts:
    • all method string literals
    • capability object paths
    • key type constraints (request IDs, task status values, logging levels)
  • Fail manual compliance guard runs if generated manifest and committed manifest diverge.
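One possible starting point for the method-literal part of the manifest generator is a regular expression over the schema source text. This is a sketch under a loud assumption: that schema.ts declares methods as double-quoted string literals on a method: property, as in the illustrative fragment below (which is not quoted from the real file). A production generator might prefer a real TypeScript parser.

```python
import re

# Matches e.g.  method: "tools/list";  and captures the literal.
METHOD_LITERAL = re.compile(r'\bmethod:\s*"([^"]+)"')

# Illustrative fragment only, written for this example.
SAMPLE_SCHEMA = '''
export interface PingRequest {
  method: "ping";
}
export interface InitializedNotification {
  method: "notifications/initialized";
}
'''


def extract_methods(schema_source: str) -> set[str]:
    """Collect every method string literal found in the schema source."""
    return set(METHOD_LITERAL.findall(schema_source))


def split_methods(methods: set[str]) -> tuple[list[str], list[str]]:
    """Split into (requests, notifications) by the notifications/ prefix."""
    notifications = sorted(m for m in methods if m.startswith("notifications/"))
    requests = sorted(m for m in methods if not m.startswith("notifications/"))
    return requests, notifications
```

The output of extract_methods is what would be written to (or compared against) manifest/schema_methods.yaml; capability paths and type constraints would need their own extraction passes.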

Manual Makefile Targets (No CI Yet)

make test-mcp-2025
make test-mcp-2025-core
make test-mcp-2025-tasks
make test-mcp-2025-auth
make test-mcp-2025-report

Implementation status (2026-02-15)

  • Manual target scaffolding is in place in Makefile:
    • test-mcp-2025
    • test-mcp-2025-core
    • test-mcp-2025-tasks
    • test-mcp-2025-auth
    • test-mcp-2025-report
  • The compliance suite directory itself (tests/compliance/mcp_2025_11_25) is not yet present in this branch.
  • CI integration for these targets remains intentionally deferred.

Current target behavior

  • Target root path defaults to: tests/compliance/mcp_2025_11_25
  • Marker defaults to: mcp20251125
  • Report target emits:
    • artifacts/mcp-2025-11-25/junit.xml
    • artifacts/mcp-2025-11-25/report.md
  • All targets fail fast with a clear message if the suite path is missing
  • CI wiring is intentionally deferred in this epic

Useful local overrides

# Override suite path while scaffolding
make test-mcp-2025 MCP_2025_TEST_DIR=tests/compliance/mcp_2025_11_25

# Add extra pytest filters or flags
make test-mcp-2025-core MCP_2025_PYTEST_ARGS="-k initialize -x"

Definition of Done

Per Test

  • Linked to at least one normative source location
  • Tagged with requirement level (MUST/SHOULD/MAY)
  • Deterministic and isolated
  • Clear pass/fail diagnostics

Epic Complete

  • 100% schema method inventory coverage (all request + notification methods mapped)
  • 100% MUST requirements mapped and passing in core lane
  • SHOULD/MAY requirements mapped with explicit pass/skip policy
  • Tasks lane implemented and passing
  • Core transport lane passes on stdio + Streamable HTTP
  • Conditional auth lane implemented (gated by config)
  • Compliance report artifacts generated by manual Makefile run

Success Criteria

| Metric | Target |
| --- | --- |
| Schema method coverage | 100% |
| MUST requirement pass rate (core) | 100% |
| SHOULD requirement test presence | 100% |
| Flaky rate | < 1% |
| Core suite runtime | < 8 minutes |

Related Code Areas

  • mcpgateway/validation/jsonrpc.py
  • mcpgateway/transports/
  • mcpgateway/main.py
  • mcpgateway/services/
  • mcpgateway/handlers/
  • tests/unit/mcpgateway/
  • tests/integration/

Resolved Decisions

  1. This epic targets only the latest MCP spec version (2025-11-25).
  2. No previous-version compatibility scope in this epic.
  3. Execution model is manual via Makefile targets; CI target integration is deferred.

Labels

  • SHOULD: P2: Important but not vital; high-value items that are not crucial for the immediate release
  • enhancement: New feature or request
  • mcp-protocol: Alignment with MCP protocol or specification
  • test-automation: Automated testing
  • testing: Testing (unit, e2e, manual, automated, etc)
