
[EPIC][TESTING]: Slow Time Server - configurable-latency MCP server for timeout, resilience, and load testing #2783

@crivetimihai

Description


⏱ Epic: Slow Time Server - Configurable-Latency MCP Server for Timeout, Resilience, and Load Testing

Goal

Create a Go MCP server (mcp-servers/go/slow-time-server) modelled on the existing fast-time-server that introduces configurable artificial latency on every tool call, resource fetch, and prompt render. This server serves as a first-class testing target for validating gateway timeout enforcement, circuit breaker behaviour, session pool resilience, and load testing under realistic slow-tool conditions.

Why Now?

Issue #2781 exposed that users with long-running tools hit the gateway's 60-second default timeout and receive a confusing empty error (Tool invocation failed: ). While the timeout behaviour is correct, we lack a reproducible, self-contained test target for:

  1. Timeout Verification: Validate that TOOL_TIMEOUT, per-tool timeout_ms, MCP_SESSION_POOL_TRANSPORT_TIMEOUT, and HTTPX_READ_TIMEOUT all behave correctly and interact as documented
  2. Circuit Breaker Testing: Exercise the CircuitBreakerPlugin with deterministic slow/failing tools — half-open recovery, retry headers, threshold tuning
  3. Session Pool Resilience: Test MCP_SESSION_POOL_* settings under sustained slow-tool load — pool exhaustion, acquire timeouts, stale session eviction
  4. Load Testing: Extend the existing Locust test suite (tests/loadtest/) with slow-tool scenarios that model real-world MCP servers with non-trivial latency
  5. Error Message Propagation: Provide a test harness for [ENHANCEMENT][OBSERVABILITY]: Preserve timeout error message through ExceptionGroup unwrapping in tool invocations #2782 (timeout error message lost through ExceptionGroup wrapping)
  6. Regression Prevention: PR Enforce per-tool timeouts and enhanced circuit breaker plugin #2569 introduced timeout enforcement — a dedicated slow server catches regressions in timeout handling across all transports (SSE, StreamableHTTP, stdio)

The fast-time-server is excellent for throughput benchmarks but always returns instantly. Real MCP servers (compliance screening, data pipelines, LLM inference) routinely take 30-300+ seconds. We need a counterpart that simulates this.


📖 User Stories

US-1: Gateway Developer - Validate Timeout Enforcement

As a Gateway Developer
I want an MCP server with configurable response latency
So that I can validate that timeout enforcement works correctly across all transports

Acceptance Criteria:

Given the slow-time-server is running with default latency 5s
And the gateway has TOOL_TIMEOUT=3
When I invoke the get_slow_time tool via StreamableHTTP
Then the gateway should return ToolTimeoutError after ~3s
And the error message should include the timeout value
And the structured log should contain event=tool_timeout with timeout_seconds=3

Technical Requirements:

  • Configurable default latency via -latency flag and DEFAULT_LATENCY env var
  • Per-tool latency override via tool arguments (delay_seconds parameter)
  • Support all transports: stdio, SSE, StreamableHTTP, dual, REST
  • Latency applied via time.Sleep() before returning result
US-2: QA Engineer - Test Circuit Breaker with Slow Tools

As a QA Engineer
I want tools that intermittently fail or timeout
So that I can test circuit breaker open/half-open/closed state transitions

Acceptance Criteria:

Given the slow-time-server has failure_rate=0.5 and latency=10s
And the CircuitBreakerPlugin has threshold=3 and reset=30s
When I invoke the tool 5 times rapidly
Then ~2-3 invocations should timeout
And the circuit breaker should open after 3 failures
And subsequent calls should fail fast with "circuit open"
And after 30s the circuit should enter half-open state

Technical Requirements:

  • Configurable failure rate (-failure-rate flag, 0.0-1.0)
  • Failure modes: timeout (sleep beyond gateway timeout), error (return error), panic (simulate crash)
  • Deterministic mode (-seed flag) for reproducible test runs
US-3: Performance Engineer - Load Test with Realistic Latency

As a Performance Engineer
I want to run Locust load tests against slow tools
So that I can measure gateway behaviour under sustained slow-tool load

Acceptance Criteria:

Given the slow-time-server is running in dual mode
And a Locust test is configured with 50 concurrent users
When each user invokes get_slow_time with delay_seconds=2
Then the gateway should handle concurrent requests within pool limits
And session pool metrics should show pool utilisation
And no requests should fail due to pool exhaustion (within configured limits)

Technical Requirements:

  • Locust test file: tests/loadtest/locustfile_slow_time_server.py
  • Test scenarios: gradual ramp-up, sustained load, spike, timeout storms
  • Metrics: p50/p95/p99 latency, error rate, pool utilisation
US-4: Developer - Variable Latency Distribution

As a Developer
I want tools with variable latency distributions (uniform, normal, exponential)
So that I can simulate realistic tool response time patterns

Acceptance Criteria:

Given the slow-time-server is configured with:
  -latency-distribution=normal -latency-mean=5 -latency-stddev=2
When I invoke the tool 100 times
Then response times should follow a normal distribution (mean ~5s, stddev ~2s)
And some calls should naturally exceed the gateway timeout
And the distribution should be observable in metrics

Technical Requirements:

  • Distributions: fixed (default), uniform (min/max), normal (mean/stddev), exponential (lambda)
  • Per-invocation override via delay_seconds tool argument (always fixed)
  • /metrics endpoint exposing latency histogram
US-5: Integration Tester - Multi-Tool Latency Profiles

As an Integration Tester
I want multiple tools with different latency profiles on the same server
So that I can test per-tool timeout_ms overrides and mixed-latency scenarios

Acceptance Criteria:

Given the slow-time-server exposes:
  - get_slow_time (configurable delay)
  - convert_slow_time (configurable delay)
  - get_instant_time (always instant, 0ms)
  - get_timeout_time (always exceeds 5min, for timeout testing)
When I register tools with different timeout_ms values:
  - get_slow_time: timeout_ms=10000
  - get_timeout_time: timeout_ms=5000
Then each tool should respect its per-tool timeout independently
US-6: DevOps Engineer - Docker Compose Integration

As a DevOps Engineer
I want the slow-time-server in the performance testing Docker Compose stack
So that I can run timeout and resilience tests in CI/CD

Acceptance Criteria:

Given docker-compose-performance.yml includes the slow-time-server
When I run: docker compose -f docker-compose-performance.yml up
Then the slow-time-server should be accessible on port 8081
And the gateway should have it pre-registered as a gateway
And Locust tests can target it alongside the fast-time-server

🏗 Architecture

Tool Inventory

| Tool | Description | Latency Behaviour | Purpose |
|------|-------------|-------------------|---------|
| get_slow_time | Get current time with configurable delay | Respects delay_seconds arg, falls back to server default | General timeout testing |
| convert_slow_time | Convert time between timezones with delay | Same as get_slow_time | Per-tool timeout_ms testing |
| get_instant_time | Get current time with zero delay | Always instant (0 ms) | Baseline / control tool |
| get_timeout_time | Get current time with extreme delay | Always sleeps 10 minutes | Guaranteed timeout testing |
| get_flaky_time | Get current time with random failures | Fails based on -failure-rate | Circuit breaker testing |

Server Flags

Usage: slow-time-server [flags]

Flags:
  -transport string     Transport: stdio, http, sse, dual, rest (default "stdio")
  -addr string          Bind address (default "0.0.0.0")
  -port int             Port (default 8081)
  -auth-token string    Bearer auth token
  -log-level string     Log level: debug, info, warn, error (default "info")

Latency Configuration:
  -latency duration           Default tool latency (default 5s)
  -latency-distribution string  Distribution: fixed, uniform, normal, exponential (default "fixed")
  -latency-min duration       Min latency for uniform distribution (default 1s)
  -latency-max duration       Max latency for uniform distribution (default 10s)
  -latency-mean duration      Mean for normal distribution (default 5s)
  -latency-stddev duration    Stddev for normal distribution (default 2s)

Failure Simulation:
  -failure-rate float    Probability of failure for flaky tool (0.0-1.0, default 0.0)
  -failure-mode string   Failure type: timeout, error, panic (default "timeout")
  -seed int              Random seed for reproducibility (default: time-based)

Environment Variables:
  DEFAULT_LATENCY              Override -latency (e.g., "5s", "30s", "2m")
  FAILURE_RATE                 Override -failure-rate
  AUTH_TOKEN                   Override -auth-token

Tool Input Schema

{
  "name": "get_slow_time",
  "description": "Get current system time with configurable artificial delay. Use delay_seconds to control response latency for timeout testing.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "timezone": {
        "type": "string",
        "description": "IANA timezone name (default: UTC)",
        "default": "UTC"
      },
      "delay_seconds": {
        "type": "number",
        "description": "Override delay in seconds. If omitted, uses server default latency.",
        "minimum": 0,
        "maximum": 600
      }
    }
  },
  "annotations": {
    "readOnlyHint": true,
    "destructiveHint": false,
    "openWorldHint": false
  }
}
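
On the Go side, this input schema maps naturally onto an argument struct decoded with the stdlib. SlowTimeArgs and parseSlowTimeArgs are illustrative names, not the final implementation:

```go
package main

import "encoding/json"

// SlowTimeArgs mirrors the get_slow_time input schema. A nil
// DelaySeconds means "use the server default latency" — the pointer
// distinguishes an omitted field from an explicit 0.
type SlowTimeArgs struct {
	Timezone     string   `json:"timezone"`
	DelaySeconds *float64 `json:"delay_seconds"`
}

// parseSlowTimeArgs decodes a tools/call arguments payload.
func parseSlowTimeArgs(raw []byte) (SlowTimeArgs, error) {
	args := SlowTimeArgs{Timezone: "UTC"} // schema default
	err := json.Unmarshal(raw, &args)
	return args, err
}
```

Range validation (0–600) would still need to be enforced in the handler, since encoding/json does not apply JSON Schema constraints.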

Data Flow

sequenceDiagram
    participant Client as MCP Client / Locust
    participant Gateway as ContextForge Gateway
    participant Plugin as CircuitBreakerPlugin
    participant Server as slow-time-server

    Client->>Gateway: tools/call: get_slow_time {delay_seconds: 30}
    Gateway->>Plugin: tool_pre_invoke
    Plugin-->>Gateway: continue (circuit closed)
    Gateway->>Server: session.call_tool("get_slow_time", {delay_seconds: 30})

    Note over Server: time.Sleep(30s)

    alt Gateway timeout fires first (TOOL_TIMEOUT=10)
        Gateway-->>Gateway: asyncio.wait_for → TimeoutError after 10s
        Gateway->>Plugin: tool_post_invoke (timeout failure)
        Plugin-->>Plugin: Increment failure count
        Gateway-->>Client: ToolTimeoutError: "timed out after 10s"
    else Server responds within timeout
        Server-->>Gateway: {time: "2026-02-09T...", delayed_by: "30s"}
        Gateway->>Plugin: tool_post_invoke (success)
        Gateway-->>Client: ToolResult {time, delayed_by}
    end

📋 Implementation Tasks

Phase 1: Core Server Implementation

  • Scaffold project structure

    • Create mcp-servers/go/slow-time-server/ directory
    • go.mod with github.com/mark3labs/mcp-go dependency (same version as fast-time-server)
    • main.go — server implementation
    • Makefile — mirroring fast-time-server targets
    • Dockerfile — multi-stage build (scratch base)
    • .gitignore, .golangci.yml, staticcheck.conf
  • Implement latency engine

    • Fixed latency: time.Sleep(duration)
    • Uniform distribution: rand.Float64() * (max - min) + min
    • Normal distribution: rand.NormFloat64() * stddev + mean (clamped to 0)
    • Exponential distribution: rand.ExpFloat64() / lambda
    • Per-call delay_seconds override (always fixed, bypasses distribution)
    • Context-aware sleep: respect context.Done() for clean cancellation
  • Implement MCP tools (5 tools)

    • get_slow_time — configurable delay, timezone support
    • convert_slow_time — configurable delay, timezone conversion
    • get_instant_time — zero delay (control/baseline)
    • get_timeout_time — 10-minute fixed delay (guaranteed timeout)
    • get_flaky_time — probabilistic failure based on failure_rate
  • Implement MCP resources (2 resources)

    • latency://config — current server latency configuration (JSON)
    • latency://stats — invocation count, avg/p50/p95/p99 latency, failure count
  • Implement MCP prompts (1 prompt)

    • test_timeout — generates a prompt instructing the LLM to invoke slow tools with specific delay to test timeout behaviour
  • Multi-transport support

    • stdio, SSE, HTTP (StreamableHTTP), dual, REST — same as fast-time-server
    • Auth middleware reuse
    • CORS and logging middleware

Phase 2: Failure Simulation

  • Failure modes for get_flaky_time

    • timeout — sleep for 10x the configured latency (exceed any reasonable timeout)
    • error — return mcp.NewToolResultError("simulated failure")
    • panic — panic("simulated crash") with recovery middleware
    • Deterministic seeding (-seed flag) for reproducible test runs
  • Response metadata

    • Include delayed_by, server_default_latency, distribution, failure_simulated in tool response
    • This allows test assertions to verify the server behaved as configured

Phase 3: REST API & Observability

  • REST endpoints (in dual and rest modes)

    • GET /api/v1/time?timezone=X&delay=Y — slow time with delay
    • GET /api/v1/config — current latency config
    • GET /api/v1/stats — invocation statistics
    • POST /api/v1/config — runtime latency reconfiguration (hot-reload)
    • GET /health — health check (always instant, no delay)
    • GET /version — version info
  • Prometheus metrics endpoint

    • GET /metrics — tool_invocations_total, tool_latency_seconds (histogram), tool_failures_total
    • Labels: tool_name, failure_mode, transport

Phase 4: Testing & Quality

  • Unit tests (main_test.go)

    • Test each tool handler with various delay values
    • Test latency distributions produce expected ranges
    • Test failure rate simulation (deterministic seed)
    • Test context cancellation during sleep
    • Test REST endpoints
    • Test auth middleware
  • Locust load test (tests/loadtest/locustfile_slow_time_server.py)

    • Scenario: gradual ramp-up (1→50 users, 2s delay per tool call)
    • Scenario: timeout storm (all users hit 120s delay with TOOL_TIMEOUT=60)
    • Scenario: mixed latency (instant + slow + timeout tools)
    • Scenario: circuit breaker exercise (50% failure rate)
    • Assertions on p95 latency, error rate, timeout rate
  • Integration test with gateway

    • Register slow-time-server as gateway
    • Test TOOL_TIMEOUT enforcement via StreamableHTTP
    • Test per-tool timeout_ms override
    • Test MCP_SESSION_POOL_TRANSPORT_TIMEOUT interaction
    • Test circuit breaker plugin activation on repeated timeouts

Phase 5: Docker & CI Integration

  • Dockerfile — multi-stage, scratch base, ~2 MiB image
  • Add to docker-compose-performance.yml
    • Service: slow-time-server on port 8081
    • Default latency: 5s
    • Environment: DEFAULT_LATENCY=5s, FAILURE_RATE=0.1
  • CI workflow (optional)
    • Build and test in GitHub Actions
    • Publish container image alongside fast-time-server

Phase 6: Documentation

  • README.md — usage, flags, examples, Docker, tool reference
  • Update mcp-servers/AGENTS.md — add slow-time-server entry
  • Update docs/docs/manage/tuning.md — reference slow-time-server for timeout tuning validation

⚙️ Configuration Examples

Basic: 5-second latency for timeout testing

./slow-time-server -transport=dual -port=8081 -latency=5s

Circuit breaker testing: 30% failure rate

./slow-time-server -transport=dual -port=8081 \
  -latency=2s -failure-rate=0.3 -failure-mode=error -seed=42

Realistic distribution: normal with occasional outliers

./slow-time-server -transport=dual -port=8081 \
  -latency-distribution=normal -latency-mean=5s -latency-stddev=3s

Docker Compose

# docker-compose-performance.yml
services:
  slow-time-server:
    build:
      context: ./mcp-servers/go/slow-time-server
    ports:
      - "8081:8081"
    environment:
      DEFAULT_LATENCY: "5s"
      FAILURE_RATE: "0.1"
    command: ["-transport=dual", "-port=8081", "-addr=0.0.0.0"]

Gateway registration

# Register the slow-time-server as a gateway
curl -X POST http://localhost:4444/api/v1/gateways \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "slow-time-server",
    "url": "http://slow-time-server:8081",
    "transport": "streamablehttp"
  }'

# Override timeout for the guaranteed-timeout tool
curl -X PATCH http://localhost:4444/api/v1/tools/<get_timeout_time_id> \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"timeout_ms": 5000}'

✅ Success Criteria

  • Functionality: All 5 tools work across all transports (stdio, SSE, StreamableHTTP, dual, REST)
  • Latency: Configurable via flags, env vars, and per-call delay_seconds argument
  • Distributions: Fixed, uniform, normal, and exponential latency distributions produce correct ranges
  • Failure Simulation: get_flaky_time fails at configured rate with deterministic seeding
  • Gateway Integration: Successfully registered and invoked through ContextForge gateway
  • Timeout Testing: Gateway TOOL_TIMEOUT, per-tool timeout_ms, and pool transport timeout all validated
  • Circuit Breaker: CircuitBreakerPlugin correctly opens/closes with configurable failure rates
  • Load Testing: Locust scenarios run successfully with expected latency profiles
  • Docker: Builds as ~2 MiB scratch image, runs in docker-compose-performance.yml
  • Testing: Unit tests pass with race detection (go test -race), >80% coverage
  • Documentation: README with full flag reference, examples, and Docker instructions
  • Quality: Passes make lint, make vet, make staticcheck

🏁 Definition of Done

  • Go module scaffolded in mcp-servers/go/slow-time-server/
  • 5 MCP tools implemented with configurable latency
  • 2 MCP resources (config, stats) implemented
  • 4 latency distributions (fixed, uniform, normal, exponential) working
  • Failure simulation with configurable rate and mode
  • All 5 transports supported (stdio, SSE, HTTP, dual, REST)
  • Unit tests with race detection and >80% coverage
  • Locust load test file created in tests/loadtest/
  • Dockerfile producing scratch-based ~2 MiB image
  • Added to docker-compose-performance.yml
  • README.md with complete documentation
  • Passes make lint vet staticcheck test

📝 Additional Notes

🔹 Relationship to fast-time-server: The slow-time-server shares the same MCP SDK, transport infrastructure, auth middleware, and code patterns as fast-time-server. It adds latency injection and failure simulation on top. Consider extracting shared code into a common package if duplication becomes significant.

🔹 Context-aware sleep: Use select { case <-ctx.Done(): return ctx.Err(); case <-time.After(delay): } rather than bare time.Sleep() so that tool cancellations propagate cleanly and don't leave goroutines hanging.

🔹 Runtime reconfiguration: The POST /api/v1/config endpoint allows changing latency parameters without restarting the server. This is useful for scripted test scenarios that need different latency profiles in sequence.

🔹 Deterministic mode: The -seed flag ensures get_flaky_time and distribution-based latencies produce the same sequence across runs, enabling reproducible CI test assertions.

🔹 Port convention: fast-time-server uses 8080, slow-time-server uses 8081, avoiding conflicts in compose stacks.


🔗 Related Issues

Metadata

Labels: SHOULD (P2: important but not vital; high-value items that are not crucial for the immediate release), enhancement (new feature or request), epic (large feature spanning multiple issues), go (Go programming), mcp-servers (MCP Server Samples), testing (unit, e2e, manual, automated, etc.)