[compliance] Compliance Gap: Health Monitoring Missing Periodic Checks and Auto-Restart (Section 8) #2988

@github-actions

MCP Gateway Compliance Review — 2026-04-01

Summary

Found 1 ongoing compliance gap (SHOULD violation) during daily review of commit 1631b99 (refactor: Go SDK usage improvements from module review #2967).

Note: The recent commit introduced no new compliance issues. It contains code quality improvements: TTL-based eviction for filteredServerCache, a paginateAll pagination helper, elimination of an intermediate resourceContents type, and documentation additions. All changes were reviewed against the specification, and no regressions were detected.


Recent Changes Reviewed

| File | Change |
| --- | --- |
| internal/server/routed.go | TTL-based cache eviction for filteredServerCache (30-min TTL matching SDK SessionTimeout) |
| internal/launcher/connection.go | Generic paginateAll[T]() helper replacing duplicate cursor-loop pagination |
| internal/launcher/tool_result.go | Removed intermediate resourceContents type, use sdk.ResourceContents directly |
| internal/launcher/mcptest/server.go | Explicit &sdk.ServerOptions{} instead of nil to sdk.NewServer() |
| internal/server/http_transport.go | Lifecycle/ownership documentation for transportConnector |

Important Issues (SHOULD violations)

Issue: Missing Periodic Health Checks and Automatic Restart Logic

Specification Section: 8 — Health Monitoring
Deep Link: https://github.com/github/gh-aw/blob/main/docs/src/content/docs/reference/mcp-gateway.md#8-health-monitoring

Requirements:

  • Periodic health checks (every 30 seconds recommended)
  • Automatic restart of failed stdio servers

Current State:

The /health endpoint (internal/server/health.go) correctly reports server status by querying GetServerStatus(), which now uses launcher.GetServerState() to return real state (status + uptime). This is an improvement: the prior regression, in which status was always "running" with Uptime=0, has been addressed.

However, the following SHOULD requirements remain unimplemented:

  1. No periodic health checks — There is no background goroutine or ticker in internal/server/ or internal/launcher/ that periodically polls backend server health. Health is only evaluated lazily when /health is requested.

  2. No automatic restart — Failed stdio servers (status "error" in launcher.GetServerState()) are not automatically restarted. Once a backend enters error state, it stays there until the gateway is restarted.

File References:

  • internal/server/health.go: /health handler, no periodic scheduling
  • internal/launcher/launcher.go:325-344: GetServerState() returns state but no restart is triggered
  • internal/server/tool_registry.go:77-95: Server launch is one-shot at startup, no watchdog

Severity: Important (SHOULD violation)


Compliance Status

| Section | Aspect | Status |
| --- | --- | --- |
| §3.2.1 | Containerization Requirement | ✅ Compliant |
| §4 | Configuration Validation | ✅ Compliant |
| §4.2.2 | Variable Expression Expansion (fail-fast) | ✅ Compliant |
| §5 | Protocol Behavior (JSON-RPC 2.0, routing) | ✅ Compliant |
| §6 | Server Isolation (per-container, per-session) | ✅ Compliant |
| §7 | Authentication (Authorization header, 401, no plaintext logging) | ✅ Compliant |
| §8 | Health Monitoring: state reporting | ✅ Compliant |
| §8 | Health Monitoring: periodic checks + auto-restart | ⚠️ Partial |
| §9 | Error Handling | ✅ Compliant |

Suggested Remediation Task

Task: Implement Periodic Health Monitoring and Restart Logic

Description:

  1. Add a background goroutine (ticker, every 30s) in the UnifiedServer or Launcher that calls GetServerState() for each configured backend.
  2. If a server is in "error" state, attempt to relaunch it (call the same initialization path used at startup).
  3. Log events using logger.LogWarn("backend", ...) for restart attempts and logger.LogError("backend", ...) for failures.

Files to Modify:

  • internal/launcher/launcher.go — Add restart method + watchdog goroutine
  • internal/server/tool_registry.go — Wire up watchdog at startup
  • internal/server/unified.go — Coordinate watchdog lifecycle with shutdown

Specification Reference:
https://github.com/github/gh-aw/blob/main/docs/src/content/docs/reference/mcp-gateway.md#8-health-monitoring

Estimated Effort: Medium (4–8 hours)


References

  • MCP Gateway Specification §8
  • Commits reviewed: 1631b99 (HEAD, only commit in shallow clone)
  • Previous compliance run: 2026-03-31 (commit 3b4c53f)

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

  • get_file_contents: has lower integrity than the agent requires. The agent cannot read data with integrity below "unapproved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Daily Compliance Checker
