[compliance] Compliance Gap: Health Monitoring Missing Periodic Checks and Auto-Restart (Section 8) #2988
Description
MCP Gateway Compliance Review — 2026-04-01
Summary
Found 1 ongoing compliance gap (SHOULD violation) during daily review of commit 1631b99 (refactor: Go SDK usage improvements from module review #2967).
Note: The recent commit introduced no new compliance issues. It contains code quality improvements: TTL-based eviction for filteredServerCache, a paginateAll pagination helper, elimination of an intermediate resourceContents type, and documentation additions. All reviewed against the specification — no regressions detected.
Recent Changes Reviewed
| File | Change |
|---|---|
| `internal/server/routed.go` | TTL-based cache eviction for `filteredServerCache` (30-min TTL matching SDK `SessionTimeout`) |
| `internal/launcher/connection.go` | Generic `paginateAll[T]()` helper replacing duplicate cursor-loop pagination |
| `internal/launcher/tool_result.go` | Removed intermediate `resourceContents` type, use `sdk.ResourceContents` directly |
| `internal/launcher/mcptest/server.go` | Explicit `&sdk.ServerOptions{}` instead of `nil` to `sdk.NewServer()` |
| `internal/server/http_transport.go` | Lifecycle/ownership documentation for `transportConnector` |
Important Issues (SHOULD violations)
Issue: Missing Periodic Health Checks and Automatic Restart Logic
Specification Section: 8 — Health Monitoring
Deep Link: https://github.com/github/gh-aw/blob/main/docs/src/content/docs/reference/mcp-gateway.md#8-health-monitoring
Requirements:
- Periodic health checks (every 30 seconds recommended)
- Automatic restart of failed stdio servers
Current State:
The /health endpoint (internal/server/health.go) correctly reports server status by querying GetServerStatus(), which now uses launcher.GetServerState() to return real state (status + uptime). This was improved — the prior regression where status was always "running" with Uptime=0 has been addressed.
However, the following SHOULD requirements remain unimplemented:
- **No periodic health checks:** There is no background goroutine or ticker in `internal/server/` or `internal/launcher/` that periodically polls backend server health. Health is only evaluated lazily when `/health` is requested.
- **No automatic restart:** Failed stdio servers (status `"error"` in `launcher.GetServerState()`) are not automatically restarted. Once a backend enters error state, it stays there until the gateway is restarted.
File References:
- `internal/server/health.go`: `/health` handler, no periodic scheduling
- `internal/launcher/launcher.go:325-344`: `GetServerState()` returns state but no restart is triggered
- `internal/server/tool_registry.go:77-95`: server launch is one-shot at startup, no watchdog
Severity: Important (SHOULD violation)
Compliance Status
| Section | Aspect | Status |
|---|---|---|
| §3.2.1 | Containerization Requirement | ✅ Compliant |
| §4 | Configuration Validation | ✅ Compliant |
| §4.2.2 | Variable Expression Expansion (fail-fast) | ✅ Compliant |
| §5 | Protocol Behavior (JSON-RPC 2.0, routing) | ✅ Compliant |
| §6 | Server Isolation (per-container, per-session) | ✅ Compliant |
| §7 | Authentication (Authorization header, 401, no plaintext logging) | ✅ Compliant |
| §8 | Health Monitoring — state reporting | ✅ Compliant |
| §8 | Health Monitoring — periodic checks + auto-restart | |
| §9 | Error Handling | ✅ Compliant |
Suggested Remediation Task
Task: Implement Periodic Health Monitoring and Restart Logic
Description:
- Add a background goroutine (ticker, every 30s) in the `UnifiedServer` or `Launcher` that calls `GetServerState()` for each configured backend.
- If a server is in `"error"` state, attempt to relaunch it (call the same initialization path used at startup).
- Log events using `logger.LogWarn("backend", ...)` for restart attempts and `logger.LogError("backend", ...)` for failures.
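The steps above could be sketched roughly as follows. This is a minimal, standalone illustration and not the gateway's actual code: the `Backend` interface, `checkOnce`, `runWatchdog`, and `fakeBackend` are hypothetical names introduced here, and the standard `log` package stands in for the project's `logger.LogWarn` / `logger.LogError` calls. A real implementation would read state through `GetServerState()` and reuse the startup initialization path for the restart.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"sync"
	"time"
)

// Backend is a hypothetical abstraction over one stdio server.
// The gateway's real launcher types will differ.
type Backend interface {
	ID() string
	Status() string // e.g. "running" or "error"
	Restart() error
}

// checkOnce performs a single health pass: every backend reporting
// "error" gets one restart attempt. It returns the IDs it tried to restart.
func checkOnce(backends []Backend) []string {
	var attempted []string
	for _, b := range backends {
		if b.Status() != "error" {
			continue
		}
		attempted = append(attempted, b.ID())
		// Stand-in for logger.LogWarn("backend", ...).
		log.Printf("WARN backend %s: in error state, attempting restart", b.ID())
		if err := b.Restart(); err != nil {
			// Stand-in for logger.LogError("backend", ...).
			log.Printf("ERROR backend %s: restart failed: %v", b.ID(), err)
		}
	}
	return attempted
}

// runWatchdog runs checkOnce on every tick until ctx is cancelled,
// so the caller can tie the watchdog lifetime to gateway shutdown.
func runWatchdog(ctx context.Context, interval time.Duration, backends []Backend) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			checkOnce(backends)
		}
	}
}

// fakeBackend is a test double used only for this demonstration.
type fakeBackend struct {
	mu       sync.Mutex
	id       string
	status   string
	restarts int
}

func (f *fakeBackend) ID() string { return f.id }

func (f *fakeBackend) Status() string {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.status
}

// Restart simulates a successful relaunch.
func (f *fakeBackend) Restart() error {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.restarts++
	f.status = "running"
	return nil
}

func main() {
	b := &fakeBackend{id: "demo", status: "error"}
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		// The spec recommends 30 seconds; a short interval keeps the demo fast.
		runWatchdog(ctx, 10*time.Millisecond, []Backend{b})
	}()
	time.Sleep(50 * time.Millisecond)
	cancel() // mirrors stopping the watchdog during gateway shutdown
	<-done
	fmt.Println("status after watchdog run:", b.Status())
}
```

Separating the single pass (`checkOnce`) from the ticker loop keeps the restart logic testable without timers, and passing a `context.Context` into the loop gives the shutdown path one place to stop the goroutine. This sketch retries on every tick with no backoff or attempt limit; a production version would likely want both.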
Files to Modify:
- `internal/launcher/launcher.go`: add restart method + watchdog goroutine
- `internal/server/tool_registry.go`: wire up watchdog at startup
- `internal/server/unified.go`: coordinate watchdog lifecycle with shutdown
Specification Reference:
https://github.com/github/gh-aw/blob/main/docs/src/content/docs/reference/mcp-gateway.md#8-health-monitoring
Estimated Effort: Medium (4–8 hours)
References
- MCP Gateway Specification §8
- Commits reviewed: `1631b99` (HEAD, only commit in shallow clone)
- Previous compliance run: 2026-03-31 (commit `3b4c53f`)
Note
🔒 Integrity filter blocked 1 item
The following item was blocked because it doesn't meet the GitHub integrity level.
- get_file_contents
get_file_contents: has lower integrity than agent requires. The agent cannot read data with integrity below "unapproved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
    tools:
      github:
        min-integrity: approved # merged | approved | unapproved | none

Generated by Daily Compliance Checker