Problem
OpenClaw exposes /health and /healthz endpoints, but these only indicate that the gateway process is alive. They do not reflect whether the system is actually ready to process messages -- i.e., whether channel providers are connected, model providers are reachable, and the memory database is open.
This distinction matters for:
- Docker/Kubernetes deployments: liveness vs readiness probes are fundamentally different concepts. Using /health for both means containers are marked "ready" before channels are connected, leading to dropped messages during startup.
- LaunchAgent/systemd monitoring: Knowing the process is alive is not the same as knowing it can serve requests. A gateway with a crashed Discord provider but a live HTTP listener reports healthy while silently dropping all Discord messages.
- Automated update pipelines: After openclaw update, a post-update script needs to verify that the gateway is fully operational, not just that it started without crashing.
Proposal
Add a GET /ready endpoint that returns:
{
  "ready": true,
  "channels": {
    "discord": { "connected": true, "latencyMs": 45 },
    "telegram": { "connected": true, "latencyMs": 120 }
  },
  "memory": {
    "open": true,
    "chunks": 1247
  },
  "uptime": 3600
}
When any critical subsystem is not ready, return HTTP 503 with "ready": false and identify the failing component in the response body.
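The aggregation rule could look roughly like the sketch below. It assumes that every channel and the memory database count as critical, and it uses a hypothetical "failing" field to name the components that are not ready. The proposal does not fix either detail, and the function name is illustrative only.

```python
from typing import Any


def build_ready_response(
    channels: dict[str, dict[str, Any]],
    memory: dict[str, Any],
    uptime: int,
) -> tuple[int, dict[str, Any]]:
    """Return (HTTP status, JSON body) for a GET /ready request.

    Assumption: every channel and the memory database are critical.
    """
    # Collect the names of subsystems that are not ready.
    failing = [
        f"channels.{name}"
        for name, state in channels.items()
        if not state.get("connected", False)
    ]
    if not memory.get("open", False):
        failing.append("memory")

    body: dict[str, Any] = {
        "ready": not failing,
        "channels": channels,
        "memory": memory,
        "uptime": uptime,
    }
    if failing:
        # "failing" is a hypothetical field name, not part of the proposal.
        body["failing"] = failing
    return (503 if failing else 200), body
```

With a disconnected Telegram channel, this sketch would return status 503 and a body whose "failing" list contains "channels.telegram".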
Design considerations
- Auth: Should require the same auth as other gateway RPC endpoints (not unauthenticated like /healthz).
- Lightweight: Must not trigger expensive operations (no embedding calls, no LLM probes). Just check connection state and database handle validity.
- Extensible: Plugins/extensions should be able to register their own readiness checks.
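One way to satisfy the lightweight and extensible points together is a small registry of cheap check callbacks. The following is a minimal sketch, not OpenClaw's actual plugin API: the ReadinessRegistry class, its register and evaluate methods, and the critical flag are all hypothetical names introduced for illustration.

```python
from typing import Callable

# A check returns True when its subsystem is ready. It should only inspect
# cached state (connection flags, open handles), never call out to a model.
ReadinessCheck = Callable[[], bool]


class ReadinessRegistry:
    """Hypothetical registry that core subsystems and plugins add checks to."""

    def __init__(self) -> None:
        self._checks: dict[str, tuple[ReadinessCheck, bool]] = {}

    def register(self, name: str, check: ReadinessCheck, critical: bool = True) -> None:
        """Add a named check; non-critical checks never flip overall readiness."""
        if name in self._checks:
            raise ValueError(f"readiness check already registered: {name}")
        self._checks[name] = (check, critical)

    def evaluate(self) -> tuple[bool, dict[str, bool]]:
        """Run every check; return (overall ready, per-check results)."""
        results: dict[str, bool] = {}
        ready = True
        for name, (check, critical) in self._checks.items():
            try:
                ok = bool(check())
            except Exception:
                # A check that raises is treated as not ready.
                ok = False
            results[name] = ok
            if critical and not ok:
                ready = False
        return ready, results
```

A plugin would call something like registry.register("my-plugin", check_fn, critical=False) at load time, and the /ready handler would call registry.evaluate() on each request. Treating a raising check as "not ready" keeps one faulty plugin from turning the endpoint into a 500.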
Use cases
- Kubernetes: readinessProbe: httpGet: path: /ready ensures pods only receive traffic when fully initialized.
- Post-update verification: curl -f http://localhost:PORT/ready in update scripts to confirm successful restart.
- Monitoring dashboards: Display per-subsystem health status (memory DB, Discord connection, Telegram polling, etc.).
- Multi-agent setups: Orchestrator can check if a specific agent's gateway is ready before routing tasks.
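For the post-update and orchestrator cases, a single curl call may race the gateway's startup, so a caller would probably poll until the endpoint reports ready or a deadline passes. The sketch below illustrates that loop using only the Python standard library. The function names, the timeout defaults, and the bearer-token header are assumptions; the proposal only says the endpoint should use the same auth as other gateway RPC endpoints.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_ready(url: str, token: Optional[str] = None) -> tuple[int, dict]:
    """GET the readiness URL and return (HTTP status, parsed JSON body)."""
    request = urllib.request.Request(url)
    if token:
        # Assumption: a bearer token is accepted; the real auth scheme may differ.
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as err:
        # A 503 arrives here; its body should still describe the failure.
        try:
            return err.code, json.load(err)
        except ValueError:
            return err.code, {}


def wait_until_ready(
    url: str,
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    fetch: Callable[[str], tuple[int, dict]] = fetch_ready,
) -> bool:
    """Poll until /ready reports ready, or give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            status, body = fetch(url)
            if status == 200 and body.get("ready") is True:
                return True
        except (OSError, ValueError):
            # Connection refused during startup, or a non-JSON reply.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

An update script could call wait_until_ready("http://localhost:PORT/ready"), with PORT replaced by the gateway's port, and exit non-zero when it returns False. The fetch parameter exists so the loop can be exercised without a running gateway.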
References
- /health and /healthz can return Control UI HTML 200 instead of machine health payload #18446 (/healthz returns incorrect content type)