Description
Summary
OpenClaw does not validate config keys at write time. Any process (including agents themselves) can write an unrecognised or invalid key to openclaw.json, and it will be silently accepted. The gateway only validates the config schema on startup, so the invalid key sits undetected until the next restart — at which point the gateway crashes with a generic error that does not identify the offending key. In a systemd-managed deployment, this triggers a restart loop that can run for minutes, with all connected services (Telegram bots, etc.) going completely unresponsive and no obvious cause surfaced to the user.
Steps to Reproduce
- Start the OpenClaw gateway normally and confirm it's running and healthy.
- Add an unrecognised key to `openclaw.json` by any method (direct file edit, `jq`, or an agent writing via a script):

      {
        "agents": {
          "defaults": {
            "suppressToolErrorWarnings": true
          }
        }
      }

- Observe: no error is returned, no warning is logged. The gateway continues running with the old (valid) config.
- Restart the gateway (`openclaw gateway restart` or `systemctl --user restart openclaw-gateway`).
- The gateway fails to start. The error message is generic and does not name the invalid key.
- If running under systemd with `Restart=on-failure`, the service enters a crash loop.
Expected Behaviour
- Writing an invalid or unrecognised config key should be rejected immediately with a clear error message naming the offending key and its location in the config tree.
- At minimum, `openclaw config set` should validate against the current schema before persisting changes.
- The gateway startup error should identify exactly which key failed validation, not just report a generic schema error.
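To illustrate the kind of error being asked for, here is a minimal Python sketch of strict validation that reports the full path of each offending key. It is not OpenClaw's implementation: the `SCHEMA` fragment, its `model` and `verbose` keys, and the `find_invalid_keys` helper are all hypothetical and exist only to show the shape of the output.

```python
from typing import Any

# Hypothetical schema fragment, NOT OpenClaw's real schema.
# Nested dicts describe objects; Python types describe leaf values.
SCHEMA = {
    "agents": {
        "defaults": {
            "model": str,     # hypothetical key
            "verbose": bool,  # hypothetical key
        }
    }
}


def find_invalid_keys(config: Any, schema: Any, path: str = "") -> list[str]:
    """Return one message per unrecognised key or wrongly typed value."""
    errors: list[str] = []
    if isinstance(schema, dict):
        if not isinstance(config, dict):
            return [f"{path or '<root>'}: expected an object"]
        for key, value in config.items():
            key_path = f"{path}.{key}" if path else key
            if key not in schema:
                # Strict mode: anything the schema does not name is an error.
                errors.append(f"{key_path}: unrecognised key")
            else:
                errors.extend(find_invalid_keys(value, schema[key], key_path))
    elif not isinstance(config, schema):
        errors.append(f"{path}: expected {schema.__name__}")
    return errors


# The config from the reproduction steps.
bad_config = {"agents": {"defaults": {"suppressToolErrorWarnings": True}}}
print(find_invalid_keys(bad_config, SCHEMA))
# prints ['agents.defaults.suppressToolErrorWarnings: unrecognised key']
```

A message in this form gives both a human operator and an agent enough information to remove or correct the key without guessing.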
Actual Behaviour
- The invalid key is silently written to `openclaw.json` with no error or warning.
- The gateway continues running on the previously loaded (valid) config, with no indication anything is wrong.
- On next restart, the gateway crashes during config validation.
- The error message does not identify which key is invalid.
- Under systemd, the crash triggers automatic restarts, creating a crash loop with sustained CPU usage and complete service unavailability.
Impact
Operational: In one incident (Mar 2 2026), the gateway entered a crash loop of 28 restart cycles over ~14 minutes. CPU overheated from rapid restart cycling; all five Telegram bots were completely unresponsive. Fix was removing a single unrecognised key — gateway started cleanly on the first attempt.
Agentic use — this is a safety issue: In agentic deployments where agents modify gateway config programmatically, this bug is significantly worse. An agent writes what it believes is correct, receives no error, and has no way to know the write was invalid until the gateway crashes — potentially hours later. The agent may retry the same invalid write, compounding the problem. Four incidents in six days across one deployment:
| Date | Key written | Result |
|---|---|---|
| Feb 24 | `allowBots: true` (unrecognised) | Gateway crash on restart |
| Feb 25 | `allowBots: true` (agent retry, unaware first attempt failed) | Second crash |
| Mar 2 | `agents.defaults.suppressToolErrorWarnings: true` (unrecognised in strict schema) | 28-cycle crash loop, ~14 min downtime |
Documentation saying 'don't write config directly' is insufficient when the config writer is an autonomous agent.
Proposed Fix
Option 1 — Minimal: Document openclaw config set as the only supported config write path. Add a warning when openclaw.json is modified externally.
Option 2 — Better: Add openclaw config validate or openclaw doctor --check — validates current config against the active schema without starting the gateway. Agents can run this as a post-write check and roll back if validation fails.
Option 3 — Best: Atomic config updates with validation before write. If validation fails, reject the write, return the exact failing key path, leave existing config untouched. Keep .openclaw.json.bak for auto-rollback on bad startup config.
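As a rough sketch of what Option 3 could look like, the Python below validates a candidate config first, keeps a `.openclaw.json.bak` copy of the previous file, and only then swaps the new file into place with an atomic rename. Everything here is an assumption for illustration: `write_config_atomically`, `ConfigRejected`, the toy `validate` function, and the `ALLOWED_TOP_LEVEL` set are hypothetical names, not part of OpenClaw.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


class ConfigRejected(Exception):
    """Raised when a candidate config fails validation; nothing is written."""


def write_config_atomically(
    path: Path, new_config: dict, validate: Callable[[dict], list[str]]
) -> None:
    # 1. Validate before touching the disk, and report every failing key path.
    errors = validate(new_config)
    if errors:
        raise ConfigRejected("; ".join(errors))
    # 2. Keep the last known-good file, e.g. .openclaw.json.bak, for rollback.
    if path.exists():
        shutil.copy2(path, path.with_name(f".{path.name}.bak"))
    # 3. Write to a temp file in the same directory, then rename over the target.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(new_config, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)  # readers never see a half-written file
    finally:
        if os.path.exists(tmp_name):  # only true if the rename did not happen
            os.unlink(tmp_name)


# Toy validator with a hypothetical allow-list, standing in for a real schema.
ALLOWED_TOP_LEVEL = {"agents", "channels"}


def validate(config: dict) -> list[str]:
    return [f"{key}: unrecognised key" for key in config if key not in ALLOWED_TOP_LEVEL]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        config_path = Path(tmp_dir) / "openclaw.json"
        write_config_atomically(config_path, {"agents": {}}, validate)
        try:
            # Top-level placement of allowBots is illustrative only.
            write_config_atomically(config_path, {"allowBots": True}, validate)
        except ConfigRejected as exc:
            print(f"rejected: {exc}")
        # The existing config is untouched by the rejected write.
        print(config_path.read_text())
```

The point of the design is that the writer gets the failure immediately, with the key path in the exception, so an agent can react at write time rather than discovering the problem at the next gateway restart.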
Version
- OpenClaw v2026.2.25 and v2026.2.26 (commit `bc50708`)
- Issue present in both versions
Environment
- WSL2 Ubuntu (Windows Subsystem for Linux)
- systemd user service (`openclaw-gateway.service`) with `Restart=on-failure`
- Telegram channel provider
- Multi-agent deployment (5 agents, shared gateway config)