2026.5.2 upgrade silently wipes operator-customized gateway.systemd.env contents (EnvironmentFile contents reverted to bundled defaults) #76860

@liemnhoang

Summary

Upgrading from 2026.4.29 to 2026.5.2 (and/or running openclaw doctor post-upgrade) silently rewrites ~/.openclaw/gateway.systemd.env — the file declared as the gateway service's EnvironmentFile= in the systemd user unit. Operator-populated secrets are removed; only a small bundled-default subset (in our case: ALPACA_API_KEY, ALPACA_SECRET_KEY, SEATS_AERO_API_KEY) is preserved.

This is destructive and silent: no warning during the upgrade, no .bak left next to the wiped file, no log entry indicating the rewrite. Operators discover it only when downstream auth resolution fails.

Reproducer

  1. On 2026.4.29 (or earlier), populate ~/.openclaw/gateway.systemd.env with custom secrets — e.g. via the documented "move secrets out of openclaw.json env block" pattern that the systemd unit's EnvironmentFile=-/home/<user>/.openclaw/gateway.systemd.env directive enables. Note the file's mode is 0600 and the gateway expects to load it at spawn time.
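  2. Verify the keys are loaded into the running gateway: tr '\0' '\n' < /proc/$(systemctl --user show openclaw-gateway -p MainPID --value)/environ | grep '^<YOUR_KEY>=' (substitute a real key name for <YOUR_KEY>; note that this prints the secret's value).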
  3. Upgrade to 2026.5.2 (npm install -g openclaw@latest).
  4. Wait for the post-upgrade doctor cycle (or run openclaw doctor manually).
  5. Inspect ~/.openclaw/gateway.systemd.env again.

Expected: operator-populated keys preserved; perhaps a notice that the file was inspected.
Actual: file has been overwritten; only a small bundled-default subset of keys remains. No backup was created.
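
The before/after comparison in steps 1 and 5 can be made mechanical. The following is a minimal sketch, not part of OpenClaw: env_keys and removed_keys are hypothetical helper names, the key names in the demo are placeholders, and the parsing assumes plain KEY=value lines. It prints key names only, never values.

```shell
# Hypothetical helper: print the sorted variable names defined in an env file.
# Assumes simple KEY=value lines; comments and blank lines are skipped.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | LC_ALL=C sort -u
}

# Hypothetical helper: print keys present in the first file but absent
# from the second (i.e. keys that were removed).
removed_keys() {
  before_keys=$(mktemp)
  after_keys=$(mktemp)
  env_keys "$1" > "$before_keys"
  env_keys "$2" > "$after_keys"
  LC_ALL=C comm -23 "$before_keys" "$after_keys"
  rm -f "$before_keys" "$after_keys"
}

# Demo on throwaway files so the sketch runs without OpenClaw installed.
demo_dir=$(mktemp -d)
printf 'ALPACA_API_KEY=a\nDEEPSEEK_API_KEY=b\nOPENROUTER_API_KEY=c\n' > "$demo_dir/before.env"
printf 'ALPACA_API_KEY=a\n' > "$demo_dir/after.env"
removed_keys "$demo_dir/before.env" "$demo_dir/after.env"
```

In a real reproduction, one would copy the file aside before step 3 (for example with cp -p, which keeps the 0600 mode) and then pass that copy and ~/.openclaw/gateway.systemd.env to removed_keys after step 4.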

Evidence (from our incident, 2026-05-03)

Pre-upgrade gateway.systemd.env had 25 keys. Post-upgrade: only 3 remained (ALPACA_API_KEY, ALPACA_SECRET_KEY, SEATS_AERO_API_KEY). 22 of 25 secrets silently removed.

Post-upgrade wizard state in openclaw.json confirms doctor ran at upgrade time:

"wizard": {
  "lastRunAt": "2026-05-03T05:12:20.790Z",
  "lastRunVersion": "2026.5.2",
  "lastRunCommand": "doctor",
  "lastRunMode": "local"
}

Impact

  • Every provider that depends on env-sourced auth fails until secrets are restored. In our case: deepseek, anthropic, openrouter all returned FailoverError: No API key found for provider "..." for ~3 hours of production traffic before discovery.
  • Cascading agent failures: when the primary model is unreliable, the agent falls back to another model; the fallback's auth then fails because its secret is gone, and the agent's whole turn errors out.
  • No recovery path without an operator backup.
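
To shorten the discovery window, an operator could check the live gateway environment for required key names right after an upgrade. This is a rough sketch under stated assumptions: missing_from_environ is a hypothetical helper, the demo uses a fake environ file with placeholder key names, and values are assumed not to contain newlines.

```shell
# Hypothetical helper: print each required key name that is absent from a
# NUL-separated environ file (the format of /proc/<pid>/environ).
# Usage: missing_from_environ <environ-file> KEY [KEY...]
missing_from_environ() {
  environ_file=$1
  shift
  # Keep only the names; values are never printed.
  present=$(tr '\0' '\n' < "$environ_file" | cut -d= -f1)
  for key in "$@"; do
    printf '%s\n' "$present" | grep -Fxq -- "$key" || printf '%s\n' "$key"
  done
}

# Demo against a fake environ file so the sketch runs anywhere.
fake_environ=$(mktemp)
printf 'HOME=/home/demo\0ALPACA_API_KEY=a\0' > "$fake_environ"
missing_from_environ "$fake_environ" ALPACA_API_KEY DEEPSEEK_API_KEY
```

Against the real service, the first argument would be /proc/$(systemctl --user show openclaw-gateway -p MainPID --value)/environ, as in step 2 of the reproducer.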

Suggested fix

  1. Preserve operator-customized keys. When the upgrade or doctor rewrites gateway.systemd.env, MERGE existing keys with the new bundled defaults rather than replacing the file. Existing keys win.
  2. Always back up before rewrite. Drop a .bak.<timestamp> next to the file before overwriting.
  3. Notify on rewrite. Log to the systemd journal (visible via journalctl) whenever gateway.systemd.env is rewritten, and say so explicitly if operator-customized keys were removed.
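
As an illustration of the behavior requested above, here is a minimal shell sketch of a backup-then-merge rewrite in which existing keys win. It is not OpenClaw's code: merge_env_file, the defaults file, and the demo key values are all hypothetical, and the stderr message merely stands in for a journal entry.

```shell
# Hypothetical sketch: merge bundled defaults into an operator's env file.
# Existing lines are kept verbatim; a default is appended only when its key
# is not already defined. Assumes simple KEY=value lines.
merge_env_file() {
  target=$1     # operator's env file, e.g. gateway.systemd.env
  defaults=$2   # file holding bundled default KEY=value lines
  stamp=$(date -u +%Y%m%dT%H%M%SZ)

  # Create the target if missing so the merge has something to read.
  [ -f "$target" ] || { : > "$target"; chmod 600 "$target"; }

  # Back up first (cp -p preserves mode and timestamps).
  cp -p "$target" "$target.bak.$stamp"

  # Build the merged file next to the target, then move it into place.
  merged=$(mktemp "$target.tmp.XXXXXX")
  awk -F= '
    FILENAME == ARGV[1] {
      if ($0 ~ /^[A-Za-z_][A-Za-z0-9_]*=/) seen[$1] = 1
      print
      next
    }
    /^[A-Za-z_][A-Za-z0-9_]*=/ && !($1 in seen) { print }
  ' "$target" "$defaults" > "$merged"
  chmod 600 "$merged"
  mv "$merged" "$target"

  # Stand-in for a journal entry.
  echo "merge_env_file: rewrote $target (backup: $target.bak.$stamp); existing keys preserved" >&2
}

# Demo on throwaway files with placeholder keys and values.
merge_demo=$(mktemp -d)
printf 'DEEPSEEK_API_KEY=operator-value\nALPACA_API_KEY=operator-alpaca\n' > "$merge_demo/gateway.systemd.env"
printf 'ALPACA_API_KEY=default\nSEATS_AERO_API_KEY=default\n' > "$merge_demo/defaults.env"
merge_env_file "$merge_demo/gateway.systemd.env" "$merge_demo/defaults.env"
cat "$merge_demo/gateway.systemd.env"
```

Writing to a temporary file in the same directory and renaming it means a reader never sees a half-written file, and the backup is taken before anything is modified, so a failed merge still leaves a recoverable copy.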

Environment

  • OpenClaw version: 2026.4.29 → 2026.5.2 (build a448042)
  • Node: v22.22.0
  • OS: WSL2 Ubuntu on Windows 11 Pro 26200
  • Install: npm global at /home/micha/.npm-global/lib/node_modules/openclaw
  • Service: systemd user unit (openclaw-gateway.service)
