[Bug] Upgrade: entry.js hardcode + non-atomic service restart + TOKEN plist pollution #66038

@octaviojuli

Problem Summary

Three structural issues in the upgrade/service lifecycle that cause recurring failures on Homebrew + LaunchAgent installs.


Issue 1: update-cli still hardcodes dist/entry.js while doctor expects dist/index.js

File: dist/update-cli-*.js (found in 2026.4.12 at dist/update-cli-C4j9u3np.js)

Lines ~820 and ~1055 both hardcode:

const entryPath = path.join(verifiedPackageRoot, "dist", "entry.js");

However, openclaw doctor warns:

Gateway service entrypoint does not match the current install.
(/opt/homebrew/lib/node_modules/openclaw/dist/entry.js -> /opt/homebrew/lib/node_modules/openclaw/dist/index.js)

The update pipeline and the service audit do not share a single source of truth for the gateway entrypoint. As a result, doctor reports a mismatch after every upgrade, and the service can fail to start when launchd reloads a plist that points to a stale entry.
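
To illustrate what a shared source of truth could look like, here is a minimal sketch. The names resolveGatewayEntrypoint, entrypointMatches and GATEWAY_ENTRY_FILE are hypothetical and do not come from the OpenClaw codebase. It assumes dist/index.js is the canonical entry, as the doctor output suggests.

```javascript
// Sketch only: all names below are hypothetical, not OpenClaw identifiers.
const path = require("node:path");

// The one place that names the gateway entry file (assumed to be index.js).
const GATEWAY_ENTRY_FILE = "index.js";

// Used by update, install and doctor alike, so they cannot disagree.
function resolveGatewayEntrypoint(packageRoot) {
  return path.join(packageRoot, "dist", GATEWAY_ENTRY_FILE);
}

// Audit helper: does the path in the service definition match the canonical one?
function entrypointMatches(packageRoot, serviceEntryPath) {
  return (
    path.resolve(serviceEntryPath) ===
    path.resolve(resolveGatewayEntrypoint(packageRoot))
  );
}
```

With something like this, the two hardcoded path.join(..., "dist", "entry.js") call sites in update-cli would become resolveGatewayEntrypoint(verifiedPackageRoot), and doctor would compare against the same function.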


Issue 2: Service restart is non-atomic ("tear down the old first, then install the new")

Upgrade sequence in dist/update-cli-*.js (~lines 977-996):

  1. Upgrade succeeds (npm/pnpm install new version)
  2. SIGTERM sent to old gateway → process shuts down
  3. refreshGatewayServiceEnv() / runRestartScript() / runDaemonRestart() attempt to reload LaunchAgent

If step 3 fails or races, the system is left in a half-broken state:

  • Old gateway is already killed
  • LaunchAgent is not restored to loaded/running
  • User sees "gateway unreachable" with no automatic recovery

There is no transaction-style rollback or "confirm new service is healthy before killing old one" logic.
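
A rough sketch of the rollback variant, to make the idea concrete. Every function on the ops object is a hypothetical hook (snapshotService, stopOldGateway, writeNewService, reloadService, isHealthy, restoreService). None of them is an existing OpenClaw API, and the retry counts are arbitrary. It only covers the service definition and does not roll back the package install itself.

```javascript
// Sketch only: `ops` holds hypothetical hooks supplied by the caller.
async function restartWithRollback(ops, { attempts = 5, delayMs = 500 } = {}) {
  // Keep a copy of the working service definition before touching anything.
  const previous = await ops.snapshotService();
  await ops.stopOldGateway();
  try {
    await ops.writeNewService();
    await ops.reloadService();
    // Only declare success once the new gateway answers a health probe.
    for (let i = 0; i < attempts; i++) {
      if (await ops.isHealthy()) return { ok: true, rolledBack: false };
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    throw new Error("gateway did not become healthy");
  } catch (err) {
    // Put the previous definition back and reload it, so the user is not
    // left with a stopped gateway and an unloaded LaunchAgent.
    await ops.restoreService(previous);
    await ops.reloadService();
    return { ok: false, rolledBack: true, reason: err.message };
  }
}
```

If the rollback reload itself throws, the error propagates to the caller, which is probably the right moment to print manual recovery steps.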


Issue 3: LaunchAgent generator re-embeds OPENCLAW_GATEWAY_TOKEN into plist, conflicting with doctor audit

After manually removing OPENCLAW_GATEWAY_TOKEN from ~/Library/LaunchAgents/ai.openclaw.gateway.plist, running any official install/refresh command (openclaw gateway install, openclaw update, etc.) writes it back into the plist.

openclaw doctor then reports:

Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled.

The installer writes the token and the auditor flags it, which is directly contradictory. Every service regeneration pollutes the plist again.
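
One way to keep the two sides from drifting apart is to drive both from the same list of keys. The sketch below assumes the doctor's policy wins (no token in the plist), which is only one of the two possible decisions. SECRET_ENV_KEYS, buildServiceEnv and auditServiceEnv are hypothetical names, not OpenClaw functions.

```javascript
// Sketch only: hypothetical names; assumes tokens must NOT live in the plist.
const SECRET_ENV_KEYS = new Set(["OPENCLAW_GATEWAY_TOKEN"]);

// Generator side: drop secret keys before the env is written to the plist.
function buildServiceEnv(env) {
  return Object.fromEntries(
    Object.entries(env).filter(([key]) => !SECRET_ENV_KEYS.has(key))
  );
}

// Auditor side: list any secret keys found, using the same key set.
function auditServiceEnv(env) {
  return Object.keys(env).filter((key) => SECRET_ENV_KEYS.has(key));
}
```

Because both functions read SECRET_ENV_KEYS, an env produced by buildServiceEnv always passes auditServiceEnv, so a freshly generated plist can no longer trigger the "should be reinstalled" warning.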


Environment

  • macOS Darwin arm64
  • Homebrew install (/opt/homebrew)
  • LaunchAgent managed gateway
  • OpenClaw 2026.4.12 (also present in 2026.4.11 and earlier)
  • Node.js 22.22.0 via Homebrew

Suggested Fixes

  1. entry.js → index.js: Consolidate all update/doctor/install code paths to use dist/index.js as the single canonical entrypoint. Remove the entry.js hardcode entirely.

  2. Atomic service restart: Implement a "start new before killing old" strategy, or add a rollback mechanism if the LaunchAgent reload fails. Add a health-check probe after restart before declaring success.

  3. TOKEN injection alignment: Decide whether tokens should live in the plist env or the config file — then make the installer and auditor consistent. Doctor's "should be reinstalled" message should reflect the actual installer behavior, not an idealized state.
