Commit 8aa7b7a

Tolerate corrupt plugins during update (#77706)

* fix(update): tolerate corrupt plugin state
* fix(update): preserve corrupt plugin proof state
* fix(update): narrow corrupt plugin warnings

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>

1 parent d94e7f5 commit 8aa7b7a

19 files changed

Lines changed: 504 additions & 107 deletions

.github/workflows/openclaw-release-checks.yml

Lines changed: 1 addition & 1 deletion
@@ -595,7 +595,7 @@ jobs:
 artifact_name: ${{ needs.prepare_release_package.outputs.artifact_name }}
 package_sha256: ${{ needs.prepare_release_package.outputs.package_sha256 }}
 suite_profile: custom
-docker_lanes: doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update
+docker_lanes: doctor-switch update-channel-switch update-corrupt-plugin upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update
 published_upgrade_survivor_baselines: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'last-stable-4 2026.4.23 2026.5.2 2026.4.15' || '' }}
 published_upgrade_survivor_scenarios: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'reported-issues' || '' }}
 telegram_mode: mock-openai

.github/workflows/package-acceptance.yml

Lines changed: 2 additions & 2 deletions
@@ -386,10 +386,10 @@ jobs:
   docker_lanes="npm-onboard-channel-agent gateway-network config-reload"
   ;;
 package)
-  docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update"
+  docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch update-corrupt-plugin upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update"
   ;;
 product)
-  docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins plugin-update mcp-channels cron-mcp-cleanup openai-web-search-minimal openwebui"
+  docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch update-corrupt-plugin upgrade-survivor published-upgrade-survivor update-restart-auth plugins plugin-update mcp-channels cron-mcp-cleanup openai-web-search-minimal openwebui"
   include_openwebui=true
   ;;
 full)

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -245,7 +245,7 @@ Docs: https://docs.openclaw.ai
 - Doctor/sessions: clear auto-created stale session routing state from the sessions store when `doctor --fix` sees plugin-owned model/runtime/auth/session bindings outside the current configured route, while leaving explicit user model choices for manual review. Refs #68615.
 - CLI/sessions: prune old unreferenced transcript, compaction checkpoint, and trajectory artifacts during normal `sessions cleanup`, so gateway restart or crash orphans do not accumulate indefinitely outside `sessions.json`. Fixes #77608. Thanks @slideshow-dingo.
 - CLI/sessions: cap `openclaw sessions` output to the newest 100 rows by default and add `--limit <n|all>` plus JSON pagination metadata, so repeated machine polling of large session stores cannot fan out into unbounded per-row enrichment/output work. Fixes #77500. Thanks @Kaotic3.
-- CLI/update: disable and skip plugins that fail package-update plugin sync, so a broken npm/ClawHub/git/marketplace plugin cannot turn a successful OpenClaw package update into a failed update result. Thanks @vincentkoc.
+- CLI/update: report corrupt or unloadable managed plugins as post-update warnings instead of disabling them or turning a successful OpenClaw package update into a failed update result. Thanks @vincentkoc and @Patrick-Erichsen.
 - CLI/update: use an absolute POSIX npm script shell during package-manager updates, so restricted PATH environments can still run dependency lifecycle scripts while updating from `--tag main`. Fixes #77530. Thanks @PeterTremonti.
 - CLI/update: make package-update follow-up processes write completion results and exit explicitly, so Windows packaged upgrades do not hang after the new package finishes post-core plugin work. Thanks @vincentkoc.
 - CLI/update: stage pnpm-detected npm-layout global package updates through a clean npm prefix swap, keep plugin install runtime imports behind a stable alias, and ship legacy install-runtime aliases back to `2026.3.22`, preventing stale overlay chunks from breaking plugin post-update sync. Thanks @vincentkoc.

docs/cli/update.md

Lines changed: 4 additions & 3 deletions
@@ -38,8 +38,9 @@ openclaw --update
 - `--tag <dist-tag|version|spec>`: override the package target for this update only. For package installs, `main` maps to `github:openclaw/openclaw#main`.
 - `--dry-run`: preview planned update actions (channel/tag/target/restart flow) without writing config, installing, syncing plugins, or restarting.
 - `--json`: print machine-readable `UpdateRunResult` JSON, including
-  `postUpdate.plugins.integrityDrifts` when npm plugin artifact drift is
-  detected during post-update plugin sync.
+  `postUpdate.plugins.warnings` when corrupt or unloadable managed plugins need
+  repair after the core update succeeds, and `postUpdate.plugins.integrityDrifts`
+  when npm plugin artifact drift is detected during post-update plugin sync.
 - `--timeout <seconds>`: per-step timeout (default is 1800s).
 - `--yes`: skip confirmation prompts (for example downgrade confirmation).
 
@@ -177,7 +178,7 @@ If an exact pinned npm plugin update resolves to an artifact whose integrity dif
 </Warning>
 
 <Note>
-Post-update plugin sync failures fail the update result and stop restart follow-up work. Fix the plugin install or update error, then rerun `openclaw update`.
+Post-update plugin sync failures that are scoped to a managed plugin are reported as warnings after the core update succeeds. The JSON result keeps the top-level update `status: "ok"` and reports `postUpdate.plugins.status: "warning"` with `openclaw doctor --fix` and `openclaw plugins inspect <id> --runtime --json` guidance. Unexpected updater or sync exceptions still fail the update result. Fix the plugin install or update error, then rerun `openclaw doctor --fix` or `openclaw update`.
 
 When the updated Gateway starts, plugin loading is verify-only: startup does not run package managers or mutate dependency trees. Package-manager `update.run` restarts bypass the normal idle deferral and restart cooldown after the package tree has been swapped, so the old process cannot keep lazy-loading removed chunks.
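
The JSON contract described in the updated note could be consumed roughly as follows. This is a minimal sketch, not code from the commit: `classifyUpdateResult`, its outcome labels, and the sample payload are hypothetical. Only `status`, `reason`, `postUpdate.plugins.status`, and `postUpdate.plugins.warnings[].pluginId` are taken from the diff.

```javascript
// Hypothetical helper: classify a parsed `openclaw update --json` payload.
// The outcome labels ("failed", "needs-plugin-repair", "clean") are invented
// for this sketch; the field names read here are the ones the diff mentions.
function classifyUpdateResult(payload) {
  // Unexpected updater or sync exceptions still surface as a non-ok status.
  if (payload.status !== "ok") {
    return { outcome: "failed", reason: payload.reason ?? null, pluginIds: [] };
  }
  const plugins = payload.postUpdate?.plugins;
  // Core update succeeded, but one or more managed plugins need repair.
  if (plugins?.status === "warning") {
    const pluginIds = (plugins.warnings ?? [])
      .map((entry) => entry?.pluginId)
      .filter((id) => typeof id === "string");
    return { outcome: "needs-plugin-repair", reason: null, pluginIds };
  }
  return { outcome: "clean", reason: null, pluginIds: [] };
}

// Invented sample payload, shaped after the fields named in the docs change.
const sample = {
  status: "ok",
  postUpdate: {
    plugins: {
      status: "warning",
      warnings: [{ pluginId: "demo-corrupt-plugin" }],
    },
  },
};

const result = classifyUpdateResult(sample);
console.log(result.outcome, result.pluginIds);
```

A caller that sees the repair outcome would then follow the documented guidance: run `openclaw doctor --fix`, or `openclaw plugins inspect <id> --runtime --json` for each listed plugin id.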

docs/help/testing-updates-plugins.md

Lines changed: 5 additions & 5 deletions
@@ -172,7 +172,7 @@ targets the shipped npm package instead.
 Release checks call Package Acceptance with the package/update/restart/plugin set:
 
 ```text
-doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update
+doctor-switch update-channel-switch update-corrupt-plugin upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update
 ```
 
 When release soak is enabled, they also pass:
@@ -183,10 +183,10 @@ published_upgrade_survivor_scenarios=reported-issues
 telegram_mode=mock-openai
 ```
 
-This keeps package migration, update channel switching, stale plugin dependency
-cleanup, offline plugin coverage, plugin update behavior, and Telegram package
-QA on the same resolved artifact without making the default release package gate
-walk every published release.
+This keeps package migration, update channel switching, corrupt managed-plugin
+tolerance, stale plugin dependency cleanup, offline plugin coverage, plugin
+update behavior, and Telegram package QA on the same resolved artifact without
+making the default release package gate walk every published release.
 
 `last-stable-4` resolves to the four latest stable npm-published OpenClaw
 releases. Release package acceptance pins `2026.4.23` as the first plugin-update

package.json

Lines changed: 1 addition & 0 deletions
@@ -1570,6 +1570,7 @@
 "test:docker:session-runtime-context": "bash scripts/e2e/session-runtime-context-docker.sh",
 "test:docker:timings": "node scripts/docker-e2e-timings.mjs",
 "test:docker:update-channel-switch": "bash scripts/e2e/update-channel-switch-docker.sh",
+"test:docker:update-corrupt-plugin": "bash scripts/e2e/update-corrupt-plugin-docker.sh",
 "test:docker:update-migration": "env OPENCLAW_UPGRADE_SURVIVOR_PUBLISHED_BASELINE=1 OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC=${OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC:-openclaw@2026.4.23} OPENCLAW_UPGRADE_SURVIVOR_SCENARIO=${OPENCLAW_UPGRADE_SURVIVOR_SCENARIO:-plugin-deps-cleanup} bash scripts/e2e/upgrade-survivor-docker.sh",
 "test:docker:update-restart-auth": "env OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE=auto-auth OPENCLAW_UPGRADE_SURVIVOR_DOCKER_RUN_TIMEOUT=${OPENCLAW_UPGRADE_SURVIVOR_DOCKER_RUN_TIMEOUT:-1500s} bash scripts/e2e/upgrade-survivor-docker.sh",
 "test:docker:upgrade-survivor": "bash scripts/e2e/upgrade-survivor-docker.sh",

scripts/e2e/lib/plugin-update/corrupt-update-scenario.sh

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+source scripts/lib/openclaw-e2e-instance.sh
+source scripts/e2e/lib/plugins/fixtures.sh
+
+openclaw_e2e_eval_test_state_from_b64 "${OPENCLAW_TEST_STATE_SCRIPT_B64:?missing OPENCLAW_TEST_STATE_SCRIPT_B64}"
+
+export npm_config_loglevel=error
+export npm_config_fund=false
+export npm_config_audit=false
+export npm_config_prefix=/tmp/npm-prefix
+export NPM_CONFIG_PREFIX=/tmp/npm-prefix
+export PATH="/tmp/npm-prefix/bin:$PATH"
+export CI=true
+export OPENCLAW_DISABLE_BUNDLED_PLUGINS=1
+export OPENCLAW_NO_ONBOARD=1
+export OPENCLAW_NO_PROMPT=1
+
+baseline="${OPENCLAW_UPDATE_CORRUPT_PLUGIN_BASELINE:-openclaw@latest}"
+echo "Installing baseline OpenClaw package: $baseline"
+if ! npm install -g --prefix /tmp/npm-prefix --omit=optional "$baseline" >/tmp/openclaw-update-corrupt-baseline-install.log 2>&1; then
+  cat /tmp/openclaw-update-corrupt-baseline-install.log >&2 || true
+  exit 1
+fi
+
+package_root="$(openclaw_e2e_package_root /tmp/npm-prefix)"
+entry="$(openclaw_e2e_package_entrypoint "$package_root")"
+export OPENCLAW_ENTRY="$entry"
+
+npm_pack_dir="$(mktemp -d "/tmp/openclaw-corrupt-plugin-pack.XXXXXX")"
+npm_registry_dir="$(mktemp -d "/tmp/openclaw-corrupt-plugin-registry.XXXXXX")"
+pack_fixture_plugin "$npm_pack_dir" /tmp/demo-corrupt-plugin.tgz demo-corrupt-plugin 0.0.1 demo.corrupt "Demo Corrupt Plugin"
+start_npm_fixture_registry "@openclaw/demo-corrupt-plugin" "0.0.1" /tmp/demo-corrupt-plugin.tgz "$npm_registry_dir"
+
+echo "Installing managed external plugin..."
+node "$entry" plugins install "npm:@openclaw/demo-corrupt-plugin@0.0.1" >/tmp/openclaw-corrupt-plugin-install.log 2>&1
+node "$entry" plugins inspect demo-corrupt-plugin --runtime --json >/tmp/openclaw-corrupt-plugin-before.json
+unset NPM_CONFIG_REGISTRY npm_config_registry
+
+plugin_dir="$(
+  node -e '
+    const fs = require("node:fs");
+    const payload = JSON.parse(fs.readFileSync(process.argv[1], "utf8"));
+    const installPath = payload.install?.installPath ?? payload.plugin?.rootDir;
+    if (!installPath) {
+      throw new Error("missing plugin install path in inspect output");
+    }
+    process.stdout.write(installPath);
+  ' /tmp/openclaw-corrupt-plugin-before.json
+)"
+rm -f "$plugin_dir/package.json"
+if [ -f "$plugin_dir/package.json" ]; then
+  echo "Expected corrupt plugin package.json to be removed before update." >&2
+  exit 1
+fi
+
+echo "Updating OpenClaw with corrupt plugin present..."
+set +e
+node "$entry" update --channel beta --tag "${OPENCLAW_CURRENT_PACKAGE_TGZ:?missing OPENCLAW_CURRENT_PACKAGE_TGZ}" --yes --no-restart --json >/tmp/openclaw-update-corrupt-plugin.json 2>/tmp/openclaw-update-corrupt-plugin.err
+update_status=$?
+set -e
+if [ "$update_status" -ne 0 ]; then
+  if ! node scripts/e2e/lib/plugin-update/probe.mjs assert-legacy-post-update-plugin-failure /tmp/openclaw-update-corrupt-plugin.json; then
+    echo "openclaw update failed with corrupt plugin present" >&2
+    cat /tmp/openclaw-update-corrupt-plugin.err >&2 || true
+    cat /tmp/openclaw-update-corrupt-plugin.json >&2 || true
+    exit "$update_status"
+  fi
+  echo "Legacy updater reported post-update plugin failure after installing the new core; verifying updated entrypoint..."
+  set +e
+  OPENCLAW_UPDATE_POST_CORE=1 \
+    OPENCLAW_UPDATE_POST_CORE_CHANNEL=beta \
+    OPENCLAW_UPDATE_POST_CORE_RESULT_PATH=/tmp/openclaw-update-corrupt-plugin-post-core.json \
+    node "$entry" update --yes --no-restart --json >/tmp/openclaw-update-corrupt-plugin-post-core.stdout 2>/tmp/openclaw-update-corrupt-plugin-post-core.err
+  post_core_status=$?
+  set -e
+  if [ "$post_core_status" -ne 0 ]; then
+    echo "updated OpenClaw entry failed post-core plugin verification" >&2
+    cat /tmp/openclaw-update-corrupt-plugin-post-core.err >&2 || true
+    cat /tmp/openclaw-update-corrupt-plugin-post-core.stdout >&2 || true
+    cat /tmp/openclaw-update-corrupt-plugin-post-core.json >&2 || true
+    exit "$post_core_status"
+  fi
+  node scripts/e2e/lib/plugin-update/probe.mjs assert-corrupt-plugin-result /tmp/openclaw-update-corrupt-plugin-post-core.json demo-corrupt-plugin
+  exit 0
+fi
+
+node scripts/e2e/lib/plugin-update/probe.mjs assert-corrupt-update /tmp/openclaw-update-corrupt-plugin.json demo-corrupt-plugin
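
The tail of the scenario script branches on two exit codes and one probe check. The decision table can be sketched as a pure function. This is an illustration only: `nextStep`, its parameter names, and the two `fail-*` labels are made up for this example, while `assert-corrupt-update` and `assert-corrupt-plugin-result` are the probe commands the script invokes.

```javascript
// Sketch of the scenario's branching, assuming the three inputs below capture
// everything the shell script looks at. Names are hypothetical.
function nextStep({ updateExitCode, legacyFailureMatched, postCoreExitCode }) {
  // New updater: exit 0, then check status ok plus plugin warning in one JSON file.
  if (updateExitCode === 0) return "assert-corrupt-update";
  // Non-zero exit that is not the known legacy failure shape is a real failure.
  if (!legacyFailureMatched) return "fail-with-update-exit-code";
  // Legacy baseline installed the new core; the rerun with the post-core
  // environment variables must itself succeed.
  if (postCoreExitCode !== 0) return "fail-with-post-core-exit-code";
  // The post-core result file holds the plugin result directly.
  return "assert-corrupt-plugin-result";
}

console.log(nextStep({ updateExitCode: 0 }));
console.log(nextStep({ updateExitCode: 1, legacyFailureMatched: true, postCoreExitCode: 0 }));
```

The second path exists because the baseline package that starts the update may predate this commit and still report `post-update-plugins` as an error, even though the new core was installed.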

scripts/e2e/lib/plugin-update/probe.mjs

Lines changed: 76 additions & 1 deletion
@@ -112,14 +112,89 @@ function assertOutput(logPath) {
   }
 }
 
-const [command, arg] = process.argv.slice(2);
+function assertCorruptUpdate(updateJsonPath, pluginId) {
+  const payload = readJson(updateJsonPath);
+  if (payload.status !== "ok") {
+    throw new Error(`expected core update status ok, got ${JSON.stringify(payload.status)}`);
+  }
+  const plugins = payload.postUpdate?.plugins;
+  if (!plugins) {
+    throw new Error(`missing postUpdate.plugins in update output: ${JSON.stringify(payload)}`);
+  }
+  if (plugins.status !== "warning") {
+    throw new Error(
+      `expected post-update plugin status warning, got ${JSON.stringify(plugins.status)}`,
+    );
+  }
+  assertCorruptPluginDetails(plugins, pluginId);
+}
+
+function assertCorruptPluginResult(pluginJsonPath, pluginId) {
+  const plugins = readJson(pluginJsonPath);
+  if (plugins.status !== "warning") {
+    throw new Error(
+      `expected post-update plugin status warning, got ${JSON.stringify(plugins.status)}`,
+    );
+  }
+  assertCorruptPluginDetails(plugins, pluginId);
+}
+
+function assertCorruptPluginDetails(plugins, pluginId) {
+  const outcomes = plugins.npm?.outcomes ?? [];
+  const outcome = outcomes.find((entry) => entry?.pluginId === pluginId);
+  if (!outcome || outcome.status !== "error") {
+    throw new Error(
+      `expected error outcome for ${pluginId}, got ${JSON.stringify({
+        outcomes,
+        warnings: plugins.warnings ?? [],
+        sync: plugins.sync,
+        integrityDrifts: plugins.integrityDrifts ?? [],
+      })}`,
+    );
+  }
+  const warnings = plugins.warnings ?? [];
+  const warning = warnings.find((entry) => entry?.pluginId === pluginId);
+  if (!warning) {
+    throw new Error(`expected warning for ${pluginId}, got ${JSON.stringify(warnings)}`);
+  }
+  const text = JSON.stringify({ outcome, warning });
+  for (const expected of [
+    "package.json is missing",
+    "Run openclaw doctor --fix to attempt automatic repair.",
+    `Run openclaw plugins inspect ${pluginId} --runtime --json for details.`,
+  ]) {
+    if (!text.includes(expected)) {
+      throw new Error(`expected update output to include ${expected}: ${text}`);
+    }
+  }
+}
+
+function assertLegacyPostUpdatePluginFailure(updateJsonPath) {
+  const payload = readJson(updateJsonPath);
+  if (payload.status !== "error" || payload.reason !== "post-update-plugins") {
+    throw new Error(
+      `expected legacy post-update plugin failure, got ${JSON.stringify({
+        status: payload.status,
+        reason: payload.reason,
+      })}`,
+    );
+  }
+  if (!payload.after?.version) {
+    throw new Error(`expected core update to install a new version: ${JSON.stringify(payload)}`);
+  }
+}
+
+const [command, arg, arg2] = process.argv.slice(2);
 const commands = {
   "legacy-compat": () => console.log(legacyPackageAcceptanceCompat(arg || "") ? "1" : "0"),
   seed: seedInstallState,
   "wait-registry": waitRegistry,
   snapshot: () => process.stdout.write(JSON.stringify(pluginRecordSnapshot(), null, 2)),
   "assert-snapshot": () => assertSnapshot(arg),
   "assert-output": () => assertOutput(arg),
+  "assert-corrupt-update": () => assertCorruptUpdate(arg, arg2),
+  "assert-corrupt-plugin-result": () => assertCorruptPluginResult(arg, arg2),
+  "assert-legacy-post-update-plugin-failure": () => assertLegacyPostUpdatePluginFailure(arg),
 };
 const run = commands[command];
 await (
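
To make the substring check in `assertCorruptPluginDetails` concrete, here is a small standalone sketch of the same idea run against an invented payload. `missingGuidance` and `expectedGuidance` are hypothetical names, and the `message` fields in the sample are assumptions: the probe only stringifies the matching outcome and warning and searches the result, so the diff does not show which field carries the text.

```javascript
// The three substrings the probe requires somewhere in the serialized
// outcome + warning pair (copied from the diff).
function expectedGuidance(pluginId) {
  return [
    "package.json is missing",
    "Run openclaw doctor --fix to attempt automatic repair.",
    `Run openclaw plugins inspect ${pluginId} --runtime --json for details.`,
  ];
}

// Hypothetical variant of the probe check: returns the missing substrings
// instead of throwing, which is easier to inspect in an example.
function missingGuidance(plugins, pluginId) {
  const outcome = (plugins.npm?.outcomes ?? []).find((entry) => entry?.pluginId === pluginId);
  const warning = (plugins.warnings ?? []).find((entry) => entry?.pluginId === pluginId);
  const text = JSON.stringify({ outcome, warning });
  return expectedGuidance(pluginId).filter((expected) => !text.includes(expected));
}

// Invented payload; the `message` field name is an assumption.
const samplePlugins = {
  status: "warning",
  npm: {
    outcomes: [
      { pluginId: "demo-corrupt-plugin", status: "error", message: "package.json is missing" },
    ],
  },
  warnings: [
    {
      pluginId: "demo-corrupt-plugin",
      message:
        "Run openclaw doctor --fix to attempt automatic repair. " +
        "Run openclaw plugins inspect demo-corrupt-plugin --runtime --json for details.",
    },
  ],
};

console.log(missingGuidance(samplePlugins, "demo-corrupt-plugin"));
```

Because the check works on the serialized JSON, it passes regardless of whether the guidance lives in the outcome, the warning, or is split across both.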

scripts/e2e/update-corrupt-plugin-docker.sh

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+#!/usr/bin/env bash
+# Verifies `openclaw update` succeeds when a managed external plugin is corrupt.
+# The lane installs an older published OpenClaw package, corrupts an npm-managed
+# plugin payload, then updates to the prepared package artifact.
+set -euo pipefail
+
+ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+source "$ROOT_DIR/scripts/lib/docker-e2e-image.sh"
+source "$ROOT_DIR/scripts/lib/docker-e2e-package.sh"
+
+IMAGE_NAME="$(docker_e2e_resolve_image "openclaw-update-corrupt-plugin-e2e" OPENCLAW_UPDATE_CORRUPT_PLUGIN_E2E_IMAGE)"
+SKIP_BUILD="${OPENCLAW_UPDATE_CORRUPT_PLUGIN_E2E_SKIP_BUILD:-0}"
+PACKAGE_TGZ="$(docker_e2e_prepare_package_tgz update-corrupt-plugin "${OPENCLAW_CURRENT_PACKAGE_TGZ:-}")"
+# Bare lanes mount the package artifact instead of baking app sources into the image.
+docker_e2e_package_mount_args "$PACKAGE_TGZ"
+
+docker_e2e_build_or_reuse "$IMAGE_NAME" update-corrupt-plugin "$ROOT_DIR/scripts/e2e/Dockerfile" "$ROOT_DIR" "bare" "$SKIP_BUILD"
+OPENCLAW_TEST_STATE_SCRIPT_B64="$(docker_e2e_test_state_shell_b64 update-corrupt-plugin empty)"
+
+echo "Running corrupt plugin update tolerance E2E..."
+docker_e2e_run_with_harness \
+  -e COREPACK_ENABLE_DOWNLOAD_PROMPT=0 \
+  -e OPENCLAW_SKIP_CHANNELS=1 \
+  -e OPENCLAW_SKIP_PROVIDERS=1 \
+  -e "OPENCLAW_TEST_STATE_SCRIPT_B64=$OPENCLAW_TEST_STATE_SCRIPT_B64" \
+  "${DOCKER_E2E_PACKAGE_ARGS[@]}" \
+  "$IMAGE_NAME" \
+  bash scripts/e2e/lib/plugin-update/corrupt-update-scenario.sh
+
+echo "Corrupt plugin update tolerance Docker E2E passed."

scripts/lib/docker-e2e-scenarios.mjs

Lines changed: 9 additions & 0 deletions
@@ -273,6 +273,15 @@ export const mainLanes = [
   npmLane("plugin-update", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:plugin-update", {
     stateScenario: "empty",
   }),
+  npmLane(
+    "update-corrupt-plugin",
+    "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:update-corrupt-plugin",
+    {
+      stateScenario: "empty",
+      timeoutMs: 30 * 60 * 1000,
+      weight: 3,
+    },
+  ),
   npmLane(
     "plugin-lifecycle-matrix",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:plugin-lifecycle-matrix",

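For readers unfamiliar with the lane registry, the new `update-corrupt-plugin` entry in `scripts/lib/docker-e2e-scenarios.mjs` can be read as a plain record. The `npmLane` below is a hypothetical stand-in written for this example; the real helper's return shape and the meaning of `weight` are not shown in the diff.

```javascript
// Hypothetical stand-in for npmLane: assumes the helper simply merges the
// lane name, command, and options into one record.
function npmLane(name, command, options = {}) {
  return { name, command, ...options };
}

const lane = npmLane(
  "update-corrupt-plugin",
  "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:update-corrupt-plugin",
  { stateScenario: "empty", timeoutMs: 30 * 60 * 1000, weight: 3 },
);

// 30 * 60 * 1000 ms works out to a 30-minute budget for the lane.
const timeoutMinutes = lane.timeoutMs / (60 * 1000);
console.log(`${lane.name}: ${timeoutMinutes} min`);
```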