Skip to content

[AI-assisted] fix(update): clear stale plugin refs after failed updates#81512

Merged
steipete merged 2 commits into
openclaw:mainfrom
IWhatsskill:fix-plugin-disable-stale-config
May 13, 2026
Merged

[AI-assisted] fix(update): clear stale plugin refs after failed updates#81512
steipete merged 2 commits into
openclaw:mainfrom
IWhatsskill:fix-plugin-disable-stale-config

Conversation

@IWhatsskill

@IWhatsskill IWhatsskill commented May 13, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: plugin update failures with disableOnFailure only disabled plugins.entries.<id> and could leave stale plugins.allow, plugins.deny, and selected plugin slots pointing at the disabled plugin.
  • Why it matters: after a failed external plugin update, operators could see a half-disabled config where the context-engine slot still referenced a plugin that OpenClaw had just disabled.
  • What changed: failed-update disable now also removes the disabled plugin from allow/deny lists and resets selected memory / contextEngine slots to their defaults while preserving the install record for later repair.
  • What did NOT change: no plugin install, package scanning, fallback-channel selection, or uninstall behavior changed.

Change Type

  • Bug fix

Scope

  • API / contracts
  • UI / DX

Linked Issue/PR

Real behavior proof

  • Behavior or issue addressed: failed plugin updates that disable a plugin now leave config policy and selected slots consistent with the disabled entry.
  • Real environment tested: local OpenClaw repo checkout on Windows, Node 24.14, using the real updateNpmInstalledPlugins production helper with a synthetic local config and no external credentials.
  • Exact steps or command run after this patch:
node_modules\.bin\tsx.cmd -e "import { updateNpmInstalledPlugins } from './src/plugins/update.ts'; void (async () => { const result = await updateNpmInstalledPlugins({ config: { plugins: { allow: ['demo', 'keep'], deny: ['demo', 'blocked'], slots: { memory: 'demo', contextEngine: 'demo' }, entries: { demo: { enabled: true } }, installs: { demo: { source: 'npm', spec: 'not a valid spec?', installPath: '/tmp/demo' } } } }, pluginIds: ['demo'], disableOnFailure: true, logger: { warn: () => undefined } }); console.log(JSON.stringify({ changed: result.changed, entry: result.config.plugins?.entries?.demo, allow: result.config.plugins?.allow, deny: result.config.plugins?.deny, slots: result.config.plugins?.slots, installPreserved: Boolean(result.config.plugins?.installs?.demo), outcome: result.outcomes[0]?.status }, null, 2)); })();"
  • Evidence after fix:
{
  "changed": true,
  "entry": {
    "enabled": false
  },
  "allow": [
    "keep"
  ],
  "deny": [
    "blocked"
  ],
  "slots": {
    "memory": "memory-core",
    "contextEngine": "legacy"
  },
  "installPreserved": true,
  "outcome": "skipped"
}
  • Observed result after fix: the failed-update disable path disables the plugin entry, removes stale allow/deny references, resets both selected slots to defaults, preserves the install record, and records the update as skipped.
  • What was not tested: a live beta update against the public npm registry and the specific third-party lossless-claw package from v2026.5.12-beta.5 plugin update disables lossless-claw after @beta fallback and leaves stale contextEngine config #81499.
  • Before evidence: source inspection on current main showed disablePluginConfigEntry() only set entries.<id>.enabled = false and did not touch allow/deny or slots.

Root Cause

  • Root cause: the failed-update disable helper only updated the plugin entry, unlike explicit uninstall config cleanup which already removes policy references and resets selected slots.
  • Missing detection / guardrail: no regression test covered disableOnFailure with a plugin selected in policy lists and plugin slots.
  • Contributing context: plugin update failure handling preserves install records so repair/reinstall remains possible, but dependent runtime config still needs to stop selecting the disabled plugin.

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
  • Target test or file: src/plugins/update.test.ts
  • Scenario the test should lock in: an npm plugin update failure with disableOnFailure clears stale allow/deny references and resets selected memory/context-engine slots while preserving the install record.
  • Why this is the smallest reliable guardrail: the bug is pure config mutation in the update failure path, so a focused plugin update test covers the contract without registry access.
  • Existing related coverage: existing disable coverage checked the entry and install record only, not policy lists or slots.

User-visible / Behavior Changes

After an external plugin update failure disables a plugin, plugin policy and selected plugin slots are kept consistent with that disabled state. A context-engine plugin disabled during update now leaves the slot reset to legacy instead of continuing to point at the disabled plugin.

Diagram

Before:
plugin update failure -> entries.demo.enabled=false -> allow/deny/slots may still point at demo

After:
plugin update failure -> entries.demo.enabled=false -> allow/deny remove demo -> slots reset to defaults

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • Risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Windows
  • Runtime/container: local OpenClaw checkout, Node 24.14
  • Model/provider: N/A
  • Integration/channel: plugin update config path
  • Relevant config: synthetic plugin config with demo selected in allow, deny, slots.memory, and slots.contextEngine

Steps

  1. Run the local after-fix tsx proof above.
  2. Run the focused regression test for stale policy/slot cleanup.
  3. Run the adjacent existing disable regression test.
  4. Run formatting and diff checks.

Expected

  • Failed-update disable marks the plugin entry disabled.
  • Stale allow/deny references are removed.
  • Selected memory/context-engine slots reset to defaults.
  • The install record remains available for later repair.

Actual

  • Local proof produced entry.enabled=false, allow=["keep"], deny=["blocked"], slots.memory="memory-core", slots.contextEngine="legacy", installPreserved=true, and outcome="skipped".
  • Focused stale policy/slot regression test passed.
  • Existing disable regression test passed.
  • git diff --check and oxfmt passed.

Evidence

  • Trace/log snippets

Human Verification

  • Verified scenarios: failed-update disable cleans policy lists and resets memory / contextEngine slots while preserving the install record.
  • Edge cases checked: existing entry config is still preserved by the adjacent disable test.
  • What you did not verify: live third-party package update or public npm registry fallback behavior.

Review Conversations

  • Addressed bot review conversations: N/A
  • Unresolved conversations needing reviewer or maintainer judgment: N/A

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • Exact upgrade steps: N/A

Risks and Mitigations

  • Risk: resetting a selected slot during failed-update disable changes the active fallback sooner than before.
    • Mitigation: this only happens after OpenClaw has already disabled the plugin due to an update failure; resetting to the existing default prevents a stale selection from pointing at the disabled plugin.

@openclaw-barnacle openclaw-barnacle Bot added size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 13, 2026
@IWhatsskill IWhatsskill marked this pull request as ready for review May 13, 2026 19:16
@clawsweeper

clawsweeper Bot commented May 13, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge.

Summary
The branch updates failed plugin-update disablement to remove the disabled plugin from allow/deny lists, reset memory/contextEngine slots to defaults, and add a focused regression test.

Reproducibility: yes. Source inspection on current main shows disableOnFailure only disables plugins.entries., and the PR proof/test scenario gives a concrete stale allow/deny/slot config state.

Real behavior proof
Sufficient (live_output): The PR body includes copied after-fix output from a real local OpenClaw checkout invoking the production update helper, which is sufficient for this non-visual config mutation.

Next step before merge
No repair lane is needed because the open PR already contains the focused implementation and regression coverage; it needs normal maintainer review and CI.

Security
Cleared: Cleared: the diff only changes plugin config mutation and focused tests, with no new dependency, workflow, network, credential, package-resolution, or code-execution surface.

Review details

Best possible solution:

Land the focused updater cleanup after normal CI, keeping install, package scanning, fallback selection, and uninstall behavior unchanged.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main shows disableOnFailure only disables plugins.entries., and the PR proof/test scenario gives a concrete stale allow/deny/slot config state.

Is this the best way to solve the issue?

Yes. Cleaning dependent config inside disablePluginConfigEntry is the narrow maintainable fix and aligns failed-update disablement with the existing uninstall cleanup contract while preserving install records.

What I checked:

  • Current main only disables the entry: On current main, disablePluginConfigEntry only writes plugins.entries[pluginId].enabled = false and does not touch allow, deny, or slots. (src/plugins/update.ts:763, 26da4edbe150)
  • Failure path uses the helper under review: recordFailure calls disablePluginConfigEntry when disableOnFailure is set, so failed update cleanup belongs on this path. (src/plugins/update.ts:870, 26da4edbe150)
  • Uninstall cleanup supports the same config shape: removePluginFromConfig already removes allow/deny references and resets selected memory/contextEngine slots when a plugin is removed, which supports the PR's cleanup contract for failed-update disablement while preserving installs. (src/plugins/uninstall.ts:374, 26da4edbe150)
  • Slot defaults are canonical: defaultSlotIdForKey defines memory as memory-core and contextEngine as legacy, matching the PR's reset values. (src/plugins/slots.ts:17, 26da4edbe150)
  • Existing test coverage missed policy and slot refs: The existing disableOnFailure regression preserves the entry config and install record but does not assert allow/deny or slot cleanup. (src/plugins/update.test.ts:1681, 26da4edbe150)
  • PR diff adds the focused cleanup and regression test: The provided PR patch imports defaultSlotIdForKey, adds helpers to clear allow/deny and reset selected slots in disablePluginConfigEntry, and adds a regression test covering stale allow/deny/slots with install preservation. (src/plugins/update.ts:761, 5373bc2b2b44)

Likely related people:

  • Altay: Available current-main blame for disablePluginConfigEntry and defaultSlotIdForKey points to this boundary commit on the plugin update/slot path. (role: recent area contributor; confidence: low; commits: a40499b21a2b; files: src/plugins/update.ts, src/plugins/slots.ts)
  • Vincent Koc: Recent uninstall cleanup work owns the adjacent config cleanup pattern that removes allow/deny refs and resets memory/contextEngine slots. (role: adjacent area contributor; confidence: medium; commits: 2a67a7f65e26; files: src/plugins/uninstall.ts)

Remaining risk / open question:

  • This read-only review did not execute the focused Vitest test locally; CI or a maintainer run should cover it before merge.
  • The supplied real behavior proof uses a synthetic local config rather than the live beta update and third-party package, but the changed path is pure config mutation after an update failure.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 26da4edbe150.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@steipete steipete force-pushed the fix-plugin-disable-stale-config branch from 5373bc2 to cabef33 Compare May 13, 2026 23:21
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@steipete steipete merged commit 8046b5e into openclaw:main May 13, 2026
114 checks passed
@steipete

Copy link
Copy Markdown
Contributor

Landed via rebase onto main.

  • Gate: pnpm test src/plugins/update.test.ts -- --reporter=verbose
  • Gate: pnpm test:changed
  • Gate: pnpm check:changed on Blacksmith Testbox tbx_01krht53kxdha4gxy129b9jkde
  • Gate: git diff --check
  • GitHub checks: green on cabef33
  • Fix commit: 5214f16
  • Changelog commit: 8046b5e

Thanks @JARVIS-Glasses!

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

proof: supplied External PR includes structured after-fix real behavior proof. size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants