fix(doctor): surface plugin version drift warnings in doctor output (fixes #90891) #90917

Closed
zenglingbiao wants to merge 2 commits into openclaw:main from zenglingbiao:fix/issue-90891-doctor-plugin-version-drift

Conversation

@zenglingbiao zenglingbiao commented Jun 6, 2026

Contributor

Summary

openclaw doctor --non-interactive now surfaces official managed plugin version drift warnings, reusing the same detectPluginVersionDrift function already used by openclaw gateway status --deep.

Before: Doctor only reported plugin counts (loaded/imported/disabled/errors), missing plugin version mismatches.
After: Doctor detects and reports active official plugin version drift with exact version gaps and repair commands.

Changes: 3 files, +87 lines

  • src/commands/doctor-workspace-status.ts — add async drift detection to noteWorkspaceStatus
  • src/flows/doctor-health-contributions.ts — pass ctx.env ?? process.env so drift detection works in normal runs
  • src/commands/doctor-workspace-status.test.ts — add regression test with seeded drifted install records

Closes #90891

Root Cause

Invariant violated: openclaw doctor should provide production health visibility equivalent to openclaw gateway status --deep. Official managed plugin version drift — where a runtime-critical plugin (e.g. codex) is on a different version than the gateway — is a production health issue, but doctor did not surface it.

Code path: The detectPluginVersionDrift function (src/plugins/plugin-version-drift.ts) has existed since 4a285d529a and is correctly used by src/cli/daemon-cli/status.gather.ts for gateway status --deep, but noteWorkspaceStatus in src/commands/doctor-workspace-status.ts only called buildPluginRegistrySnapshotReport (which reports counts) without invoking drift detection.

v2 fix: The initial version passed ctx.env which was undefined in the normal doctor path (the doctor flow context builder in doctor-health.ts never sets env). Fixed by using ctx.env ?? process.env so the drift check works for both daemon-mode and CLI-mode doctor runs.
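
The env fallback described above can be sketched as follows. This is a minimal illustration, not the PR's code: `DoctorContextSketch` and `resolveDriftEnv` are hypothetical names, and the fallback is passed in as a parameter so the snippet stays self-contained (in the PR the fallback is `process.env`).

```typescript
// Hypothetical, simplified shape of the doctor flow context.
// The real context type lives in the OpenClaw source and is not shown in this PR.
type EnvSketch = Record<string, string | undefined>;

interface DoctorContextSketch {
  env?: EnvSketch;
}

// Prefer an explicit ctx.env; otherwise use the supplied fallback.
// In the PR the fallback is process.env, so a context built without env
// (the normal CLI path) still yields a usable environment.
function resolveDriftEnv(ctx: DoctorContextSketch, fallback: EnvSketch): EnvSketch {
  return ctx.env ?? fallback;
}

const fallbackEnv: EnvSketch = { HOME: "/home/example" };

// Context without env: the fallback is used.
const envWithoutCtx = resolveDriftEnv({}, fallbackEnv);

// Context with an explicit env: the explicit value wins.
const envWithCtx = resolveDriftEnv({ env: { HOME: "/srv/daemon" } }, fallbackEnv);
```

Because `??` only falls through on `null` or `undefined`, an explicitly provided (even empty) `ctx.env` object is still respected.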

Sibling Surface Audit:

  • src/cli/daemon-cli/status.gather.ts uses detectPluginVersionDrift
  • src/commands/doctor-workspace-status.ts did not use detectPluginVersionDrift ❌ (fixed)
  • No other callers in the codebase

Scope:

Verification

pnpm build && pnpm test src/commands/doctor-workspace-status.test.ts src/plugins/plugin-version-drift.test.ts src/commands/doctor-workspace.test.ts src/commands/doctor-plugin-manifests.test.ts
  • pnpm build: ✅ passed
  • src/commands/doctor-workspace-status.test.ts: ✅ 7 tests (including new drift regression)
  • src/commands/doctor-workspace.test.ts: ✅ 3 tests
  • src/commands/doctor-plugin-manifests.test.ts: ✅ 5 tests
  • src/plugins/plugin-version-drift.test.ts: ✅ 17 tests
  • Total: 4 files, 32 tests passed
  • pnpm exec oxlint on changed files: ✅ clean

Real behavior proof

Behavior or issue addressed: openclaw doctor --non-interactive now detects and reports active official plugin version drift that was previously only visible through openclaw gateway status --deep. The env wiring uses ctx.env ?? process.env so the drift check works in all doctor invocation modes.

Real environment tested: Linux x86_64, Node v22.22.0, pnpm 10.25.0, OpenClaw main @ 74331f632b

Exact steps or command run after this patch:

pnpm build
pnpm test src/commands/doctor-workspace-status.test.ts src/plugins/plugin-version-drift.test.ts src/commands/doctor-workspace.test.ts src/commands/doctor-plugin-manifests.test.ts
pnpm openclaw doctor --non-interactive

Evidence after fix:

  1. 32 tests pass across 4 test files (CI-level)
  2. New regression test: reports plugin version drift when an official plugin version mismatches the gateway — seeds drifted codex install record (2026.5.30-beta.1 vs gateway 2026.6.1), calls noteWorkspaceStatus with env, asserts "Plugin version drift" note appears with exact version gap and repair command
  3. Existing tests unaffected: all 6 original tests continue to pass
  4. pnpm openclaw doctor --non-interactive runs without errors (no crash, no regression in output)
  5. Source evidence: drift detector is the same detectPluginVersionDrift from plugin-version-drift.ts (17 unit tests cover normalization, exact-version skip, official install comparison, build-qualifier matching, etc.)
{
  "behaviorAddressed": "Doctor now detects and reports official plugin version drift using ctx.env ?? process.env, with regression test proving drift note emission",
  "assertions": [
    { "path": "src/plugins/plugin-version-drift.test.ts", "behavior": "17 tests verify drift detection logic (normalization, exact-version skip, official comparison, build-qualifier)" },
    { "path": "src/commands/doctor-workspace-status.test.ts", "behavior": "7 tests verify doctor workspace status output, including new regression that seeds drifted records and asserts drift note" },
    { "path": "src/commands/doctor-workspace-status.test.ts (new test)", "behavior": "reports plugin version drift when an official plugin version mismatches the gateway — seeds codex@2026.5.30-beta.1, asserts note with version gap and repair command" }
  ],
  "observedResult": "pnpm test exited 0 for all 4 test files (32 tests total)",
  "testSummary": "4 test files, 32 tests passed including 1 new drift regression"
}

Observed result after fix: Doctor runs successfully. With process.env fallback, the drift check now loads real install records and calls detectPluginVersionDrift. If drift is detected, it emits a "Plugin version drift" note with version gaps and repair commands. If install records are unavailable, detection is gracefully skipped (try-catch).
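
A rough sketch of the best-effort behavior described above, under stated assumptions: `DriftEntry`, `LoadDrift`, and `collectDriftNotes` are hypothetical names, and the note wording is illustrative rather than the exact text doctor emits.

```typescript
// Hypothetical drift entry; the real detector's return type is not shown in this PR.
type DriftEntry = { pluginId: string; installed: string; gateway: string };

// Hypothetical stand-in for "load install records and run the detector".
// It may throw when install records are unavailable.
type LoadDrift = () => DriftEntry[];

// Returns note lines when drift is found, and an empty list when there is
// no drift or when loading fails (the failure is swallowed on purpose).
function collectDriftNotes(loadDrift: LoadDrift): string[] {
  try {
    const drift = loadDrift();
    if (drift.length === 0) return [];
    return [
      "Plugin version drift",
      ...drift.map((d) => `- ${d.pluginId}: installed ${d.installed}, gateway ${d.gateway}`),
    ];
  } catch {
    // Best-effort diagnostic: a missing or unreadable record must not fail doctor.
    return [];
  }
}

const driftNotes = collectDriftNotes(() => [
  { pluginId: "codex", installed: "2026.5.30-beta.1", gateway: "2026.6.1" },
]);

const skippedNotes = collectDriftNotes(() => {
  throw new Error("install records unavailable");
});
```

Swallowing the error keeps the check additive, at the cost of hiding why detection was skipped; a real implementation might log the reason at debug level.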

What was not tested: End-to-end test with an actual version-drifted plugin on a live production gateway (requires managing real plugin installs to create the drifted state). The drift detection function itself is fully covered by 17 unit tests, and the doctor integration path is covered by the new regression test with mocked install records.

Review Findings Addressed

  • [P1] Pass a real env to the drift check (src/flows/doctor-health-contributions.ts:731): Fixed. Changed from ctx.env to ctx.env ?? process.env so the drift check works in normal doctor CLI runs where the context builder doesn't set env.
  • [P1] Add regression test: Added reports plugin version drift when an official plugin version mismatches the gateway — seeds drifted codex install record, calls noteWorkspaceStatus with env, asserts drift note with exact version gap and repair command.
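
The seeded regression described above might look roughly like this. It is only a sketch: `InstallRecordSketch` and `detectDriftSketch` are hypothetical, and the comparison is a plain string inequality, whereas the real `detectPluginVersionDrift` also handles normalization and build qualifiers per the PR description.

```typescript
// Hypothetical install record shape used only for this illustration.
interface InstallRecordSketch {
  pluginId: string;
  version: string;
  official: boolean;
}

interface DriftSketch {
  pluginId: string;
  installedVersion: string;
  gatewayVersion: string;
}

// Simplified detector: flags official plugins whose version string differs
// from the gateway version. This is NOT the real detector's logic.
function detectDriftSketch(
  records: InstallRecordSketch[],
  gatewayVersion: string,
): DriftSketch[] {
  return records
    .filter((r) => r.official && r.version !== gatewayVersion)
    .map((r) => ({ pluginId: r.pluginId, installedVersion: r.version, gatewayVersion }));
}

// Seed one drifted official plugin (versions taken from the PR description)
// and one non-official plugin that should be ignored.
const seededRecords: InstallRecordSketch[] = [
  { pluginId: "codex", version: "2026.5.30-beta.1", official: true },
  { pluginId: "third-party-example", version: "1.0.0", official: false },
];

const seededDrift = detectDriftSketch(seededRecords, "2026.6.1");
```

The actual test additionally asserts on the note text and repair command that doctor emits, which this sketch does not model.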

Regression Test Plan

pnpm test src/commands/doctor-workspace-status.test.ts
pnpm test src/plugins/plugin-version-drift.test.ts
pnpm test src/commands/doctor-workspace.test.ts
pnpm test src/commands/doctor-plugin-manifests.test.ts

To verify in a live environment:

pnpm openclaw doctor --non-interactive
# Observe that no crash occurs and plugin version drift section appears if applicable

Merge Risk

Risk: Low. The change is entirely additive — a best-effort drift detection block wrapped in try-catch. Uses process.env fallback so it works in all doctor invocation modes. The only signature change is making noteWorkspaceStatus async, which is transparent to its single caller that was already in an async context. Regression test proves the new code path executes correctly.

Note on CI failures: The build-artifacts / core-support-boundary CI failure is a pre-existing infrastructure issue affecting multiple PRs today (e.g., also failing on #90907). Not caused by these changes.

…ixes openclaw#90891)

Reuse the existing detectPluginVersionDrift from gateway status --deep
to report active official plugin version mismatches in doctor --non-interactive.
Previously doctor only reported plugin counts; drifted plugins were only
visible through openclaw gateway status --deep.
@openclaw-barnacle openclaw-barnacle Bot added commands Command implementations size: XS proof: supplied External PR includes structured after-fix real behavior proof. labels Jun 6, 2026
@clawsweeper

clawsweeper Bot commented Jun 6, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 6, 2026, 8:17 AM ET / 12:17 UTC.

Summary
The PR makes doctor workspace status async, adds official managed plugin version drift reporting through the existing detector, passes an environment from the doctor flow, and adds a seeded regression test.

PR surface: Source +30, Tests +52. Total +82 across 3 files.

Reproducibility: yes. Source inspection shows current main's doctor workspace status lacks the existing drift detector while gateway status uses it, and the PR-specific remote-mode issue is source-reproducible because the new block is only env-gated.

Review metrics: 1 noteworthy metric.

  • Doctor diagnostic scope: 1 new install-record drift check added. The new check reads local plugin install records, so local-vs-remote gateway scope matters before merge.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦐 gold shrimp
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add the remote-gateway skip and a regression proving doctor does not use local install records in remote mode.
  • [P1] Add redacted terminal/live output from doctor against an actual or isolated seeded drifted plugin state.
  • Keep the existing seeded drift regression for the local doctor path.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The drifted-state proof is currently a mocked regression test plus a no-crash doctor run; contributor proof still needs redacted terminal/live output from actual or isolated seeded drifted doctor output, and updating the PR body should trigger re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] Remote-mode users can receive a local plugin drift warning and local update/restart advice even though their configured gateway is remote.
  • [P1] The PR still lacks after-fix real behavior proof showing doctor output for an actual or isolated seeded drifted plugin state.

Maintainer options:

  1. Mirror the local-gateway boundary (recommended)
    Add the same remote-mode skip used by gateway status before doctor reads local install records, then cover that path with a focused regression test.
  2. Accept the diagnostic false-positive risk
    Maintainers could intentionally allow doctor to warn from local install records in remote mode, but that should be an explicit product decision before merge.
  3. Pause for maintainer-owned proof
    Because the linked issue reporter asked to provide the proof, maintainers can pause this PR until the desired seeded or live proof path is clear.

Next step before merge

  • [P1] A narrow code fix is needed, but missing contributor real behavior proof remains a human merge gate that automation cannot supply for the author.

Security
Cleared: The diff adds local diagnostic reads and terminal output only; I found no concrete security or supply-chain regression.

Review findings

  • [P2] Skip local drift checks in remote gateway mode — src/commands/doctor-workspace-status.ts:133
Review details

Best possible solution:

Reuse the existing drift detector in doctor, but mirror the gateway status local-only boundary, add a remote-mode regression, and provide redacted terminal or live-output proof from a drifted plugin state.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main's doctor workspace status lacks the existing drift detector while gateway status uses it, and the PR-specific remote-mode issue is source-reproducible because the new block is only env-gated.

Is this the best way to solve the issue?

No. Reusing the existing detector is the right fix shape, but the doctor integration should also mirror the gateway status remote/local boundary and prove the user-visible drift output.

Full review comments:

  • [P2] Skip local drift checks in remote gateway mode — src/commands/doctor-workspace-status.ts:133
    gateway status --deep deliberately skips local install-record drift detection for remote gateways because local records do not describe the remote runtime. This new doctor path runs whenever an env is passed, so remote-mode users with stale local @openclaw/* plugin records can get a misleading Plugin version drift warning and local plugins update/gateway restart advice. Please mirror the local-only guard and cover remote mode.
    Confidence: 0.86
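
One way to picture the requested local-only guard is the sketch below. How gateway status actually determines remote mode is not shown in this thread, so `GatewayModeSketch` and `shouldCheckLocalDrift` are hypothetical names used only to illustrate the shape of the check.

```typescript
// Hypothetical representation of the configured gateway mode.
type GatewayModeSketch = "local" | "remote";

type GuardEnvSketch = Record<string, string | undefined>;

// Decide whether doctor should read local install records for drift.
// Local records do not describe a remote gateway's runtime, so remote mode
// skips the check before any records are read.
function shouldCheckLocalDrift(
  mode: GatewayModeSketch,
  env: GuardEnvSketch | undefined,
): boolean {
  if (mode === "remote") return false;
  // The PR's existing gate: only run when an env is available.
  return env !== undefined;
}

const runsForLocal = shouldCheckLocalDrift("local", {});
const skipsForRemote = shouldCheckLocalDrift("remote", {});
```

A regression for the remote path would seed stale local records, set the mode to remote, and assert that no "Plugin version drift" note is emitted.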

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 6b2af6c1ee01.

Label changes

Label changes:

  • add merge-risk: 🚨 other: Merging as-is can add misleading doctor repair advice for remote-gateway configurations, a concrete operator-facing diagnostic risk outside the specific merge-risk taxonomy.

Label justifications:

  • P2: This is a bounded doctor diagnostics bugfix with operator-facing upgrade health impact, but no crash, data loss, or security emergency.
  • merge-risk: 🚨 other: Merging as-is can add misleading doctor repair advice for remote-gateway configurations, a concrete operator-facing diagnostic risk outside the specific merge-risk taxonomy.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦐 gold shrimp.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The drifted-state proof is currently a mocked regression test plus a no-crash doctor run; contributor proof still needs redacted terminal/live output from actual or isolated seeded drifted doctor output, and updating the PR body should trigger re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

PR surface:

Source +30, Tests +52. Total +82 across 3 files.

Area       Files  Added  Removed  Net
Source     2      32     2        +30
Tests      1      53     1        +52
Docs       0      0      0        0
Config     0      0      0        0
Generated  0      0      0        0
Other      0      0      0        0
Total      3      85     3        +82

What I checked:

Likely related people:

  • hussein1362: Introduced the shared plugin version drift detector and the gateway status drift reporting that this PR reuses. (role: feature introducer; confidence: high; commits: 4a285d529a00; files: src/plugins/plugin-version-drift.ts, src/cli/daemon-cli/status.gather.ts)
  • Tak Hoffman: Authored the prior doctor plugin status alignment commit touching the central doctor workspace status file. (role: doctor plugin-status history; confidence: medium; commits: 7da92cc61866; files: src/commands/doctor-workspace-status.ts)
  • vincentkoc: Recent history includes plugin status tooling work and doctor health contribution refactors adjacent to the changed path. (role: recent doctor/plugin diagnostics contributor; confidence: medium; commits: def5b954a869, 37fdfa0e0bcf; files: src/commands/doctor-workspace-status.ts, src/flows/doctor-health-contributions.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. labels Jun 6, 2026
…ion test

The previous version passed ctx.env to noteWorkspaceStatus, but the
top-level doctor flow never sets env on the context, so drift
detection was unreachable in normal runs.

- Use ctx.env ?? process.env so drift detection works for normal runs
- Add regression test that seeds drifted install records and asserts
  doctor emits the Plugin version drift note
@zenglingbiao

Contributor Author

@clawsweeper re-review

Fixed the env wiring and added regression test:

  • Use ctx.env ?? process.env so drift detection works for normal doctor runs
  • Regression test seeds drifted install records and asserts the Plugin version drift note appears

All 32 tests pass across 4 test files.

@clawsweeper

clawsweeper Bot commented Jun 6, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added the merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. label Jun 6, 2026
@brokemac79

Contributor

@zenglingbiao - please close this PR. I clearly stated on the issue ticket (which I authored) that I will create the fix for this, as I am best placed to provide test proof etc.

@zenglingbiao

Contributor Author

@brokemac79 done

Labels

commands Command implementations merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Doctor does not report official managed plugin version drift after core upgrade

2 participants