Skip to content

fix(ui): ignore stale running session rows#87017

Merged
steipete merged 4 commits into
mainfrom
fix/webchat-stale-running-status-86939
May 26, 2026
Merged

fix(ui): ignore stale running session rows#87017
steipete merged 4 commits into
mainfrom
fix/webchat-stale-running-status-86939

Conversation

@steipete

@steipete steipete commented May 26, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Fixes a Control UI race where a stale session row with status running could revive the WebChat in-progress indicator after the chat run already completed.
  • Treats explicit hasActiveRun from the gateway as the live abortability signal while keeping terminal status authoritative and legacy running-only rows active.
  • Normalizes lifecycle start events so a fresh run can still revive active state after a stale idle row.
  • Keeps the sessions table badge aligned with the same active-run semantics.
  • Stabilizes npm shrinkwrap generation so unrelated checks do not drift to newer registry-resolved AWS transitives while dependency-policy edits still exercise raw generation.
  • Adds a Windows Node setup fallback for CI runners that start with the wrong preinstalled Node version.

Linked context

Closes #86939

Requested by maintainer triage in this session.

Real behavior proof

Behavior addressed: WebChat run-status label no longer stays on In progress after a completed run when sessions.list or sessions.changed returns a stale status running row with hasActiveRun false.
Real environment tested: Local source checkout, delegated Blacksmith Testbox changed-check lanes before the final clean rebase, and fresh GitHub PR CI on head fa52b25.
Exact steps or command run after this patch:

  • pnpm test ui/src/ui/session-run-state.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/app-chat.test.ts ui/src/ui/views/sessions.test.ts -- --reporter=verbose
  • pnpm test:changed
  • node scripts/run-vitest.mjs test/scripts/generate-npm-shrinkwrap.test.ts ui/src/ui/session-run-state.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/app-chat.test.ts ui/src/ui/views/sessions.test.ts src/agents/pi-embedded-runner.sanitize-session-history.test.ts
  • pnpm deps:shrinkwrap:check
  • bash -n .github/actions/setup-pnpm-store-cache/ensure-node.sh
  • pnpm check:changed via Blacksmith Testbox tbx_01ksjxadjfz5zg7qx0bs1vxaqt and tbx_01ksk1nxxnjefwckgt9zwgcpv7
  • GitHub PR checks on fa52b25: CI, CodeQL, CodeQL Critical Quality, TUI PTY, Workflow Sanity, OpenGrep, Real behavior proof, Dependency Change Awareness, ClawSweeper Dispatch, Labeler, Auto response
  • /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local
    Evidence after fix: Focused UI Vitest passed 2 shards, 115 tests before CI fixups; combined focused Vitest passed 7 files, 262 tests after the final rebase. Full shrinkwrap check passed across all packages. Fresh GitHub PR CI passed on head fa52b25. Earlier delegated Blacksmith Testbox changed checks completed exit 0 for tbx_01ksjxadjfz5zg7qx0bs1vxaqt and tbx_01ksk1nxxnjefwckgt9zwgcpv7. Autoreview reported no accepted/actionable findings.
    Observed result after fix: Explicit hasActiveRun false now prevents stale status running rows from being treated as abortable/live; terminal statuses still suppress stale active flags; lifecycle start patches revive active state for new runs. Shrinkwrap checks stay pinned to lockfile versions unless package dependency inputs changed.
    What was not tested: A live browser/gateway model-run reproduction was not rerun after the patch; the race is covered with focused controller/render predicate regressions.
    Proof limitations or environment constraints: The reported race depends on gateway/store timing, so unit coverage pins the state transitions directly instead of relying on nondeterministic timing. The final explicit Testbox rerun was blocked by the local Testbox hydrate workflow mismatch, so final broad proof is GitHub PR CI on the exact rebased head.

Tests and validation

Commands run:

  • pnpm test ui/src/ui/session-run-state.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/app-chat.test.ts ui/src/ui/views/sessions.test.ts -- --reporter=verbose
  • pnpm test:changed
  • node scripts/run-vitest.mjs test/scripts/generate-npm-shrinkwrap.test.ts ui/src/ui/session-run-state.test.ts ui/src/ui/controllers/sessions.test.ts ui/src/ui/app-chat.test.ts ui/src/ui/views/sessions.test.ts src/agents/pi-embedded-runner.sanitize-session-history.test.ts
  • pnpm deps:shrinkwrap:check
  • bash -n .github/actions/setup-pnpm-store-cache/ensure-node.sh
  • git diff --check
  • pnpm check:changed via Blacksmith Testbox tbx_01ksjxadjfz5zg7qx0bs1vxaqt
  • pnpm check:changed via Blacksmith Testbox tbx_01ksk1nxxnjefwckgt9zwgcpv7
  • GitHub PR checks on fa52b25
  • /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local

Regression coverage added:

  • isSessionRunActive precedence for stale running rows, terminal states, and legacy rows.
  • sessions.list stale running row with hasActiveRun false.
  • sessions.changed stale running event after terminal local reconcile.
  • lifecycle start event after stale idle state.
  • WebChat abortability for stale running rows.
  • sessions table idle badge for stale running rows.
  • shrinkwrap generator current-pin gating for dependency input changes.
  • Windows setup-pnpm-store-cache Node fallback syntax proof.

Known before-fix failure: status running won over hasActiveRun false, allowing the composer to show In progress permanently after the terminal chat status cleared.

Risk checklist

Did user-visible behavior change? Yes.
Did config, environment, or migration behavior change? Yes, package-manager dependency overrides now pin AWS transitive versions already present in the pnpm lock so npm shrinkwrap generation is deterministic.
Did security, auth, secrets, network, or tool execution behavior change? No.
Highest-risk area: Preserving reconnect/Stop behavior when the browser has no local chatRunId.
Mitigation: Legacy status running rows remain active when hasActiveRun is absent, and lifecycle start events set hasActiveRun true if an older row had false.

Current review state

Next action: Merge.
Waiting on: Nothing.
Bot/reviewer comments addressed: None.

@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui size: S maintainer Maintainer-authored PR labels May 26, 2026
@clawsweeper

clawsweeper Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed May 26, 2026, 5:13 PM ET / 21:13 UTC.

Summary
The PR changes Control UI session activity semantics to respect explicit hasActiveRun, adds focused regressions, stabilizes shrinkwrap generation with current-shrinkwrap pins, reorders AWS override entries, and pins one agent replay-policy test.

PR surface: Source +12, Tests +242, Config 0, Other +120. Total +374 across 12 files.

Reproducibility: yes. from source inspection and the linked reporter proof: current main treats status: "running" as active before checking hasActiveRun: false, and WebChat uses that predicate for abortable UI. I did not run a live browser/gateway reproduction in this read-only review.

Review metrics: 1 noteworthy metric.

  • Shrinkwrap check fallback: 1 fail-open catch added. The new --check reuse-pins path depends on changed-path discovery, so swallowing a git error can hide stale shrinkwrap outputs.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Make changed-path discovery fail closed or treat discovery errors as dependency-input changes, with a regression test.
  • Add redacted live browser/gateway proof that WebChat does not return to In progress after the stale row arrives.
  • Get owner review or explicit waiver for the dependency/shrinkwrap check surface after the fix.

Proof guidance:
Needs real behavior proof before merge: The PR body has tests/Testbox only and explicitly did not rerun live browser/gateway; add redacted screenshot, recording, live output, or logs showing the fixed race, redacting IPs, API keys, phone numbers, and private endpoints, then update the PR body or ask @clawsweeper re-review.

Mantis proof suggestion
A short visible UI proof would materially help because the changed behavior is the WebChat run-status label after a timing race. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify WebChat does not return to In progress after a completed run when a stale sessions row reports status running with hasActiveRun false.

Risk before merge

  • The new shrinkwrap --check path can swallow changed-path discovery failures and reuse existing shrinkwrap pins, letting stale package locks be reported current in shallow or detached checkouts.
  • The PR body still lacks live browser/gateway proof for the visible WebChat status race, and the body explicitly says that path was not rerun after the patch.
  • Changing session activity semantics can affect reconnect and Stop behavior when the browser has no local chatRunId; the tests are useful, but the runtime path still needs maintainer-visible proof or waiver.

Maintainer options:

  1. Fail closed before merge (recommended)
    Change listCheckChangedPaths so git discovery errors either fail --check or mark dependency inputs as changed, and add a focused regression for that failure mode.
  2. Owner-approved waiver
    A maintainer can explicitly accept the shrinkwrap-check fallback and live-proof gap after independent validation of the UI race and dependency gate behavior.
  3. Split the script sidecar
    If the shrinkwrap policy change is not needed for this bug fix, pause this branch or replace it with a narrower UI-only PR and track shrinkwrap stabilization separately.

Next step before merge
Human handling remains needed because the PR has a protected maintainer label, needs contributor/live proof or waiver, and the remaining blocker touches dependency security tooling.

Security
Needs attention: Needs attention because the PR changes the shrinkwrap dependency gate and currently lets a git discovery failure hide stale package-lock state.

Review findings

  • [P1] Fail closed when changed-path discovery fails — scripts/generate-npm-shrinkwrap.mjs:716-717
Review details

Best possible solution:

Land the UI predicate fix only after the shrinkwrap check fails closed or treats discovery failure as dependency-input changes, and after live browser/gateway proof or an explicit maintainer waiver is recorded.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection and the linked reporter proof: current main treats status: "running" as active before checking hasActiveRun: false, and WebChat uses that predicate for abortable UI. I did not run a live browser/gateway reproduction in this read-only review.

Is this the best way to solve the issue?

No, not as submitted: the Control UI predicate fix is the right narrow direction, but the PR also introduces a shrinkwrap check that can fail open and still needs live UI proof or a maintainer waiver.

Full review comments:

  • [P1] Fail closed when changed-path discovery fails — scripts/generate-npm-shrinkwrap.mjs:716-717
    listCheckChangedPaths catches every git error and returns an empty path list, but --check uses that list to decide whether to reuse current shrinkwrap pins. In shallow or detached checkouts where origin/main is unavailable, this can mark every package as dependency-inputs-unchanged and let pnpm deps:shrinkwrap:check pass with stale shrinkwraps. Re-throw here or treat the package inputs as changed so the dependency gate cannot fail open.
    Confidence: 0.88

Overall correctness: patch is incorrect
Overall confidence: 0.87

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8523d3268a6e.

Label changes

Label changes:

  • add merge-risk: 🚨 automation: The shrinkwrap check can pass incorrectly when git changed-path discovery fails, weakening CI/release dependency automation.
  • add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🧂 unranked krab, so this older rating label is no longer current.
  • remove merge-risk: 🚨 auth-provider: Current PR review merge-risk labels are merge-risk: 🚨 compatibility, merge-risk: 🚨 session-state, merge-risk: 🚨 automation.

Label justifications:

  • P2: This is a normal-priority visible Control UI session-state bug fix with limited blast radius, plus merge-sensitive tooling changes.
  • merge-risk: 🚨 compatibility: The PR changes package-manager shrinkwrap checking and active-run semantics that existing installs or upgrade checks may rely on.
  • merge-risk: 🚨 session-state: The UI patch changes how stale session rows clear or revive active-run state after list and websocket updates.
  • merge-risk: 🚨 automation: The shrinkwrap check can pass incorrectly when git changed-path discovery fails, weakening CI/release dependency automation.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body has tests/Testbox only and explicitly did not rerun live browser/gateway; add redacted screenshot, recording, live output, or logs showing the fixed race, redacting IPs, API keys, phone numbers, and private endpoints, then update the PR body or ask @clawsweeper re-review.
Evidence reviewed

PR surface:

Source +12, Tests +242, Config 0, Other +120. Total +374 across 12 files.

View PR surface stats
Area Files Added Removed Net
Source 3 17 5 +12
Tests 6 242 0 +242
Docs 0 0 0 0
Config 1 1 1 0
Generated 0 0 0 0
Other 2 128 8 +120
Total 12 388 14 +374

Security concerns:

  • [medium] Shrinkwrap check can fail open — scripts/generate-npm-shrinkwrap.mjs:717
    npm-shrinkwrap.json and lockfile checks are supply-chain surfaces; returning [] on changed-path discovery failure can make stale generated package locks look current instead of failing or requiring regeneration.
    Confidence: 0.84

What I checked:

  • Repository policy applied: Root AGENTS.md and scoped ui/, scripts/, and src/agents/ guides were read; the root policy makes session state, dependency/shrinkwrap, proof, and CODEOWNER/security surfaces review-sensitive. (AGENTS.md:26, 8523d3268a6e)
  • Current-main source reproduction: Current main returns from status first, so status: "running" is treated as active before hasActiveRun: false can suppress a stale row. (ui/src/ui/session-run-state.ts:8, 8523d3268a6e)
  • WebChat uses the predicate for abortability: hasAbortableSessionRun scans the current session row through isSessionRunActive, which makes the stale row capable of reviving the in-progress UI. (ui/src/ui/app-chat.ts:139, 8523d3268a6e)
  • Linked issue has concrete reporter proof: The linked issue describes the race, includes the stale status: "running" with hasActiveRun: false debug state, and says the reporter reproduced and locally confirmed a fix shape.
  • PR addresses the central UI predicate: The PR changes isSessionRunActive so terminal statuses stay false, explicit boolean hasActiveRun wins, and legacy status-only running rows remain active, with tests for list, event, chat, and sidebar paths. (ui/src/ui/session-run-state.ts:8, 401c0343d34d)
  • Blocking shrinkwrap-check issue remains in latest patch: The PR adds listCheckChangedPaths with a catch-all fallback to []; in --check mode that empty list is then used to decide current shrinkwrap pins are safe to reuse. (scripts/generate-npm-shrinkwrap.mjs:716, 401c0343d34d)

Likely related people:

  • steipete: Git blame/log tie the current Control UI session predicate/controller code and recent shrinkwrap stabilization to Peter Steinberger, and the PR author also appears in that current-main history. (role: recent area contributor; confidence: high; commits: 174cd49f78c1, d8a14e77c35b; files: ui/src/ui/session-run-state.ts, ui/src/ui/controllers/sessions.ts, ui/src/ui/app-chat.ts)
  • openclaw/openclaw-secops: CODEOWNERS assigns both pnpm-lock.yaml and scripts/generate-npm-shrinkwrap.mjs to this owner group, which is relevant to the shrinkwrap check and dependency surface. (role: CODEOWNERS review owner; confidence: high; files: .github/CODEOWNERS, pnpm-lock.yaml, scripts/generate-npm-shrinkwrap.mjs)
  • scotthuang: The linked issue supplies the concrete stale-session race, debug evidence, and locally verified fix shape for validating the user-visible behavior. (role: reporter with reproduction context; confidence: medium; files: ui/src/ui/session-run-state.ts, ui/src/ui/controllers/sessions.ts, ui/src/ui/views/chat.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal backlog priority with limited blast radius. labels May 26, 2026
@clawsweeper

clawsweeper Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@steipete steipete requested a review from a team as a code owner May 26, 2026 20:32
@github-actions github-actions Bot added the dependencies-changed PR changes dependency-related files label May 26, 2026
@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Dependency Changes Detected

This PR changes dependency-related files. Maintainers should confirm these changes are intentional.

Changed files:

  • pnpm-lock.yaml
  • pnpm-workspace.yaml

Maintainer follow-up:

  • Review whether the dependency changes are intentional.
  • Inspect resolved package deltas when lockfile, shrinkwrap, or workspace dependency policy changes are present.
  • Treat package-lock.json and npm-shrinkwrap.json diffs as security-review surfaces.
  • Run pnpm deps:changes:report -- --base-ref origin/main --markdown /tmp/dependency-changes.md --json /tmp/dependency-changes.json locally for detailed release-style evidence.

@socket-security

socket-security Bot commented May 26, 2026

Copy link
Copy Markdown

No dependency changes detected. Learn more about Socket for GitHub.

👍 No dependency changes detected in pull request

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 26, 2026
@openclaw-barnacle openclaw-barnacle Bot added the agents Agent runtime and tooling label May 26, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d9d34dfee4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +677 to +678
} catch {
return [];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fail closed when changed-path discovery fails

listCheckChangedPaths swallows all git errors and returns an empty path list, but --check mode uses that list to decide whether to reuse current shrinkwrap pins (useCurrentShrinkwrapOverrides). In check runs where origin/main is unavailable (for example shallow or detached CI checkouts), this marks every package as “dependency inputs unchanged” and can let pnpm deps:shrinkwrap:check pass even when lock/manifests actually require regenerated shrinkwraps. Treat this failure as an error (or as “inputs changed”) so stale shrinkwraps cannot be reported as current.

Useful? React with 👍 / 👎.

@steipete steipete force-pushed the fix/webchat-stale-running-status-86939 branch from d9d34df to 401c034 Compare May 26, 2026 21:05
@openclaw-barnacle openclaw-barnacle Bot removed the channel: tlon Channel integration: tlon label May 26, 2026
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. merge-risk: 🚨 automation 🚨 May affect CI, automerge, proof capture, label sync, or maintainer automation. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. labels May 26, 2026
@steipete steipete force-pushed the fix/webchat-stale-running-status-86939 branch from 9b58078 to fa52b25 Compare May 26, 2026 21:23
@steipete steipete merged commit bb48fcf into main May 26, 2026
106 checks passed
@steipete steipete deleted the fix/webchat-stale-running-status-86939 branch May 26, 2026 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling app: web-ui App: web-ui dependencies-changed PR changes dependency-related files maintainer Maintainer-authored PR merge-risk: 🚨 automation 🚨 May affect CI, automerge, proof capture, label sync, or maintainer automation. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. scripts Repository scripts size: M status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(ui): webchat run-status-label stuck on "In progress" after model call ends

1 participant