fix(ui): ignore stale running session rows#87017
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed May 26, 2026, 5:13 PM ET / 21:13 UTC. Summary PR surface: Source +12, Tests +242, Config 0, Other +120. Total +374 across 12 files. Reproducibility: yes. from source inspection and the linked reporter proof: current main treats Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance: Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge Security Review findings
Review detailsBest possible solution: Land the UI predicate fix only after the shrinkwrap check fails closed or treats discovery failure as dependency-input changes, and after live browser/gateway proof or an explicit maintainer waiver is recorded. Do we have a high-confidence way to reproduce the issue? Yes, from source inspection and the linked reporter proof: current main treats Is this the best way to solve the issue? No, not as submitted: the Control UI predicate fix is the right narrow direction, but the PR also introduces a shrinkwrap check that can fail open and still needs live UI proof or a maintainer waiver. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 8523d3268a6e. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +12, Tests +242, Config 0, Other +120. Total +374 across 12 files. View PR surface stats
Security concerns:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
|
ClawSweeper PR egg 🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat. Where did the egg go?
|
Dependency Changes DetectedThis PR changes dependency-related files. Maintainers should confirm these changes are intentional. Changed files:
Maintainer follow-up:
|
|
No dependency changes detected. Learn more about Socket for GitHub. 👍 No dependency changes detected in pull request |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d9d34dfee4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } catch { | ||
| return []; |
There was a problem hiding this comment.
Fail closed when changed-path discovery fails
listCheckChangedPaths swallows all git errors and returns an empty path list, but --check mode uses that list to decide whether to reuse current shrinkwrap pins (useCurrentShrinkwrapOverrides). In check runs where origin/main is unavailable (for example shallow or detached CI checkouts), this marks every package as “dependency inputs unchanged” and can let pnpm deps:shrinkwrap:check pass even when lock/manifests actually require regenerated shrinkwraps. Treat this failure as an error (or as “inputs changed”) so stale shrinkwraps cannot be reported as current.
Useful? React with 👍 / 👎.
d9d34df to
401c034
Compare
9b58078 to
fa52b25
Compare
Summary
Linked context
Closes #86939
Requested by maintainer triage in this session.
Real behavior proof
Behavior addressed: WebChat run-status label no longer stays on In progress after a completed run when sessions.list or sessions.changed returns a stale status running row with hasActiveRun false.
Real environment tested: Local source checkout, delegated Blacksmith Testbox changed-check lanes before the final clean rebase, and fresh GitHub PR CI on head fa52b25.
Exact steps or command run after this patch:
Evidence after fix: Focused UI Vitest passed 2 shards, 115 tests before CI fixups; combined focused Vitest passed 7 files, 262 tests after the final rebase. Full shrinkwrap check passed across all packages. Fresh GitHub PR CI passed on head fa52b25. Earlier delegated Blacksmith Testbox changed checks completed exit 0 for tbx_01ksjxadjfz5zg7qx0bs1vxaqt and tbx_01ksk1nxxnjefwckgt9zwgcpv7. Autoreview reported no accepted/actionable findings.
Observed result after fix: Explicit hasActiveRun false now prevents stale status running rows from being treated as abortable/live; terminal statuses still suppress stale active flags; lifecycle start patches revive active state for new runs. Shrinkwrap checks stay pinned to lockfile versions unless package dependency inputs changed.
What was not tested: A live browser/gateway model-run reproduction was not rerun after the patch; the race is covered with focused controller/render predicate regressions.
Proof limitations or environment constraints: The reported race depends on gateway/store timing, so unit coverage pins the state transitions directly instead of relying on nondeterministic timing. The final explicit Testbox rerun was blocked by the local Testbox hydrate workflow mismatch, so final broad proof is GitHub PR CI on the exact rebased head.
Tests and validation
Commands run:
Regression coverage added:
Known before-fix failure: status running won over hasActiveRun false, allowing the composer to show In progress permanently after the terminal chat status cleared.
Risk checklist
Did user-visible behavior change? Yes.
Did config, environment, or migration behavior change? Yes, package-manager dependency overrides now pin AWS transitive versions already present in the pnpm lock so npm shrinkwrap generation is deterministic.
Did security, auth, secrets, network, or tool execution behavior change? No.
Highest-risk area: Preserving reconnect/Stop behavior when the browser has no local chatRunId.
Mitigation: Legacy status running rows remain active when hasActiveRun is absent, and lifecycle start events set hasActiveRun true if an older row had false.
Current review state
Next action: Merge.
Waiting on: Nothing.
Bot/reviewer comments addressed: None.