Skip to content

fix(session): terminalize stale question blockers#947

Merged
Astro-Han merged 3 commits into
devfrom
codex/stale-question-recovery
May 27, 2026
Merged

fix(session): terminalize stale question blockers#947
Astro-Han merged 3 commits into
devfrom
codex/stale-question-recovery

Conversation

@Astro-Han

@Astro-Han Astro-Han commented May 27, 2026

Copy link
Copy Markdown
Owner

Summary

Terminalize stale question blockers when session history is read, so a persisted running question whose in-memory external-result Deferred was lost during restart no longer reopens as an active dock. There is no linked issue; this came from a user-provided exported session that reproduced a blocked conversation after app restart.

Why

External-result question prompts depend on an in-memory Deferred. After a full app/server restart, that Deferred is gone, but the persisted tool part can still say running with externalResultReady: true. The app then renders a submit-ready question dock that can only fail with no_pending_tool_call, leaving the session stuck.

The fix reconciles durable session history with the live external-result registry at the backend session-read boundary. If the registry no longer has a pending entry for a ready running question, PawWork persists the part as error with interrupted metadata instead of exposing it as a live blocker.

Related Issue

No GitHub issue yet. This PR is scoped to the exported-session bug report where a question dock survived restart and blocked the session from continuing.

Human Review Status

Pending

Review Focus

Please focus on whether Session.messages() is the right reconciliation boundary and whether the stale check is narrow enough: it only terminalizes question tool parts that are running, externalResultReady === true, and missing from ExternalResult.lookup().

Risk Notes

  • The change mutates stale session parts during message reads. That is intentional so all consumers see the same terminal state, but reviewers should check that live pending questions remain untouched.
  • Skipped visible UI/copy check: no visible UI or copy files changed; existing question interrupted rendering is reused.
  • Skipped platform check: no platform, packaging, updater, signing, path, shell, or permission surface changed.
  • Skipped docs/release/dependency check: no docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file behavior changed.

How To Verify

RED: bun test test/session/session.test.ts --test-name-pattern "messages terminalizes stale running external-result questions" initially failed with running != error.
Focused backend: bun test test/session/session.test.ts passed, 25 tests, including stale terminalization plus live pending and unready preservation coverage.
Tool route regression: bun test test/server/tool-respond-route.test.ts passed, 7 tests.
Backend typecheck: bun run typecheck passed.
Related app unit tests: bun test src/pages/session/blockers/use-session-blockers.test.ts src/pages/session/composer/session-question-dock.test.ts passed, 25 tests.

Screenshots or Recordings

Not applicable; no visible UI changed.

Checklist

How to use this checklist:

  • Tick a box by replacing [ ] with [x]. Do not edit, add, or remove items.
  • The bot-applied label items can only be honestly ticked AFTER the PR is opened and the labeler / priority-triage bots have run — return to the PR description and tick them then.
  • Most items are required. The few that are conditional are explicitly marked (conditional); for those, leave unticked if they truly do not apply and explain why in Risk Notes. All other items must be ticked before requesting human review.
  • Type label — this PR carries exactly one of bug, enhancement, task, documentation. Type labels are author-added; the labeler bot does NOT assign them. Add the label in the GitHub UI, then tick this.
  • Routing labels — this PR carries at least one of app, ui, platform, harness, ci. The labeler bot assigns these on PR open based on changed paths. Confirm the bot's choice (or override if wrong), then tick this.
  • Priority label — this PR carries exactly one of P0, P1, P2, P3. The priority-triage bot suggests one on PR open. Confirm or override, then tick this.
  • Human Review Status above is set to Pending, Approved by @<reviewer>, or Not required: <reason> (default is Pending; "not required" is restricted to bot-authored low-risk PRs).
  • I linked the related issue, or stated in Summary why there is no issue.
  • I described the review focus and any meaningful risks.
  • I replaced the example block in How To Verify with the real verification steps and the key result for each.
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope.
  • (conditional) I manually checked visible UI or copy changes when needed, with screenshots or recordings. Leave unticked only if no visible UI or copy changed.
  • (conditional) I considered macOS and Windows impact for platform, packaging, updater, signing, paths, shell, or permissions changes. Leave unticked only if no platform/packaging surface was touched.
  • (conditional) I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant. Leave unticked only if none of those surfaces was touched.
  • I reviewed the final diff for unrelated changes and suspicious dependency changes.
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved handling of unresolved questions by automatically marking stale external-result questions as errors with clear status indicators and interruption metadata.
  • Tests

    • Added test coverage for stale question terminalization behavior during session message retrieval.

Review Change Stack

@Astro-Han Astro-Han added bug Something isn't working P1 High priority harness Model harness, prompts, tool descriptions, and session mechanics labels May 27, 2026
@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Session.messages now terminates stale external-result questions: running question tool parts without valid external results are converted to error state with cancellation metadata and interruption flags when their ExternalResult lookup returns not-found.

Changes

Stale Question Terminalization

Layer / File(s) Summary
Stale question terminalization helper and integration
packages/opencode/src/session/session.ts
Internal helper scans message parts for running question tool calls missing external results, consults ExternalResult.lookup, and converts not-found cases to error state with "Question cancelled before the user answered it." message, reason: "shutdown", and metadata flags interrupted: true and stale_external_result: true. Integration point updates Session.messages to apply this terminalization step to all returned messages before they are sent.
Stale question terminalization test
packages/opencode/test/session/session.test.ts
Imports ExternalResult and adds test that resets external-result state, creates session with messages including a running question tool part, verifies that loading messages marks the part as error with interruption and stale-external-result metadata, checks persisted part state, and cleans up external-result state after test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Astro-Han/pawwork#839: Complements server-side stale question termination by updating question-dock UI to disable interactions during pending and normalize stale/duplicate responses.

Suggested labels

P2, upstream

Poem

🐰 A question in limbo, no answer in sight,
Now caught and converted to cancellation's light,
With metadata marked: "stale, interrupted, complete,"
The session flows cleanly, no dangling receipt! 🎪

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main change: fixing session behavior for stale question blockers, which aligns with the PR's core objective of terminalizing stale external-result questions.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering all required template sections with specific details about the change, problem, verification steps, and risk assessment.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/stale-question-recovery

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested priority: P2 (includes non-doc, non-test paths outside the low-risk bucket).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to automatically terminalize stale, running external-result questions when retrieving session messages. It adds a helper function terminalizeStaleExternalResultQuestions that iterates through message parts, identifies stale questions, and updates their state to an error status. Additionally, a comprehensive unit test has been added to verify this behavior. The reviewer suggested refactoring the sequential for loops in the new helper function to use Effect.all for parallel processing, which would improve performance and better align with effect-ts idioms.

Comment thread packages/opencode/src/session/session.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opencode/test/session/session.test.ts`:
- Around line 740-825: The test calls ExternalResult.__resetForTests() only
before and after the Instance.provide block, which can leak state if the test
body throws; wrap the entire async callback passed to Instance.provide (the
function that creates the session and updates messages/parts using
SessionNs.create, SessionNs.updateMessage, SessionNs.updatePart) in a
try/finally and move the ExternalResult.__resetForTests() call into the finally
block so it always runs (keep or remove the pre-call as desired, but ensure the
finally calls ExternalResult.__resetForTests() to guarantee cleanup).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: ed03086f-ff00-429a-b00b-df9c405b4690

📥 Commits

Reviewing files that changed from the base of the PR and between e40b39e and 29c4d2e.

📒 Files selected for processing (2)
  • packages/opencode/src/session/session.ts
  • packages/opencode/test/session/session.test.ts

Comment thread packages/opencode/test/session/session.test.ts
@Astro-Han

Copy link
Copy Markdown
Owner Author

Addressed the P2 test coverage follow-up in 52c35ce. Session.messages now has regression coverage for the stale not_found case plus the two safety branches: live pending ExternalResult entries stay running, and unready external-result questions stay running.\n\nValidation run locally:\n- bun test test/session/session.test.ts — 25 passed\n- bun test test/server/tool-respond-route.test.ts — 7 passed\n- bun run typecheck — passed

@Astro-Han Astro-Han merged commit c3338bb into dev May 27, 2026
28 checks passed
@Astro-Han Astro-Han deleted the codex/stale-question-recovery branch May 27, 2026 08:16
Astro-Han added a commit that referenced this pull request May 28, 2026
Root cause:
The desktop message loader uses the paginated session message endpoint. PR #947 repaired stale external-result question parts through Session.messages(), but GET /session/:sessionID/message?limit=... still called MessageV2.page() directly, so reopened sessions could keep showing an unanswerable running question card after the in-memory ExternalResult was gone.

Change:
- Add Session.messagesPage() as the paginated session-layer reader.
- Reuse stale external-result question terminalization for paginated responses.
- Keep MessageV2.page() as a raw storage pagination helper with no repair side effects.
- Route paginated message responses through Session.messagesPage() while preserving cursor headers.
- Add route-level regressions for stale, live pending, unready, and older-cursor stale question cases.

Verification:
- RED: stale paginated route returned running before the fix.
- RED follow-up: older cursor page returned running when the route was temporarily restored to direct MessageV2.page().
- bun test test/server/session-messages.test.ts: 9 passed.
- bun test test/session/session.test.ts -t "messages": 3 passed.
- bun test test/server/tool-respond-route.test.ts: 7 passed.
- bun run typecheck: passed.
- PR CI: passed, including ci/check, unit-opencode, unit-app, desktop-smoke, e2e-artifacts, codeql, dependency-review, commit-lint, and pr-triage.

Review follow-up:
- Added the older before-cursor regression requested in review.
- Kept shared fixture extraction out of scope to avoid broad test-structure churn.

Risk:
Low. The behavior change is limited to the public paginated session message API and only terminalizes question tool parts that are already marked externalResultReady and have no live ExternalResult entry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working harness Model harness, prompts, tool descriptions, and session mechanics P1 High priority

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant