fix: scope reasoning safe retry timeouts by attempt by Astro-Han · Pull Request #922 · Astro-Han/pawwork

Astro-Han · 2026-05-26T04:48:30Z

Summary

Scope reasoning-model safe-retry connect timeouts by processor attempt instead of changing the global reasoning stream timeout.

This PR also updates the safe-recovery retry/failure copy to the agreed short English and Chinese wording.

Why

Reasoning-model streams currently keep the global 120s connect watchdog everywhere. That protects legitimate slow starts, but it also makes users wait two minutes before PawWork can safely recover from a no-output/no-tool stalled connection.

#914 made the before-progress no-output/no-tool boundary safe to retry. This follow-up applies the #918 two-attempt strategy: fail fast on the first stalled reasoning-model main response, then protect the one automatic safe retry with the previous 120s ceiling.

Related Issue

Closes #918

Human Review Status

Pending

Review Focus

Please review the processor-level timeout merge order and the safe-recovery presentation predicate. The important invariant is that only the session processor main-response recovery flow gets 60s -> 120s; title generation and other non-recovery llm.stream() callers still use the global reasoning timeout.

Risk Notes

The intentional behavior change is that retryable no-output/no-tool failures that exhaust the one automatic recovery retry now render the safe_retry_failed notice instead of a generic assistant error.

Platform checklist left unticked: no platform, packaging, updater, signing, shell, or native path behavior changed.

How To Verify

Dependency setup: bun install --frozen-lockfile completed in this worktree.
RED check: processor attempt-timeout test first failed with [120000, 120000] before implementation.
Processor tests: bun test --timeout 30000 test/session/processor-effect.test.ts -> 30 passed.
Run observability/provider tests: bun test --timeout 30000 test/session/run-observability.test.ts test/provider/transform.test.ts -t "RunObservability|ProviderTransform.streamTimeouts" -> 62 passed.
UI contract: bun test --timeout 30000 src/components/session-safe-retry-contract.test.ts -> 3 passed.
Typecheck: bun run typecheck in packages/opencode -> passed.
Typecheck: bun run typecheck in packages/ui -> passed.
Lint: bun run lint -> passed.
Visual snap: bun run snap safe-retry -> 1 passed; grid written to docs/design/preview/screenshots/safe-retry.png.
Diff check: git diff --check -> passed.

Screenshots or Recordings

Safe-retry snap target was checked with bun run snap safe-retry.

Checklist

How to use this checklist:

Tick a box by replacing [ ] with [x]. Do not edit, add, or remove items.

The bot-applied label items can only be honestly ticked AFTER the PR is opened and the labeler / priority-triage bots have run — return to the PR description and tick them then.

Most items are required. The few that are conditional are explicitly marked (conditional); for those, leave unticked if they truly do not apply and explain why in Risk Notes. All other items must be ticked before requesting human review.

Summary by CodeRabbit

Bug Fixes
- Updated safe-retry error messaging to clarify when models aren't responding across multiple languages.
- Improved timeout handling for reasoning-capable models during recovery attempts.
- Enhanced safe recovery mechanism to handle additional failure scenarios.
Tests
- Extended test coverage for model retry and timeout behaviors.
- Added localization verification for retry messaging.

coderabbitai · 2026-05-26T04:48:46Z

Warning

Review limit reached

@Astro-Han, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 8 minutes and 22 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f16de5e4-1c49-433b-b91a-4cbe5cb5a5df

📥 Commits

Reviewing files that changed from the base of the PR and between 1768085 and 7b4c99c.

📒 Files selected for processing (6)

packages/opencode/src/session/processor.ts
packages/opencode/src/session/run-incident/policy.ts
packages/opencode/src/session/run-observability/boundary.ts
packages/opencode/src/session/run-observability/index.ts
packages/opencode/test/session/processor-effect.test.ts
packages/ui/src/components/session-safe-retry-contract.test.ts

📝 Walkthrough

Walkthrough

This PR implements faster recovery from stalled reasoning-model connections by introducing per-attempt connection timeout tuning. Reasoning model first attempts use a shorter 60-second timeout to fail fast, while retry attempts use the longer 120-second timeout to protect legitimate slow starts. It broadens safe recovery conditions and updates observability to record per-attempt timeout values, with corresponding i18n messaging updates.

Changes

Reasoning model connect timeout recovery

Layer / File(s)	Summary
Timeout constants and per-attempt selection logic `packages/opencode/src/provider/transform.ts`, `packages/opencode/src/session/processor.ts`	Exports `REASONING_GLOBAL_CONNECT_TIMEOUT_MS` (120s), `REASONING_FIRST_ATTEMPT_CONNECT_TIMEOUT_MS` (60s), and `REASONING_SAFE_RETRY_CONNECT_TIMEOUT_MS` (120s). Implements `attemptStreamTimeouts()` to select per-attempt timeout based on `automaticStreamRetriesUsed`.
Observability types for per-attempt timeout capture `packages/opencode/src/session/run-observability/types.ts`, `packages/opencode/src/session/run-observability/recorder.ts`	Extends `AttemptSummary` and `BeginAttemptInput` with optional `connect_timeout_ms` / `connectTimeoutMs` fields and updates recorder to persist the timeout value for each attempt.
Per-attempt timeout wiring into stream execution `packages/opencode/src/session/processor.ts`	Computes `sessionTimeouts` during each attempt and wires the resolved `connectTimeoutMs` to both `ctx.runTrace.beginAttempt()` and `llm.stream()` for enforcement and observability.
Safe recovery condition broadening `packages/opencode/src/session/processor.ts`	Introduces `beforeProgressSafeRetry` condition and combines it with `reasoningOnlySafeRetry` into `safeRecoveryRetry`, then updates retry status messaging and safe-retry-failed notice gating to depend on the broader condition.
Processor effect test coverage `packages/opencode/test/session/processor-effect.test.ts`, `packages/opencode/test/provider/transform.test.ts`	Adds reasoning-model test fixture, refactors environment construction for per-test LLM injection, replaces reasoning watchdog timeout assertions with per-attempt timeout scoping verification, and rewrites two retry tests to expect `safe_retry_failed` notices instead of error propagation.
i18n and e2e validation `packages/ui/src/i18n/en.ts`, `packages/ui/src/i18n/zh.ts`, `packages/ui/src/i18n/zht.ts`, `packages/ui/src/components/session-safe-retry-contract.test.ts`, `packages/app/e2e/snap/safe-retry.snap.ts`	Updates safe-retry messaging from "network connection dropped" framing to "model temporarily not responding" framing across all three languages, adds i18n content verification test, and updates e2e snapshot expectations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Astro-Han/pawwork#758: Both PRs modify reasoning-model streaming connect-timeout behavior and introduce per-attempt timeout tuning via ProviderTransform.streamTimeouts() and connectTimeoutMs values in provider and processor modules.
Astro-Han/pawwork#729: Both PRs modify LLM stream connection-timeout handling; this PR propagates per-attempt connectTimeoutMs into llm.stream() and observability, while the related PR changes when the connect-timeout timer is armed.
Astro-Han/pawwork#558: Both PRs extend LLM stream connectTimeoutMs handling; the related PR adds connect-timeout enforcement in session/llm.ts, while this PR updates retry logic to pass per-attempt connectTimeoutMs and record it.

Poem

🐇 Fast connections for the swift and wise,
Shorter waits on first attempt rise,
Then longer timeouts for safe retry's grace—
Models' breath returns at a measured pace. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: scoping reasoning safe retry timeouts by attempt rather than globally.
Description check	✅ Passed	The description is comprehensive with all required sections: Summary, Why, Related Issue, Human Review Status, Review Focus, Risk Notes, How To Verify, and a complete checklist with most items ticked.
Linked Issues check	✅ Passed	The PR successfully implements the core requirements from `#918`: scopes reasoning-model connect timeouts by attempt (60s first attempt, 120s retry), preserves legitimate slow starts, limits scope to `#914` recovery boundary, updates UI copy, and provides comprehensive test coverage.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to the stated objective: timeout constants, attempt-scoped timeout logic, observability recording, and agreed UI/copy updates. No unrelated refactors or dependencies introduced.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch codex/i918-reasoning-safe-retry-timeouts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions

Suggested priority: P2 (includes non-doc, non-test paths outside the low-risk bucket).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

gemini-code-assist

Code Review

This pull request updates the connection timeout logic for reasoning-capable models, introducing a fast-fail timeout of 60 seconds on the first attempt and a longer 120-second timeout on the safe retry. It also updates the copy for safe retries and failures to be less technical across English, Simplified Chinese, and Traditional Chinese. The reviewer feedback suggests extending the contract tests to also assert the Traditional Chinese (zht) translation changes, ensuring they do not regress.

github-actions · 2026-05-26T04:55:45Z

Perf delta summary

Comparator: pass

Profile / Scenario	interaction median	interaction worst	long task max	tbt	frame gap p95	frame gap max	jank count	cls	status
default / homepage-cold	40 -> 24 (-16)	40 -> 56 (+16)	68 -> 71 (+3)	18 -> 21 (+3)	33.3 -> 16.8 (-16.5)	150.1 -> 150 (-0.1)	3 -> 3 (0)	0 -> 0 (0)	pass
default / long-session-input-lag	48 -> 48 (0)	48 -> 56 (+8)	0 -> 0 (0)	0 -> 0 (0)	16.7 -> 16.7 (0)	16.7 -> 16.8 (+0.1)	0 -> 0 (0)	0 -> 0 (0)	pass
default / session-streaming-long	48 -> 48 (0)	64 -> 64 (0)	0 -> 0 (0)	0 -> 0 (0)	16.8 -> 16.7 (-0.1)	33.4 -> 33.3 (-0.1)	0 -> 0 (0)	0 -> 0 (0)	pass
default / tool-call-expand	40 -> 40 (0)	40 -> 40 (0)	0 -> 0 (0)	0 -> 0 (0)	16.8 -> 16.7 (-0.1)	16.8 -> 16.7 (-0.1)	0 -> 0 (0)	0 -> 0 (0)	pass
default / tool-default-open-heavy-bash	24 -> 16 (-8)	24 -> 16 (-8)	84 -> 102 (+18)	47 -> 68 (+21)	50 -> 50.1 (+0.1)	116.7 -> 133.3 (+16.6)	3 -> 4 (+1)	0.004 -> 0.004 (0)	pass
default / terminal-side-panel-open	48 -> 48 (0)	48 -> 48 (0)	0 -> 0 (0)	0 -> 0 (0)	33.3 -> 33.4 (+0.1)	33.4 -> 33.4 (0)	0 -> 0 (0)	0 -> 0 (0)	pass
default / session-scroll-reading	16 -> 16 (0)	32 -> 16 (-16)	0 -> 0 (0)	0 -> 0 (0)	16.8 -> 16.7 (-0.1)	16.8 -> 16.7 (-0.1)	0 -> 0 (0)	0 -> 0 (0)	pass

…afe-retry-timeouts # Conflicts: # packages/opencode/src/session/run-incident/policy.ts

## Summary - Added a `retry-decision` metadata layer for model execution retry decisions. - Kept technical retryability, safety-gate output, stream attempts, safe-recovery replay budget, timeout policy, and presentation as separate facts. - Lightly wired `session.processor` to consume the new decision metadata without moving the existing retry loop or changing #922 behavior. - Addressed review feedback by using `modelStreamAttempt` instead of provider-budget wording and by representing blocked-boundary reasoning attempts as `reasoning_global_protected`. ## Verification - `bun test test/session/retry-decision.test.ts` -> 6 passed. - `bun test test/session/retry.test.ts` -> 32 passed. - `bun test test/session/processor-effect.test.ts` -> 33 passed. - `bun run typecheck` in `packages/opencode` -> passed. - PR CI passed: typecheck, lint, unit-app, unit-ui, unit-opencode, unit-desktop, codeql, dependency-review, desktop-smoke, e2e-artifacts, commit/title/triage checks. ## Related Part of #925.

## Summary - Extract safe-recovery replay safety into RunIncident.evaluateReplaySafety so the gate owns only replay permission, recovery mode, and blocked reason. - Move side-effect boundary snapshot construction into RunObservability and keep processor retry scheduling unchanged. - Keep retry attempt metadata and presentation mapping in retry-decision, preserving #922 safe-retry behavior. ## Verification - bun test test/session/run-incident-safety-gate.test.ts test/session/retry-decision.test.ts: 10 passed - bun test test/session/run-observability-boundary.test.ts test/session/run-incident-safety-gate.test.ts test/session/retry-decision.test.ts: 11 passed - bun test test/session/run-observability.test.ts: 63 passed - bun test test/session/processor-effect.test.ts: 33 passed - bun run typecheck: ok - PR CI: green, including unit-opencode and smoke-macos-arm64 ## Review Follow-up - Addressed Gemini import/test coverage comments. - Addressed P3 layer-boundary feedback by keeping presentation and attempt kind outside the safety gate. Closes part of #925.

Summary:\n- Add a dedicated safe-recovery retry policy with separate one-replay budget metadata and lightweight retry presentation.\n- Route session.processor safe-recovery status/backoff scheduling through that policy instead of local retry status and sleep mechanics.\n- Preserve #922 reasoning timeout behavior, lifecycle-close checks, safe_retry_failed notice behavior, and visible-output replay guards.\n\nReview follow-up:\n- Removed the remaining processor-local sleep so the retry policy step owns the backoff delay.\n- Added policy-level coverage for the safe-recovery budget exhaustion branch and tightened it to assert schedule completion rather than generic failure.\n- Verified Cause.done exists in the current Effect runtime; the review concern was a false positive on API existence, but it exposed a useful test-strengthening gap.\n\nCI recovery note:\n- The initial empty retrigger commit did not create pull_request workflow runs. Two temporary marker commits were used to trigger the missing required checks; they leave no final PR diff and are intentionally collapsed by this squash merge.\n\nVerification:\n- TDD RED: safeRecoveryPolicy initially missing; focused test failed for that reason.\n- bun test test/session/retry.test.ts test/session/retry-decision.test.ts test/session/run-incident-safety-gate.test.ts: 44 passed.\n- bun test test/session/processor-effect.test.ts -t "reasoning connect watchdog is attempt-scoped for before-progress safe retry": passed.\n- bun test test/session/processor-effect.test.ts -t "reasoning-only retry writes a notice after the one safe retry is exhausted": passed.\n- bun test test/session/processor-effect.test.ts -t "connect timeout auto retry stops if lifecycle closes during backoff": passed.\n- bun test test/session/processor-effect.test.ts -t "retryable stream error after visible output does not replay the assistant message": passed before the final review follow-up.\n- bun run typecheck: passed.\n- git diff --check: passed.\n- PR CI required checks: passed after rerun.\n\nRelated:\n- Closes part of #925.

Astro-Han added 2 commits May 26, 2026 12:46

fix: scope reasoning safe retry timeouts by attempt

deb52cf

test: update safe retry snap copy

1768085

Astro-Han added bug Something isn't working P2 Medium priority ui Design system and user interface harness Model harness, prompts, tool descriptions, and session mechanics labels May 26, 2026

github-actions Bot added the app Application behavior and product flows label May 26, 2026

github-actions Bot reviewed May 26, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 26, 2026

View reviewed changes

Comment thread packages/ui/src/components/session-safe-retry-contract.test.ts

Comment thread packages/ui/src/components/session-safe-retry-contract.test.ts

Astro-Han added 4 commits May 26, 2026 13:00

test: cover zht safe retry copy

dcd84a8

fix: keep unsafe reasoning boundaries on global timeout

4278cdc

fix: align fast retry with local tool boundaries

15661b6

Merge remote-tracking branch 'origin/dev' into codex/i918-reasoning-s…

ba65081

…afe-retry-timeouts # Conflicts: # packages/opencode/src/session/run-incident/policy.ts

Astro-Han mentioned this pull request May 26, 2026

[Task] Consolidate model execution retry pipeline #925

Closed

refactor: share before-progress boundary policy

7b4c99c

Astro-Han merged commit 9d1b7a8 into dev May 26, 2026
27 checks passed

This was referenced May 26, 2026

[Feature] Recover model runs after tool activity on transport disconnect #927

Open

refactor(session): add unified retry decision metadata #928

Merged

Astro-Han mentioned this pull request May 26, 2026

refactor(session): route safe recovery through retry policy #931

Merged

13 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: scope reasoning safe retry timeouts by attempt#922

fix: scope reasoning safe retry timeouts by attempt#922
Astro-Han merged 7 commits into
devfrom
codex/i918-reasoning-safe-retry-timeouts

Astro-Han commented May 26, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 26, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 26, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Astro-Han commented May 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Related Issue

Human Review Status

Review Focus

Risk Notes

How To Verify

Screenshots or Recordings

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Perf delta summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Astro-Han commented May 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading

github-actions Bot commented May 26, 2026 •

edited

Loading