fix: widen LLM connect timeout for reasoning models by Astro-Han · Pull Request #758 · Astro-Han/pawwork

Astro-Han · 2026-05-19T08:10:10Z

Summary

Add ProviderTransform.streamTimeouts(model) returning { connectTimeoutMs: 120_000 } for reasoning-capable models (gated on model.capabilities.reasoning), and apply it at the two production llm.stream() call sites — session/processor.ts (main response) and session/prompt.ts (title generation) — with helper-first spread order so any caller-provided StreamInput.connectTimeoutMs still wins. The default CONNECT_STREAM_TIMEOUT_MS is untouched; non-reasoning models keep the 30s ceiling.
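
As a rough illustration of the helper described above, the sketch below assumes a simplified `Model` shape and a hypothetical constant name `REASONING_CONNECT_TIMEOUT_MS`; the real types and the actual body in provider/transform.ts are not shown on this page and may differ.

```typescript
// Hypothetical minimal model shape; the repository's Model type is richer.
interface Model {
  id: string
  capabilities: { reasoning: boolean }
}

interface StreamTimeouts {
  connectTimeoutMs?: number
}

// 120s comes from the PR summary; the constant name is an assumption.
const REASONING_CONNECT_TIMEOUT_MS = 120_000

function streamTimeouts(model: Model): StreamTimeouts {
  // Reasoning-capable models get the wider ceiling; all others get no override,
  // so the untouched 30s default keeps applying to them.
  return model.capabilities.reasoning ? { connectTimeoutMs: REASONING_CONNECT_TIMEOUT_MS } : {}
}

// Illustrative model ids, not catalog entries.
const reasoningModel: Model = { id: "example-reasoning", capabilities: { reasoning: true } }
const plainModel: Model = { id: "example-plain", capabilities: { reasoning: false } }

const reasoningTimeouts = streamTimeouts(reasoningModel)
const plainTimeouts = streamTimeouts(plainModel)
```

Returning an empty object for non-reasoning models, rather than the 30s default, means the helper never restates the default and cannot drift from it.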

Why

The 30-second first-progress watchdog in session/llm.ts aborts reasoning-model streams whose first observable provider event arrives later than the ceiling. The incident recorded in #755 shows OpenAI gpt-5.5 on a long session spending 30204ms with only start: 1 and zero content events before the watchdog aborted. PR #729 fixed an earlier residual where the connect timer armed before the HTTP request was actually sent; the residual addressed here is the 30s ceiling itself, which #729's body explicitly deferred. The capability gate keys off model.capabilities.reasoning after verifying the current models.json catalog has correct reasoning=true labels on the gpt-5 family (22/23, only gpt-5.3-chat-latest excluded), all o-series, Claude haiku/sonnet/opus 4.x thinking variants, and the Gemini 2.5+/3.1 series — so no provider-id allowlist fallback is needed at this time.
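
To make the incident arithmetic concrete, here is a deliberately simplified model of the abort decision. The real watchdog in session/llm.ts is timer-based and is not reproduced here; the function name `firstProgressTimedOut` and its parameters are hypothetical.

```typescript
// Default ceiling described in the PR (30 seconds).
const DEFAULT_CONNECT_TIMEOUT_MS = 30_000
// Widened ceiling for reasoning-capable models (120 seconds).
const REASONING_CONNECT_TIMEOUT_MS = 120_000

// Simplified decision: the stream is aborted when no content event has been
// observed by the time the elapsed time reaches the ceiling.
function firstProgressTimedOut(elapsedMs: number, contentEvents: number, ceilingMs: number): boolean {
  return contentEvents === 0 && elapsedMs >= ceilingMs
}

// Figures from the incident in #755: 30204ms elapsed, zero content events.
const abortedUnderDefault = firstProgressTimedOut(30_204, 0, DEFAULT_CONNECT_TIMEOUT_MS)
const abortedUnderWidened = firstProgressTimedOut(30_204, 0, REASONING_CONNECT_TIMEOUT_MS)
```

Under the 30s ceiling the recorded stream is aborted; under the 120s ceiling the same stream would still be waiting for its first event.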

Related Issue

Refs #755 (short-term path; the full deferred set — SessionRetry.policy.retryable() not classifying local timeouts, watchdog architectural rewrite, and the parallel mid-stream terminated failure mode — is documented in the issue body and left to follow-up PRs).

Human Review Status

Pending

Review Focus

- Helper-first spread order at session/processor.ts:954 and session/prompt.ts:465. processor.ts receives streamInput from process() callers (session/prompt.ts:2020, session/compaction.ts:480); neither sets connectTimeoutMs today, so the spread defaults to the helper value. The order is helper-first specifically so that a future caller wishing to override does not need code changes here.
- Capability-based gate vs model-id-pattern matching. Catalog cross-checked at PR-prep time and labels are reliable across the four target families; a second pair of eyes on whether model.capabilities.reasoning === true is the right axis is welcome.
- The new contract test floor at >= 90_000 in transform.test.ts codifies the policy direction (the lowest value considered for reasoning models during the issue discussion). Worth questioning whether a corresponding upper bound is also needed.
- Title-generation path applies the helper to mdl from provider.getSmallModel(input.providerID) or the title agent's explicit override. The typical small model is non-reasoning so the helper returns {}; only when a user explicitly configures a reasoning small variant for the title agent does the 120s apply, and the failure mode is silent (caught at prompt.ts:482-488, logs and falls back to the default title). The deliberate choice was to avoid a per-mode branch inside the helper.
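
The helper-first spread order from the first review item can be sketched as follows. `StreamInput` is reduced to two fields here, and `buildStreamOptions` and the boolean-argument `streamTimeouts` stand-in are hypothetical names used only for illustration; the actual call sites in processor.ts and prompt.ts are not shown on this page.

```typescript
// Reduced stand-in for the real StreamInput type.
interface StreamInput {
  prompt: string
  connectTimeoutMs?: number
}

// Stand-in for ProviderTransform.streamTimeouts, keyed on a boolean for brevity.
function streamTimeouts(reasoning: boolean): { connectTimeoutMs?: number } {
  return reasoning ? { connectTimeoutMs: 120_000 } : {}
}

function buildStreamOptions(reasoning: boolean, streamInput: StreamInput): StreamInput {
  // Helper first, caller second: in an object spread, later keys overwrite
  // earlier ones, so a caller-provided connectTimeoutMs wins.
  return { ...streamTimeouts(reasoning), ...streamInput }
}

// No caller value: the helper's 120s applies.
const defaulted = buildStreamOptions(true, { prompt: "hi" })
// Caller value present: it takes precedence over the helper.
const overridden = buildStreamOptions(true, { prompt: "hi", connectTimeoutMs: 45_000 })
// Non-reasoning model: no override is injected at all.
const nonReasoning = buildStreamOptions(false, { prompt: "hi" })
```

With the opposite order (caller first, helper second) the helper would silently overwrite any caller override for reasoning models, which is the behavior the PR is avoiding.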

Risk Notes

- After the 120s ceiling is reached, a first-progress timeout still surfaces as a hard UnknownError because SessionRetry.policy.retryable() does not type-tag local timeouts and does not match the bare error message. The user-experience improvement here is the lower hit rate; no automatic retry is added in this PR. Retry classification is deferred until the parallel mid-stream terminated failure mode is analyzed, so retry can be designed once against both failure shapes rather than speculatively per-shape.
- Explicit connectTimeoutMs: undefined from a caller would clobber the helper value during spread and fall through to the 30s default via llm.ts:434-439. No production path constructs streamInput this way today, but a future caller with conditional override should pass a positive number or omit the field rather than set it to undefined.
- Behavior change is strictly scoped to reasoning-capable models. Non-reasoning models keep the 30s default.
- No visible UI or copy changed; the visible-UI conditional checkbox is left unticked for that reason.
- No platform / packaging / updater / signing / paths / shell / permissions surface was touched; the macOS/Windows conditional checkbox is left unticked for that reason.
- No docs / release notes / dependencies / permissions / credentials / deletion behavior / generated content / local file changes; the related conditional checkbox is left unticked for that reason.
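
The `connectTimeoutMs: undefined` hazard in the second risk note follows from how object spread treats explicit `undefined` values. The sketch below assumes the fallback behaves like a nullish default; the actual logic at llm.ts:434-439 is not shown on this page, and `effectiveTimeout` and `callerInput` are hypothetical names.

```typescript
interface StreamInput {
  connectTimeoutMs?: number
}

const DEFAULT_CONNECT_TIMEOUT_MS = 30_000
const helperValue: StreamInput = { connectTimeoutMs: 120_000 }

// Assumed fallback: anything nullish resolves to the 30s default.
function effectiveTimeout(input: StreamInput): number {
  return input.connectTimeoutMs ?? DEFAULT_CONNECT_TIMEOUT_MS
}

// An explicit undefined is still an own property, so it overwrites the helper value.
const clobbered = effectiveTimeout({ ...helperValue, ...{ connectTimeoutMs: undefined } })

// Safer caller pattern: omit the field entirely when there is no override.
function callerInput(override: number | undefined): StreamInput {
  return override !== undefined ? { connectTimeoutMs: override } : {}
}

const preserved = effectiveTimeout({ ...helperValue, ...callerInput(undefined) })
const explicit = effectiveTimeout({ ...helperValue, ...callerInput(60_000) })
```

Omitting the key keeps the helper's 120s in place, while passing the key with `undefined` drops a reasoning model back to 30s without any error.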

How To Verify

typecheck (bun --cwd packages/opencode run typecheck):                       ok
bun test packages/opencode/test/provider/transform.test.ts (streamTimeouts):  3 pass / 0 fail
bun test packages/opencode/test/session/ packages/opencode/test/provider/:    940 pass / 4 skip / 1 todo / 0 fail
internal cross-review (Claude Opus + Codex high, parallel):                   0 P0 / 0 P1 / 1 P3 (both reviewers flagged the policy floor as too loose; fixed by tightening the test to >= 90_000)

Screenshots or Recordings

Not applicable (no UI change).

Checklist

Summary by CodeRabbit

New Features
- Added intelligent timeout management for AI model streaming with extended timeouts for reasoning-capable models, improving stability during complex inference tasks.
- Enhanced title generation streams with optimized timeout configuration for better performance on reasoning models.
Tests
- Added test coverage for new timeout management functionality, validating behavior across different model types.

The 30s first-progress watchdog in session/llm.ts aborts reasoning-model streams whose first observable provider event arrives later than the ceiling. This is reproducible with OpenAI gpt-5.5 on long sessions and was missed by #729 (which only fixed the timer-start moment).

Inject a 120s connect timeout via a new ProviderTransform.streamTimeouts helper, gated on model.capabilities.reasoning. Apply it at the two production llm.stream() call sites (processor main response + prompt title generation) with helper-first spread order so any caller-provided StreamInput.connectTimeoutMs still wins.

Three contract tests in transform.test.ts:
- policy floor: helper output exceeds CONNECT_STREAM_TIMEOUT_MS
- routing: reasoning emits override, non-reasoning emits empty
- caller override precedence: explicit StreamInput value wins

Out of scope, tracked separately on the issue:
- SessionRetry.policy.retryable() does not classify local timeouts
- watchdog architecture rewrite (typed errors, wall-clock budget)
- mid-stream "terminated" errors (separate incident, separate PR)

Refs #755

Crosscheck flagged the original >30s assertion as too loose: a regression that dropped the helper value to 31s would still pass. Add a >=90_000 lower bound; 90s is the lowest ceiling considered for reasoning models in #755 discussion, so this floor codifies the policy direction without pinning the chosen 120s constant.

Refs #755
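
The difference between the two assertions can be shown with plain predicates. This is only an illustration of the reasoning in the commit message, not the test code in transform.test.ts; the function names and `POLICY_FLOOR_MS` are hypothetical.

```typescript
// Default ceiling named in the PR.
const CONNECT_STREAM_TIMEOUT_MS = 30_000
// Lowest ceiling considered for reasoning models in the #755 discussion.
const POLICY_FLOOR_MS = 90_000

// Original assertion: the helper value only has to exceed the default.
const passesLooseCheck = (valueMs: number): boolean => valueMs > CONNECT_STREAM_TIMEOUT_MS

// Tightened assertion: the helper value has to reach the policy floor.
const passesPolicyFloor = (valueMs: number): boolean => valueMs >= POLICY_FLOOR_MS

const regressedMs = 31_000 // a hypothetical regression
const chosenMs = 120_000 // the value shipped in this PR

const looseMissesRegression = passesLooseCheck(regressedMs)
const floorCatchesRegression = !passesPolicyFloor(regressedMs)
const floorAcceptsChosen = passesPolicyFloor(chosenMs)
```

Because the floor is an inequality rather than an equality on 120_000, the constant can later be raised, or lowered as far as 90s, without touching the test.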

coderabbitai · 2026-05-19T08:10:24Z

Walkthrough

This PR adds a provider-aware LLM stream connection timeout transformer. A new streamTimeouts function in provider/transform.ts returns a 120-second timeout for reasoning-capable models and is injected into the main processor and title-generation LLM streams; comprehensive tests validate reasoning/non-reasoning behavior and caller override precedence.

Changes

Stream Connect Timeout Wiring

Layer / File(s)	Summary
Stream timeout transformer definition `packages/opencode/src/provider/transform.ts`	New `streamTimeouts(model)` function returns `{ connectTimeoutMs: 120000 }` when `model.capabilities.reasoning` is enabled, otherwise `{}`. Exported via `ProviderTransform` namespace.
Processor and title generation stream integration `packages/opencode/src/session/processor.ts`, `packages/opencode/src/session/prompt.ts`	`ProviderTransform.streamTimeouts(model)` is spread into `llm.stream()` options in both the main message stream and title-generation stream, injecting reasoning-aware timeout values into LLM calls.
Stream timeout transformer tests `packages/opencode/test/provider/transform.test.ts`	New test block validates: reasoning models produce `connectTimeoutMs` ≥ 90,000 ms, non-reasoning models emit no override (the helper returns `{}`, so `connectTimeoutMs` is `undefined`), and explicit caller-provided `connectTimeoutMs` overrides the computed value.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

[Bug] 30s LLM connect timeout aborts OpenAI reasoning streams (post-#729 residual) #755: Both changes address LLM stream connection timeout behavior; this PR implements the ProviderTransform.streamTimeouts injection that directly overrides the 30-second connect timeout ceiling for reasoning models.

Possibly related PRs

Astro-Han/pawwork#729: Both PRs manage LLM stream connect-timeout timing—this PR injects connectTimeoutMs via ProviderTransform.streamTimeouts, while the retrieved PR defers timeout arming in session/llm.ts to measure the window correctly.
Astro-Han/pawwork#558: Both PRs wire LLM stream connectTimeoutMs behavior—this PR introduces the transformer and injection, while the retrieved PR implements the timeout enforcement and failure handling in session/llm.ts.

Suggested labels

bug, P2

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change—widening the LLM connect timeout for reasoning models—matching the core changeset across all four modified files.
Description check	✅ Passed	The description is comprehensive and fully populated across all required template sections: Summary, Why, Related Issue, Human Review Status, Review Focus, Risk Notes, How To Verify, and a completed checklist with all applicable items checked.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions

Suggested priority: P2 (includes non-doc, non-test paths outside the low-risk bucket).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

gemini-code-assist

Code Review

This pull request introduces a specialized connection timeout for reasoning models. It adds a streamTimeouts utility in ProviderTransform that sets a 120-second timeout when reasoning capabilities are detected. This utility is integrated into the LLM streaming logic in both the session processor and prompt generation. Comprehensive tests were added to verify the timeout logic and ensure that manual overrides are respected. I have no feedback to provide.

GPT Pro pre-merge review noted the helper-spread convention is only enforceable by code reading today. Add JSDoc so future readers see the "spread at every call site" expectation at the helper definition. Not a test addition — three internal/external reviewers agreed adding a heavy integration test for a 2-call-site contract is overkill. Refs #755

Astro-Han added 2 commits May 19, 2026 15:59

github-actions Bot added the harness (Model harness, prompts, tool descriptions, and session mechanics) and P2 (Medium priority) labels May 19, 2026

github-actions Bot reviewed May 19, 2026

Astro-Han added the bug (Something isn't working) label May 19, 2026

gemini-code-assist Bot reviewed May 19, 2026

Astro-Han merged commit 461b025 into dev May 19, 2026
25 checks passed

Astro-Han deleted the claude/i755-reasoning-connect-timeout branch May 19, 2026 08:46

This was referenced May 23, 2026

[Bug] 30s LLM connect timeout aborts OpenAI reasoning streams (post-#729 residual) #755

Closed

[Task] Track harness improvement series #195

Closed

[Feature] Recover faster from stalled reasoning-model connections before safe retry #918

Closed

coderabbitai Bot mentioned this pull request May 26, 2026

fix: scope reasoning safe retry timeouts by attempt #922

Merged

13 tasks

Astro-Han merged 3 commits into dev from claude/i755-reasoning-connect-timeout
