fix: widen LLM connect timeout for reasoning models#758
Conversation
The 30s first-progress watchdog in session/llm.ts aborts reasoning-model streams whose first observable provider event arrives later than the ceiling. This is reproducible with OpenAI gpt-5.5 on long sessions and was missed by #729 (which only fixed the timer-start moment). Inject a 120s connect timeout via a new ProviderTransform.streamTimeouts helper, gated on model.capabilities.reasoning. Apply it at the two production llm.stream() call sites (processor main response + prompt title generation) with helper-first spread order so any caller-provided StreamInput.connectTimeoutMs still wins. Three contract tests in transform.test.ts: - policy floor: helper output exceeds CONNECT_STREAM_TIMEOUT_MS - routing: reasoning emits override, non-reasoning emits empty - caller override precedence: explicit StreamInput value wins Out of scope, tracked separately on the issue: - SessionRetry.policy.retryable() does not classify local timeouts - watchdog architecture rewrite (typed errors, wall-clock budget) - mid-stream "terminated" errors (separate incident, separate PR) Refs #755
Crosscheck flagged the original >30s assertion as too loose — a regression that dropped the helper value to 31s would still pass. Add a >=90_000 lower bound; 90s is the lowest ceiling considered for reasoning models in #755 discussion, so this floor codifies the policy direction without pinning the chosen 120s constant. Refs #755
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR adds a provider-aware LLM stream connection timeout transformer. A new ChangesStream Connect Timeout Wiring
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related issues
Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a specialized connection timeout for reasoning models. It adds a streamTimeouts utility in ProviderTransform that sets a 120-second timeout when reasoning capabilities are detected. This utility is integrated into the LLM streaming logic in both the session processor and prompt generation. Comprehensive tests were added to verify the timeout logic and ensure that manual overrides are respected. I have no feedback to provide.
GPT Pro pre-merge review noted the helper-spread convention is only enforceable by code reading today. Add JSDoc so future readers see the "spread at every call site" expectation at the helper definition. Not a test addition — three internal/external reviewers agreed adding a heavy integration test for a 2-call-site contract is overkill. Refs #755
Summary
Add
ProviderTransform.streamTimeouts(model)returning{ connectTimeoutMs: 120_000 }for reasoning-capable models (gated onmodel.capabilities.reasoning), and apply it at the two productionllm.stream()call sites —session/processor.ts(main response) andsession/prompt.ts(title generation) — with helper-first spread order so any caller-providedStreamInput.connectTimeoutMsstill wins. The defaultCONNECT_STREAM_TIMEOUT_MSis untouched; non-reasoning models keep the 30s ceiling.Why
The 30-second first-progress watchdog in
session/llm.tsaborts reasoning-model streams whose first observable provider event arrives later than the ceiling. The incident recorded in #755 shows OpenAIgpt-5.5on a long session spending 30204ms with onlystart: 1and zero content events before the watchdog aborted. PR #729 fixed an earlier residual where the connect timer armed before the HTTP request was actually sent; the residual addressed here is the 30s ceiling itself, which #729's body explicitly deferred. The capability gate keys offmodel.capabilities.reasoningafter verifying the current models.json catalog has correctreasoning=truelabels on the gpt-5 family (22/23, onlygpt-5.3-chat-latestexcluded), all o-series, Claude haiku/sonnet/opus 4.x thinking variants, and the Gemini 2.5+/3.1 series — so no provider-id allowlist fallback is needed at this time.Related Issue
Refs #755 (short-term path; the full deferred set —
SessionRetry.policy.retryable()not classifying local timeouts, watchdog architectural rewrite, and the parallel mid-streamterminatedfailure mode — is documented in the issue body and left to follow-up PRs).Human Review Status
Pending
Review Focus
session/processor.ts:954andsession/prompt.ts:465.processor.tsreceivesstreamInputfromprocess()callers (session/prompt.ts:2020,session/compaction.ts:480); neither setsconnectTimeoutMstoday, so the spread defaults to the helper value. The order is helper-first specifically so that a future caller wishing to override does not need code changes here.model.capabilities.reasoning === trueis the right axis is welcome.>= 90_000intransform.test.tscodifies the policy direction (the lowest value considered for reasoning models during the issue discussion). Worth questioning whether a corresponding upper bound is also needed.mdlfromprovider.getSmallModel(input.providerID)or the title agent's explicit override. The typical small model is non-reasoning so the helper returns{}; only when a user explicitly configures a reasoning small variant for the title agent does the 120s apply, and the failure mode is silent (caught atprompt.ts:482-488, logs and falls back to the default title). The deliberate choice was to avoid a per-mode branch inside the helper.Risk Notes
UnknownErrorbecauseSessionRetry.policy.retryable()does not type-tag local timeouts and does not match the bare error message. The user-experience improvement here is the lower hit rate; no automatic retry is added in this PR. Retry classification is deferred until the parallel mid-streamterminatedfailure mode is analyzed, so retry can be designed once against both failure shapes rather than speculatively per-shape.connectTimeoutMs: undefinedfrom a caller would clobber the helper value during spread and fall through to the 30s default viallm.ts:434-439. No production path constructsstreamInputthis way today, but a future caller with conditional override should pass a positive number or omit the field rather than set it toundefined.How To Verify
Screenshots or Recordings
Not applicable (no UI change).
Checklist
bug,enhancement,task,documentation. Type labels are author-added; the labeler bot does NOT assign them. Add the label in the GitHub UI, then tick this.app,ui,platform,harness,ci. The labeler bot assigns these on PR open based on changed paths. Confirm the bot's choice (or override if wrong), then tick this.P0,P1,P2,P3. The priority-triage bot suggests one on PR open. Confirm or override, then tick this.Pending,Approved by @<reviewer>, orNot required: <reason>(default isPending; "not required" is restricted to bot-authored low-risk PRs).dev, and my PR title and commit messages use Conventional Commits in English.Summary by CodeRabbit
New Features
Tests