Commit bad0aa7

fix(llm-idle-timeout): honor models.providers.<id>.timeoutSeconds for cloud providers
The schema.help text for `models.providers.*.timeoutSeconds` documents the key as the user-facing knob for "slow local or self-hosted model servers". In practice the option is also the only configurable lever for the LLM idle/first-token watchdog. However `resolveLlmIdleTimeoutMs` was still running the explicit provider timeout through `clampImplicitTimeoutMs`, clamping it back down to the implicit ~120s `DEFAULT_LLM_IDLE_TIMEOUT_MS` ceiling for any non-cron, non-local provider.

Consequence (matches #77744 and #78361):
- User sets `models.providers.llamacpp.timeoutSeconds: 14400` (or 600 for a slow Gemini/Opus turn with a large tool payload).
- Hot reload accepts the value, runtime resolves `modelRequestTimeoutMs = 14_400_000`.
- Idle watchdog still trips at ~120s with "LLM idle timeout (120s): no response from model", aborting an otherwise-healthy upstream that is mid-prefill or buffering thinking tokens.

Fix: when the caller passes an explicit `modelRequestTimeoutMs` (sourced from `models.providers.<id>.timeoutSeconds` / `model.requestTimeoutMs`), treat it as a deliberate ceiling for cloud providers too. The run-timeout / agent-timeout bounds still apply via `timeoutBounds`, so a shorter explicit run timeout always wins. The implicit default watchdog still kicks in when the user has not set a provider timeout, preserving the network-silence-as-hang guard for default configs.

Updated the two corresponding test cases that asserted the old clamp-on-cloud behavior; all 71 tests in `llm-idle-timeout.test.ts` and the wider 430-test `src/agents/pi-embedded-runner/run/` lane pass. Schema help text refreshed to call out that the same knob raises the idle watchdog ceiling.

Refs: #77744, #78361
1 parent 6fcfeed commit bad0aa7

3 files changed

Lines changed: 18 additions & 10 deletions

src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts

Lines changed: 6 additions & 5 deletions
@@ -43,13 +43,14 @@ describe("resolveLlmIdleTimeoutMs", () => {
     expect(resolveLlmIdleTimeoutMs({ runTimeoutMs: 2_147_000_000 })).toBe(0);
   });

-  it("caps remote provider request timeouts at the default idle watchdog", () => {
-    expect(resolveLlmIdleTimeoutMs({ modelRequestTimeoutMs: 300_000 })).toBe(
-      DEFAULT_LLM_IDLE_TIMEOUT_MS,
-    );
+  it("honors an explicit models.providers.<id>.timeoutSeconds for cloud providers (#77744, #78361)", () => {
+    // models.providers.<id>.timeoutSeconds is documented as the user-facing
+    // knob to extend slow model responses. The idle watchdog must respect it
+    // instead of clamping back to DEFAULT_LLM_IDLE_TIMEOUT_MS.
+    expect(resolveLlmIdleTimeoutMs({ modelRequestTimeoutMs: 300_000 })).toBe(300_000);
   });

-  it("uses remote provider request timeouts when shorter than the default idle watchdog", () => {
+  it("honors short explicit provider request timeouts", () => {
     expect(resolveLlmIdleTimeoutMs({ modelRequestTimeoutMs: 30_000 })).toBe(30_000);
   });

src/agents/pi-embedded-runner/run/llm-idle-timeout.ts

Lines changed: 11 additions & 4 deletions
@@ -154,11 +154,18 @@ export function resolveLlmIdleTimeoutMs(params?: {
     Number.isFinite(modelRequestTimeoutMs) &&
     modelRequestTimeoutMs > 0
   ) {
+    // `modelRequestTimeoutMs` is wired from `models.providers.<id>.timeoutSeconds`,
+    // which is an explicit per-provider opt-in. The schema help describes it as
+    // "Use this for slow local or self-hosted model servers instead of changing
+    // global agent timeouts." so we honor it as a deliberate ceiling rather
+    // than clamping it back down to the implicit `DEFAULT_LLM_IDLE_TIMEOUT_MS`
+    // network-silence-as-hang guard. Without this, users hitting #77744 /
+    // #78361 set provider timeoutSeconds to e.g. 600s, observe the value is
+    // accepted and hot-reloaded, yet the idle watchdog still aborts at 120s.
+    // The agent/run timeoutBounds still apply so an explicit shorter run
+    // timeout always wins.
     const boundedTimeoutMs = Math.min(modelRequestTimeoutMs, ...timeoutBounds);
-    if (params?.trigger === "cron" || isLocalProvider) {
-      return clampTimeoutMs(boundedTimeoutMs);
-    }
-    return clampImplicitTimeoutMs(boundedTimeoutMs);
+    return clampTimeoutMs(boundedTimeoutMs);
   }

   if (typeof runTimeoutMs === "number" && Number.isFinite(runTimeoutMs) && runTimeoutMs > 0) {
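
To make the behavioral change in this hunk concrete, here is a minimal TypeScript sketch of the explicit-provider-timeout branch before and after the commit. It is an illustration under stated assumptions, not the repository's implementation: `resolveBefore`, `resolveAfter`, `boundsFor`, `SketchParams`, and `isPositiveFinite` are hypothetical names; `DEFAULT_LLM_IDLE_TIMEOUT_MS` is assumed to be exactly 120 000 ms (the commit only says "~120s"); `timeoutBounds` is reduced to a single optional run timeout; the real `clampTimeoutMs` and `clampImplicitTimeoutMs` helpers are approximated with plain `Math.min`; and the fallback path simply returns the default instead of modeling the run-timeout branch that follows in the real function.

```typescript
// Assumed value: the commit message only describes the default as "~120s".
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 120_000;

// Hypothetical, simplified parameter shape for this sketch.
interface SketchParams {
  modelRequestTimeoutMs?: number; // derived from models.providers.<id>.timeoutSeconds
  runTimeoutMs?: number; // explicit run/agent bound, if any
  trigger?: string;
  isLocalProvider?: boolean;
}

function isPositiveFinite(value: number | undefined): value is number {
  return typeof value === "number" && Number.isFinite(value) && value > 0;
}

// Stand-in for `timeoutBounds`: only an explicit run timeout is modeled here.
function boundsFor(params: SketchParams): number[] {
  return isPositiveFinite(params.runTimeoutMs) ? [params.runTimeoutMs] : [];
}

// Old behavior: non-cron, non-local providers were capped at the implicit default.
function resolveBefore(params: SketchParams): number {
  const requested = params.modelRequestTimeoutMs;
  if (!isPositiveFinite(requested)) return DEFAULT_LLM_IDLE_TIMEOUT_MS;
  const bounded = Math.min(requested, ...boundsFor(params));
  if (params.trigger === "cron" || params.isLocalProvider) return bounded;
  return Math.min(bounded, DEFAULT_LLM_IDLE_TIMEOUT_MS);
}

// New behavior: an explicit provider timeout is honored for every provider,
// still limited by the (simplified) run-timeout bound.
function resolveAfter(params: SketchParams): number {
  const requested = params.modelRequestTimeoutMs;
  if (!isPositiveFinite(requested)) return DEFAULT_LLM_IDLE_TIMEOUT_MS;
  return Math.min(requested, ...boundsFor(params));
}
```

Under these assumptions, `resolveBefore({ modelRequestTimeoutMs: 300_000 })` returns 120 000 while `resolveAfter` returns 300 000, mirroring the updated test case. Adding `runTimeoutMs: 60_000` makes both return 60 000, which is the "shorter explicit run timeout always wins" guarantee from the commit message.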

src/config/schema.help.ts

Lines changed: 1 addition & 1 deletion
@@ -948,7 +948,7 @@ export const FIELD_HELP: Record<string, string> = {
   "models.providers.*.maxTokens":
     "Default maximum output token budget applied to models under this provider when a model entry does not set maxTokens.",
   "models.providers.*.timeoutSeconds":
-    "Optional per-provider model request timeout in seconds. Applies to provider HTTP fetches, including connect, headers, body, and total request abort handling. Use this for slow local or self-hosted model servers instead of changing global agent timeouts.",
+    "Optional per-provider model request timeout in seconds. Applies to provider HTTP fetches, including connect, headers, body, and total request abort handling, and also raises the LLM idle/stream watchdog ceiling for this provider above the implicit ~120s default. Use this for slow local or self-hosted model servers, or for cloud providers that buffer reasoning tokens silently on the wire (Gemini preview, large-tool-payload Claude/Opus), instead of changing global agent timeouts.",
   "models.providers.*.injectNumCtxForOpenAICompat":
     "Controls whether OpenClaw injects `options.num_ctx` for Ollama providers configured with the OpenAI-compatible adapter (`openai-completions`). Default is true. Set false only if your proxy/upstream rejects unknown `options` payload fields.",
   "models.providers.*.params":
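
As a rough illustration of how the documented key could feed the watchdog, the sketch below converts a per-provider `timeoutSeconds` into the `modelRequestTimeoutMs` value mentioned in the commit message (14400 s becoming 14_400_000 ms). The config path follows `models.providers.<id>.timeoutSeconds` from the help text; the `ProviderConfig` and `SketchConfig` types, the `providerTimeoutMs` helper, and the `gemini` provider id are hypothetical, and the actual lookup in the codebase may differ.

```typescript
// Hypothetical minimal config shape: only the key discussed in this commit.
interface ProviderConfig {
  timeoutSeconds?: number;
}

interface SketchConfig {
  models?: { providers?: Record<string, ProviderConfig> };
}

// Hypothetical helper: returns the provider timeout in milliseconds, or
// undefined when the key is unset or invalid, in which case the implicit
// default watchdog would apply.
function providerTimeoutMs(config: SketchConfig, providerId: string): number | undefined {
  const seconds = config.models?.providers?.[providerId]?.timeoutSeconds;
  if (typeof seconds !== "number" || !Number.isFinite(seconds) || seconds <= 0) {
    return undefined;
  }
  return seconds * 1000;
}

// Example values taken from the commit message (14400 s and 600 s).
const config: SketchConfig = {
  models: {
    providers: {
      llamacpp: { timeoutSeconds: 14400 },
      gemini: { timeoutSeconds: 600 }, // assumed provider id, for illustration only
    },
  },
};

const llamacppTimeoutMs = providerTimeoutMs(config, "llamacpp");
const geminiTimeoutMs = providerTimeoutMs(config, "gemini");
```

Returning `undefined` for an unset key keeps the distinction the commit relies on: only an explicit provider timeout lifts the idle watchdog above the implicit default.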
