Commit 7c0fdae

docs(providers): document local model request timeout
1 parent e0956a0 commit 7c0fdae

3 files changed

Lines changed: 37 additions & 0 deletions

docs/concepts/model-providers.md

Lines changed: 2 additions & 0 deletions
@@ -625,6 +625,7 @@ Example (OpenAI‑compatible):
 baseUrl: "http://localhost:1234/v1",
 apiKey: "${LM_API_TOKEN}",
 api: "openai-completions",
+timeoutSeconds: 300,
 models: [
   {
     id: "my-local-model",
@@ -660,6 +661,7 @@ Example (OpenAI‑compatible):
 - Proxy-style OpenAI-compatible routes also skip native OpenAI-only request shaping: no `service_tier`, no Responses `store`, no Completions `store`, no prompt-cache hints, no OpenAI reasoning-compat payload shaping, and no hidden OpenClaw attribution headers.
 - For OpenAI-compatible Completions proxies that need vendor-specific fields, set `agents.defaults.models["provider/model"].params.extra_body` (or `extraBody`) to merge extra JSON into the outbound request body.
 - For vLLM chat-template controls, set `agents.defaults.models["provider/model"].params.chat_template_kwargs`. OpenClaw automatically sends `enable_thinking: false` and `force_nonempty_content: true` for `vllm/nemotron-3-*` when the session thinking level is off.
+- For slow local models or remote LAN/tailnet hosts, set `models.providers.<id>.timeoutSeconds`. This extends provider model HTTP request handling, including connect, headers, body streaming, and the total guarded-fetch abort, without increasing the whole agent runtime timeout.
 - If `baseUrl` is empty/omitted, OpenClaw keeps the default OpenAI behavior (which resolves to `api.openai.com`).
 - For safety, an explicit `compat.supportsDeveloperRole: true` is still overridden on non-native `openai-completions` endpoints.
 </Accordion>
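
As a rough illustration of the `timeoutSeconds` bullet in the hunk above, the provider-scoped timeout and the agent-wide timeout could sit in one config as sketched below. The provider id `my-local` and the commented-out value `600` are placeholders rather than values from this commit, and the nesting of `agents.defaults.timeoutSeconds` is inferred from its dotted path.

```json5
{
  models: {
    providers: {
      // "my-local" is a hypothetical provider id used only for this sketch.
      "my-local": {
        baseUrl: "http://localhost:1234/v1",
        apiKey: "${LM_API_TOKEN}",
        api: "openai-completions",
        // Provider-scoped: covers only model HTTP requests to this server
        // (connect, headers, body streaming, total guarded-fetch abort).
        timeoutSeconds: 300,
        models: [{ id: "my-local-model" }],
      },
    },
  },
  agents: {
    defaults: {
      // Whole agent run. Left commented out because the docs suggest raising
      // the provider timeout first; 600 is an illustrative value only.
      // timeoutSeconds: 600,
    },
  },
}
```

Keeping the longer wait on the provider entry means only requests to the slow model server are allowed extra time, while the overall agent run limit stays as configured.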

docs/gateway/local-models.md

Lines changed: 5 additions & 0 deletions
@@ -124,6 +124,7 @@ vLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style
 baseUrl: "http://127.0.0.1:8000/v1",
 apiKey: "sk-local",
 api: "openai-responses",
+timeoutSeconds: 300,
 models: [
   {
     id: "my-local-model",
@@ -142,6 +143,10 @@ vLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style
 ```

 Keep `models.mode: "merge"` so hosted models stay available as fallbacks.
+Use `models.providers.<id>.timeoutSeconds` for slow local or remote model
+servers before raising `agents.defaults.timeoutSeconds`. The provider timeout
+applies only to model HTTP requests, including connect, headers, body streaming,
+and the total guarded-fetch abort.

 Behavior note for local/proxied `/v1` backends:

docs/providers/vllm.md

Lines changed: 30 additions & 0 deletions
@@ -93,6 +93,7 @@ Use explicit config when:
 apiKey: "${VLLM_API_KEY}",
 api: "openai-completions",
 request: { allowPrivateNetwork: true },
+timeoutSeconds: 300, // Optional: extend connect/header/body/request timeout for slow local models
 models: [
   {
     id: "your-model-id",
@@ -179,6 +180,7 @@ Use explicit config when:
 apiKey: "${VLLM_API_KEY}",
 api: "openai-completions",
 request: { allowPrivateNetwork: true },
+timeoutSeconds: 300,
 models: [
   {
     id: "my-custom-model",
@@ -201,6 +203,34 @@ Use explicit config when:
 ## Troubleshooting

 <AccordionGroup>
+<Accordion title="Slow first response or remote server timeout">
+For large local models, remote LAN hosts, or tailnet links, set a
+provider-scoped request timeout:
+
+```json5
+{
+  models: {
+    providers: {
+      vllm: {
+        baseUrl: "http://192.168.1.50:8000/v1",
+        apiKey: "${VLLM_API_KEY}",
+        api: "openai-completions",
+        request: { allowPrivateNetwork: true },
+        timeoutSeconds: 300,
+        models: [{ id: "your-model-id", name: "Local vLLM Model" }],
+      },
+    },
+  },
+}
+```
+
+`timeoutSeconds` applies to vLLM model HTTP requests only, including
+connection setup, response headers, body streaming, and the total
+guarded-fetch abort. Prefer this before increasing
+`agents.defaults.timeoutSeconds`, which controls the whole agent run.
+
+</Accordion>
+
 <Accordion title="Server not reachable">
 Check that the vLLM server is running and accessible:
