Bug type
Regression (worked before, now fails)
Beta release blocker
No
Summary
After updating to OpenClaw 2026.5.3-1, long local llama.cpp requests time out even though the backend is still actively processing the prompt.
The old agents.defaults.llm.idleTimeoutSeconds config is rejected as an unrecognized key. The suggested models.providers.<id>.timeoutSeconds config is accepted and hot reloaded, but it does not prevent the chat from being cut off during long local model prefill.
This looks like the idle watchdog is still using another timeout path that is not configurable through the current schema.
Steps to reproduce
- Install or update OpenClaw to 2026.5.3-1.
- Run OpenClaw in Docker.
- Configure a local llama.cpp / OpenAI-compatible provider.
- Use a large-context local model, for example qwen3.6-35b.
- Configure the agent primary model as llamacpp/qwen3.6-35b.
- Send a very large prompt, around 150k tokens, through webchat.
- Observe that llama.cpp keeps processing the prompt, but OpenClaw cancels before the model produces a reply.
- Try adding agents.defaults.llm.idleTimeoutSeconds.
- Observe that OpenClaw rejects the config with: Unrecognized key: "llm".
- Try using models.providers.llamacpp.timeoutSeconds.
- Observe the config is accepted and hot reloaded, but the chat still times out.
Expected behavior
OpenClaw should wait for the local provider according to the configured timeout, especially when the provider is still processing a long prefill and has not failed.
A long local llama.cpp prefill should not be treated as a failed model request merely because no output token has been produced yet.
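To make the distinction concrete, here is a minimal Python sketch of the two timeout policies this report contrasts: an idle watchdog (maximum gap between streamed events) and an overall request deadline. This is not OpenClaw's implementation. The function `classify_request`, its parameters, and the 120-second idle value are hypothetical and chosen only for illustration; the 14400-second deadline matches the timeoutSeconds value in the config below.

```python
from typing import Sequence


def classify_request(
    event_times: Sequence[float],
    idle_timeout: float,
    total_timeout: float,
) -> str:
    """Classify a request from the timestamps of its streamed events.

    event_times: seconds since request start for each event received,
    with the last entry being the completion event.
    The idle watchdog fires when the gap since the previous event (or since
    request start) exceeds idle_timeout. The overall deadline fires when the
    request runs past total_timeout. Whichever would fire first wins.
    """
    last = 0.0  # request start counts as the first "activity"
    for t in event_times:
        idle_deadline = last + idle_timeout
        if t > idle_deadline or t > total_timeout:
            # Both limits may be exceeded; report the one reached first.
            return "idle_timeout" if idle_deadline <= total_timeout else "total_timeout"
        last = t
    return "completed"


# Hypothetical timeline: 30 minutes of silent prefill, then tokens arrive.
timeline = [1800.0, 1801.0, 1802.0]

# A short idle watchdog cancels during prefill despite a 4-hour deadline.
print(classify_request(timeline, idle_timeout=120.0, total_timeout=14400.0))

# With an idle limit longer than the prefill, the same request completes.
print(classify_request(timeline, idle_timeout=3600.0, total_timeout=14400.0))
```

Under this model, raising only the overall deadline has no effect on a request whose first event arrives after the idle limit, which would be consistent with the behavior described in this report.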
Actual behavior
OpenClaw cuts off the agent request before the local llama.cpp backend finishes prompt prefill and produces a reply.
The llama.cpp server is still alive and processing. It is not dead. The request is cancelled from the OpenClaw side before completion.
When fallbacks are configured, OpenClaw then tries fallback providers even though the local model was still working.
OpenClaw version
2026.5.3-1
Operating system
Docker container on NAS, with local llama.cpp backend running on Mac.
Install method
Docker / custom image based on ghcr.io/openclaw/openclaw:latest.
Model
OpenClaw model id:
llamacpp/qwen3.6-35b
Local backend model:
Qwen3.6-35B-A3B-GGUF
Qwen3.6-35B-A3B-Q6_K.gguf
Provider / routing chain
OpenClaw → local llama.cpp OpenAI-compatible server → /v1/responses
Additional provider/model setup details
Provider config:
"models": {
  "mode": "merge",
  "providers": {
    "llamacpp": {
      "baseUrl": "http://cerebro-mac:8080/v1",
      "apiKey": "dummy",
      "api": "openai-responses",
      "timeoutSeconds": 14400,
      "models": [
        {
          "id": "qwen3.6-35b",
          "name": "Qwen 3.6 35B A3B local llama.cpp",
          "reasoning": true,
          "input": ["text"],
          "contextWindow": 262144,
          "maxTokens": 100000
        }
      ]
    }
  }
}
Agent config:
"agents": {
  "defaults": {
    "timeoutSeconds": 14400,
    "model": {
      "primary": "llamacpp/qwen3.6-35b",
      "fallbacks": []
    },
    "compaction": {
      "timeoutSeconds": 10800
    },
    "models": {
      "llamacpp/qwen3.6-35b": {
        "timeoutSeconds": 14400,
        "streaming": true
      }
    }
  }
}
Config that no longer works:
{
  "agents": {
    "defaults": {
      "llm": {
        "idleTimeoutSeconds": 3600
      }
    }
  }
}
This raises:
agents.defaults: Unrecognized key: "llm"
models.providers.llamacpp.timeoutSeconds is accepted by hot reload, but the long request still gets cancelled.
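As a sanity check on the accepted config, the following Python sketch walks a config object and lists every key whose name ends in timeoutSeconds. The helper `find_timeout_keys` is hypothetical and not part of OpenClaw, and placing both config fragments under a single root object is an assumption about the file layout.

```python
from typing import Any, Iterator, Tuple


def find_timeout_keys(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every key ending in 'timeoutSeconds'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key.lower().endswith("timeoutseconds"):
                yield child, value
            yield from find_timeout_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_timeout_keys(item, f"{path}[{index}]")


# Timeout-relevant parts of the config from this report. Nesting both
# fragments under one root object is an assumption about the file layout.
config = {
    "models": {"providers": {"llamacpp": {"timeoutSeconds": 14400}}},
    "agents": {
        "defaults": {
            "timeoutSeconds": 14400,
            "compaction": {"timeoutSeconds": 10800},
            "models": {"llamacpp/qwen3.6-35b": {"timeoutSeconds": 14400}},
        }
    },
}

timeouts = dict(find_timeout_keys(config))
for key_path, seconds in timeouts.items():
    print(f"{key_path} = {seconds}s")

shortest = min(timeouts.values())
print(f"shortest configured timeout: {shortest}s ({shortest / 3600:.1f} h)")
```

Every timeout set in the accepted config is at least three hours. If the request is cancelled sooner than that, the cancellation would have to come from a timeout that none of these keys controls.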
Logs, screenshots, and evidence
OpenClaw logs show fallback and timeout behavior:
Embedded agent failed before reply: All models failed
llamacpp/qwen3.6-35b: LLM request timed out
Config reload confirms models.providers.llamacpp.timeoutSeconds is accepted:
config hot reload applied (... models.providers.llamacpp.timeoutSeconds ...)
Adding agents.defaults.llm.idleTimeoutSeconds fails:
config reload skipped (invalid config): agents.defaults: Unrecognized key: "llm"
The local llama.cpp backend receives the request and starts processing a very large prompt:
task.n_tokens = 152780
prompt processing progress ...
n_tokens = 67584
srv stop: cancel task
done request: POST /v1/responses 200
This suggests the backend was still processing and did not crash. The request was cancelled before the model had a chance to produce a reply.
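For reference, the prefill progress at the moment of cancellation can be read straight from the log excerpt above. The Python sketch below does that; the helper `prefill_progress` is hypothetical, and the regular expressions assume the llama.cpp log lines look exactly like the ones quoted in this report.

```python
import re

# Log lines copied from the llama.cpp excerpt in this report.
LOG = """\
task.n_tokens = 152780
prompt processing progress ...
n_tokens = 67584
srv stop: cancel task
"""


def prefill_progress(log_text: str) -> float:
    """Return the fraction of prompt tokens processed in the last progress line."""
    total_match = re.search(r"task\.n_tokens\s*=\s*(\d+)", log_text)
    if total_match is None:
        raise ValueError("no 'task.n_tokens = N' line found")
    total = int(total_match.group(1))

    # Progress lines are 'n_tokens = N' that are not prefixed with 'task.'.
    processed = re.findall(r"(?<!task\.)\bn_tokens\s*=\s*(\d+)", log_text)
    if not processed:
        raise ValueError("no 'n_tokens = N' progress line found")
    return int(processed[-1]) / total


fraction = prefill_progress(LOG)
print(f"prefill reached {fraction:.1%} before the cancel")
```

With the values in the excerpt, 67584 of 152780 prompt tokens (about 44%) had been processed when the cancel arrived, so the backend was still in prefill and had not yet started generating.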
Impact and severity
Affected: local large-context llama.cpp users.
Severity: Critical for long-running local model workflows.
Frequency: 100% on very large prompts since updating to 2026.5.3-1.
Consequence: long local model requests cannot complete because OpenClaw cancels before the first reply.
Additional information
This appears to be a regression or schema/config mismatch around idle timeout behavior.
The old agents.defaults.llm.idleTimeoutSeconds path is rejected.
The new suggested models.providers.<id>.timeoutSeconds path is accepted but does not stop the idle watchdog cancellation.
Questions:
- What is the correct supported config key in 2026.5.3-1 to increase the idle watchdog timeout for long local model prefill?
- Is there a separate hidden or default idle timeout still active even when models.providers.<id>.timeoutSeconds is set?