Bug type
Regression (worked before, now fails)
Beta release blocker
No
Summary
After upgrading to 2026.5.12 or later, the agent always replies:
To prevent this, increase your compaction buffer by setting agents.defaults.compaction.reserveTokensFloor to 20000 or higher in your config.
This happens no matter what I send or do (/reset, /new, etc.).
The gateway log shows that it uses the endpoint's maximum context length (262144 tokens) as the default output length, so the total requested tokens always exceed the context limit. Commands such as /compact, /reset, or /new do not solve the issue, because the requested output alone already equals the model's maximum context length, so any input at all pushes the request over the limit.
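To make the arithmetic concrete, here is a small Python sketch that parses the provider error quoted in the log of this report and checks the numbers. The regular expression and the `parse_overflow` helper are my own illustration, not part of OpenClaw or OpenRouter; the error text is copied from the first failing attempt.

```python
import re

# Error text copied from the first failing attempt in the gateway log.
RAW_ERROR = (
    "400 This endpoint's maximum context length is 262144 tokens. "
    "However, you requested about 284005 tokens (8685 of text input, "
    "13176 of tool input, 262144 in the output)."
)

# Hypothetical pattern written for this report; the provider may change its wording.
PATTERN = re.compile(
    r"maximum context length is (?P<limit>\d+) tokens\. "
    r"However, you requested about (?P<requested>\d+) tokens "
    r"\((?P<text>\d+) of text input, (?P<tool>\d+) of tool input, "
    r"(?P<output>\d+) in the output\)"
)


def parse_overflow(message: str) -> dict[str, int]:
    """Extract the token counts from an overflow error message."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message does not match the expected overflow format")
    return {key: int(value) for key, value in match.groupdict().items()}


usage = parse_overflow(RAW_ERROR)
prompt_tokens = usage["text"] + usage["tool"]   # 8685 + 13176
overflow = usage["requested"] - usage["limit"]  # 284005 - 262144

print("output equals context limit:", usage["output"] == usage["limit"])
print("prompt tokens:", prompt_tokens)
print("overflow:", overflow)
```

The requested output equals the full context limit, so the overflow is exactly the size of the prompt. Shrinking the prompt through compaction or a reset can therefore never bring the request under the limit.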
Since I use OpenRouter to manage the available models (including free models), it does not seem feasible to manually set models.providers.openrouter.models..maxTokens for every single model in order to limit the output length.
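As a rough illustration of the behavior I would expect by default, without per-model configuration, the output budget could be clamped to whatever the context window has left after the prompt. This is only a sketch under my own assumptions: `clamp_output_tokens` and its `reserve` parameter are hypothetical names, not OpenClaw's actual implementation.

```python
from typing import Optional


def clamp_output_tokens(
    context_limit: int,
    prompt_tokens: int,
    requested_output: Optional[int] = None,
    reserve: int = 0,
) -> int:
    """Return an output budget that keeps prompt + output within the context window.

    Hypothetical helper for illustration only.
    `reserve` is an optional safety margin kept free for estimation error.
    """
    available = context_limit - prompt_tokens - reserve
    if available <= 0:
        raise ValueError("prompt already fills the context window")
    if requested_output is None:
        # No explicit limit configured: use what is left, not the whole window.
        return available
    return min(requested_output, available)


# Numbers from the log: 262144-token window, 8685 + 13176 prompt tokens,
# and a requested output of 262144. The 20000 reserve is an arbitrary example.
budget = clamp_output_tokens(262144, 8685 + 13176, requested_output=262144, reserve=20000)
print(budget)  # 220283
```

With a clamp like this, the request in the log would have stayed inside the 262144-token window without any per-model maxTokens entry.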
Steps to reproduce
Expected behavior
Up to 2026.5.7, everything worked fine.
Actual behavior
As described. See log below.
OpenClaw version
2026.5.12 and 2026.5.20
Operating system
Ubuntu 24.04
Install method
npm global
Model
openrouter
Provider / routing chain
openclaw --> openrouter --> nvidia/nemotron-3-super-120b-a12b:free
Additional provider/model setup details
No response
Logs, screenshots, and evidence
02:51:22+00:00 info web-inbound {"module":"web-inbound"} {"from":"+12345678","to":"+12345678","body":"Hi","timestamp":1779591082000} inbound message
02:51:24+00:00 info web-auto-reply {"module":"web-auto-reply","runId":"d70e9c4f-802d-4085-bf65-xxxxxxx"} {"connectionId":"59b26aab-a847-4c96-bb46-xxxxxxx","correlationId":"3EB0E8xxxxxxxxxx","from":"+12345678","to":"+12345678","body":"[WhatsApp +12345678 +10m Sun 2026-05-24 10:51 GMT+8] (self): Hi","mediaType":null,"mediaPath":null} inbound web message
02:51:24+00:00 info gateway/channels/whatsapp/inbound {"subsystem":"gateway/channels/whatsapp/inbound"} Inbound message +1234567 -> +1234567 (direct, 66 chars)
02:51:24+00:00 info gateway/channels/openclaw-weixin [compat] Host OpenClaw 2026.5.20 >= 2026.3.22, OK.
02:51:52+00:00 info web-heartbeat {"module":"web-heartbeat","runId":"d70e9c4f-802d-4085-bf65-xxxxxxxx"} {"connectionId":"59b26aab-a847-4c96-bb46-xxxxxxxxx","reconnectAttempts":0,"messagesHandled":1,"lastInboundAt":1779591084456,"lastTransportActivityAt":1779591085005,"authAgeMs":301228.5109863281,"uptimeMs":301348} web gateway heartbeat
02:52:07+00:00 warn fetch-timeout {"subsystem":"fetch-timeout"} {"timeoutMs":10000,"elapsedMs":14256,"operation":"fetchWithTimeout","url":"https://api.telegram.org/botxxxxx/getMe"} fetch timeout reached; aborting operation
02:52:07+00:00 warn telegram/network {"subsystem":"telegram/network"} fetch fallback: DNS-resolved IP unreachable; trying alternative Telegram API IP (codes=none, reason=probe timeout/network error)
02:52:09+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"0a8a8b87-6da6-47ee-9331-xxxxxxxx","isError":true,"error":"Context overflow: prompt too large for the model. Try /reset (or /new) to start a fresh session, or use a larger-context model.","failoverReason":null,"model":"nvidia/nemotron-3-super-120b-a12b:free","provider":"openrouter","rawErrorPreview":"400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284005 tokens (8685 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of either one, or use the context-compression plugin to compress your prompt automatically.","rawErrorHash":"sha256:xxxxxx","providerRuntimeFailureKind":"unclassified"} embedded run agent end
02:52:09+00:00 warn agent/embedded {"subsystem":"agent/embedded"} [context-overflow-diag] sessionKey=agent:main:main provider=openrouter/nvidia/nemotron-3-super-120b-a12b:free source=assistantError messages=3 sessionFile=~/.openclaw/agents/main/sessions/8d0c814c-8e24-4920-8f54-xxxxxxx.jsonl diagId=ovf-xxxxxx-xxxxxx compactionAttempts=0 observedTokens=unknown error=400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284005 tokens (8685 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of eit
02:52:09+00:00 warn agent/embedded {"subsystem":"agent/embedded"} context overflow detected (attempt 1/3); attempting auto-compaction for openrouter/nvidia/nemotron-3-super-120b-a12b:free
02:52:21+00:00 info compaction-safeguard {"subsystem":"compaction-safeguard"} Compaction safeguard: using session branch messages after compaction preparation omitted real conversation content.
02:52:21+00:00 info agent/embedded {"subsystem":"agent/embedded"} auto-compaction succeeded for openrouter/nvidia/nemotron-3-super-120b-a12b:free; retrying prompt
02:52:21+00:00 info agents/post-compaction-guard {"subsystem":"agents/post-compaction-guard"} post-compaction guard armed for 3 attempts
02:52:22+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"0a8a8b87-6da6-47ee-9331-xxxxxxxx","isError":true,"error":"Context overflow: prompt too large for the model. Try /reset (or /new) to start a fresh session, or use a larger-context model.","failoverReason":null,"model":"nvidia/nemotron-3-super-120b-a12b:free","provider":"openrouter","rawErrorPreview":"400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284283 tokens (8963 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of either one, or use the context-compression plugin to compress your prompt automatically.","rawErrorHash":"sha256:xxxxxxxx","providerRuntimeFailureKind":"unclassified"} embedded run agent end
02:52:22+00:00 warn agent/embedded {"subsystem":"agent/embedded"} [context-overflow-diag] sessionKey=agent:main:main provider=openrouter/nvidia/nemotron-3-super-120b-a12b:free source=assistantError messages=6 sessionFile=~/.openclaw/agents/main/sessions/8d0c814c-8e24-4920-8f54-xxxxxxxxxx.jsonl diagId=ovf-xxxxxxx-xxxxxxx compactionAttempts=1 observedTokens=unknown error=400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284283 tokens (8963 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of eit
02:52:22+00:00 warn agent/embedded {"subsystem":"agent/embedded"} context overflow detected (attempt 2/3); attempting auto-compaction for openrouter/nvidia/nemotron-3-super-120b-a12b:free
02:52:32+00:00 info compaction-safeguard {"subsystem":"compaction-safeguard"} Compaction safeguard: using session branch messages after compaction preparation omitted real conversation content.
02:52:32+00:00 info agent/embedded {"subsystem":"agent/embedded"} auto-compaction succeeded for openrouter/nvidia/nemotron-3-super-120b-a12b:free; retrying prompt
02:52:32+00:00 info agents/post-compaction-guard {"subsystem":"agents/post-compaction-guard"} post-compaction guard armed for 3 attempts
02:52:34+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"0a8a8b87-6da6-47ee-9331-xxxxxxxxx","isError":true,"error":"Context overflow: prompt too large for the model. Try /reset (or /new) to start a fresh session, or use a larger-context model.","failoverReason":null,"model":"nvidia/nemotron-3-super-120b-a12b:free","provider":"openrouter","rawErrorPreview":"400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284469 tokens (9149 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of either one, or use the context-compression plugin to compress your prompt automatically.","rawErrorHash":"sha256:xxxxxxx","providerRuntimeFailureKind":"unclassified"} embedded run agent end
02:52:34+00:00 warn agent/embedded {"subsystem":"agent/embedded"} [context-overflow-diag] sessionKey=agent:main:main provider=openrouter/nvidia/nemotron-3-super-120b-a12b:free source=assistantError messages=6 sessionFile=~/.openclaw/agents/main/sessions/8d0c814c-8e24-4920-8f54-xxxxxxx.jsonl diagId=ovf-xxxxxxx-xxxxxxx compactionAttempts=2 observedTokens=unknown error=400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284469 tokens (9149 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of eit
02:52:34+00:00 warn agent/embedded {"subsystem":"agent/embedded"} context overflow detected (attempt 3/3); attempting auto-compaction for openrouter/nvidia/nemotron-3-super-120b-a12b:free
02:52:43+00:00 info compaction-safeguard {"subsystem":"compaction-safeguard"} Compaction safeguard: using session branch messages after compaction preparation omitted real conversation content.
02:52:43+00:00 info agent/embedded {"subsystem":"agent/embedded"} auto-compaction succeeded for openrouter/nvidia/nemotron-3-super-120b-a12b:free; retrying prompt
02:52:43+00:00 info agents/post-compaction-guard {"subsystem":"agents/post-compaction-guard"} post-compaction guard armed for 3 attempts
02:52:44+00:00 warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"0a8a8b87-6da6-xxxxxx-9331-xxxxxxx","isError":true,"error":"Context overflow: prompt too large for the model. Try /reset (or /new) to start a fresh session, or use a larger-context model.","failoverReason":null,"model":"nvidia/nemotron-3-super-120b-a12b:free","provider":"openrouter","rawErrorPreview":"400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284671 tokens (9351 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of either one, or use the context-compression plugin to compress your prompt automatically.","rawErrorHash":"sha256:xxxxxxxx","providerRuntimeFailureKind":"unclassified"} embedded run agent end
02:52:44+00:00 warn agent/embedded {"subsystem":"agent/embedded"} [context-overflow-diag] sessionKey=agent:main:main provider=openrouter/nvidia/nemotron-3-super-120b-a12b:free source=assistantError messages=6 sessionFile=~/.openclaw/agents/main/sessions/8d0c814c-8e24-4920-8f54-xxxxxxxx.jsonl diagId=ovf-mpj6mqcq-AygPFQ compactionAttempts=3 observedTokens=unknown error=400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284671 tokens (9351 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of eit
02:52:44+00:00 error Auto-compaction failed (400 This endpoint's maximum context length is 262144 tokens. However, you requested about 284671 tokens (9351 of text input, 13176 of tool input, 262144 in the output). Please reduce the length of either one, or use the context-compression plugin to compress your prompt automatically.). Restarting session agent:main:main -> fb763ae7-5cbe-4e0a-ab6c-xxxxxxxx and retrying.
Impact and severity
No response
Additional information
No response