Bug type
Regression (worked before, now fails)
Summary
When assigning tasks in OpenClaw, the system frequently appears to accept the request, but the agent does not actually complete the task. In affected runs, the UI shows placeholder responses such as "One sec" / "let me actually test it now", and the task either appears stuck or provides little/no visible execution detail in the chat/feed.
The underlying logs suggest this is a mix of:
- upstream model failures (API rate limit reached, HTTP 401 missing scopes: model.request, LLM request timed out)
- gateway instability/restarts during active webchat sessions
- agent responses that look like conversational filler rather than real tool execution
- a visibility gap where backend task logs exist, but the user-facing experience makes it look like there was no real work/logging
Environment
- OS: macOS darwin 25.3.0
- OpenClaw config primary model: kimi-coding/k2p5
- Gateway model at runtime: kimi-coding/k2p5
- Webchat client observed: openclaw-control-ui webchat v2026.3.7
- Local backend health during investigation: {"status":"healthy","timestamp":"2026-03-08T16:35:09.062Z","uptime":364.136322625,"agentSessions":{}}
User-facing symptoms
- Asking OpenClaw to do a task does not reliably result in actual execution.
- Instead, agents sometimes respond with placeholder text like:
Still planning to check gog — let me do that now. One sec.
Let me actually check gog now instead of just saying I will. One sec.
- Earlier runs appeared to show no useful activity, even though backend task logs existed.
- The result is that from the UI it looks like OpenClaw accepted the request but did not actually do the work.
Reproduction
This appears intermittent, but the general repro is:
- Start OpenClaw with kimi-coding/k2p5 as the primary model.
- Open the webchat UI.
- Assign a task to an agent/workspace.
- Observe one of the following:
- task stalls
- placeholder/non-executing reply
- no meaningful progress visible in chat/feed
- backend shows model/gateway failures
Expected behavior
- Task assignment should either:
- execute successfully and show meaningful progress/logs, or
- fail clearly with a surfaced error state
- Agents should not emit conversational filler as if they are about to work unless they are actually proceeding with execution.
- If backend task logs exist, the UI should make that visible enough that the user does not conclude "nothing happened".
Actual behavior
- Tasks are accepted, but execution is unreliable.
- Model failures occur in the background.
- Some agent responses are filler/placeholder text rather than actual completion.
- Gateway restarts/disconnects occur around active webchat sessions.
- User perception is that OpenClaw did nothing and showed no activity.
Evidence
1. Gateway is running on kimi-coding/k2p5
From logs/gateway.log on 2026-03-08:
2026-03-08T16:11:06.004+00:00 [gateway] agent model: kimi-coding/k2p5
2026-03-08T16:15:44.998+00:00 [gateway] agent model: kimi-coding/k2p5
2026-03-08T16:16:01.225+00:00 [gateway] agent model: kimi-coding/k2p5
2. Model/gateway failures during active use
From logs/gateway.err.log:
2026-03-08T16:01:23.986+00:00 [agent/embedded] embedded run agent end: runId=d84b6658-b26b-47ba-8e59-6abd6fd9b219 isError=true error=⚠️ API rate limit reached. Please try again later.
2026-03-08T16:03:13.125+00:00 [agent/embedded] embedded run agent end: runId=39c24751-1ac6-483f-aa97-7ae59512b7f4 isError=true error=⚠️ API rate limit reached. Please try again later.
2026-03-08T16:09:01.233+00:00 [agent/embedded] embedded run agent end: runId=cf5e5492-92c5-4f97-a10d-ec9d16c6f3c4 isError=true error=HTTP 401: You have insufficient permissions for this operation. Missing scopes: model.request.
2026-03-08T16:10:01.115+00:00 [agent/embedded] embedded run agent end: runId=cf5e5492-92c5-4f97-a10d-ec9d16c6f3c4 isError=true error=LLM request timed out.
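The failures above fall into three groups (quota, auth, timeout) that could be surfaced separately in the UI. The following is a minimal sketch of that grouping, assuming the log line format shown above. The function name `classify_run_error` and the category labels are hypothetical and not part of OpenClaw.

```python
import re

# Matches the "embedded run agent end" lines from gateway.err.log shown above.
RUN_END = re.compile(
    r"runId=(?P<run_id>[0-9a-f-]+) isError=(?P<is_error>true|false)"
    r"(?: error=(?P<error>.*))?$"
)


def classify_run_error(line):
    """Return (run_id, category) for a failed run line, or None otherwise.

    The categories are illustrative labels, not OpenClaw states.
    """
    match = RUN_END.search(line)
    if not match or match.group("is_error") != "true":
        return None
    error = (match.group("error") or "").lower()
    if "rate limit" in error:
        category = "quota"
    elif "http 401" in error or "missing scopes" in error:
        category = "auth"
    elif "timed out" in error:
        category = "timeout"
    else:
        category = "unknown"
    return match.group("run_id"), category
```

Applied to the four lines above, this would label the first two runs as quota failures and the shared run cf5e5492… as an auth failure followed by a timeout.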
3. Gateway restarts while webchat is connected
From logs/gateway.log:
2026-03-08T16:11:44.852+00:00 [ws] webchat connected conn=2e046649-12fa-4a56-b141-1675d7de3c35 remote=127.0.0.1 client=openclaw-control-ui webchat v2026.3.7
2026-03-08T16:11:57.802+00:00 [ws] ⇄ res ✓ chat.send 51ms runId=bfec63f1-cb1d-4db5-a0b0-eaa4944be7cd conn=2e046649…3c35 id=403a6e55…aaf9
2026-03-08T16:15:42.221+00:00 [gateway] signal SIGTERM received
2026-03-08T16:15:42.224+00:00 [gateway] received SIGTERM; shutting down
2026-03-08T16:15:42.275+00:00 [ws] webchat disconnected code=1012 reason=service restart conn=14d4c1d0-ff76-4879-954b-e170ba264b00
4. Task logs show repeated orchestration retries / fallback behavior
Representative stuck task log (t38459756, "Process bank statements for March"):
Task claimed by Jerry — moved to in_progress
Orchestrator mode — sending decomposition request to Jerry
No response from orchestrator — falling back to planning spec
Orchestrator mode — sending decomposition request to Jerry
No response from orchestrator — falling back to planning spec
...
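The request/fallback loop in this log can be detected mechanically. The sketch below counts completed "request sent, no response" cycles, assuming the task log wording shown above. `count_fallback_cycles`, `is_stuck`, and the threshold of 2 are hypothetical illustrations, not existing OpenClaw behavior.

```python
def count_fallback_cycles(log_lines):
    """Count decomposition requests that ended in a 'no response' fallback."""
    cycles = 0
    awaiting_response = False
    for line in log_lines:
        if "sending decomposition request" in line:
            awaiting_response = True
        elif "No response from orchestrator" in line and awaiting_response:
            cycles += 1
            awaiting_response = False
    return cycles


def is_stuck(log_lines, max_cycles=2):
    """Flag a task once it has looped through max_cycles fallbacks (assumed threshold)."""
    return count_fallback_cycles(log_lines) >= max_cycles
```

A check like this could let the task be marked as degraded after a bounded number of retries instead of looping without any visible state change.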
5. Placeholder/non-executing agent replies
Later logs for the same task show agents returning filler text:
Alex responded (68 chars)
detail="Hey! 👋
Still planning to check gog — let me do that now. One sec."
Nora responded (74 chars)
detail="Hey! Let me actually check gog now instead of just saying I will. One sec."
This is particularly problematic because it reads like execution is starting, but in practice these messages are not reliable evidence that useful work is happening.
Suspected root causes
- Primary model path is unstable under current account/quota conditions.
- Failover chain includes providers/profiles that can error with auth scope issues.
- Gateway restarts may interrupt active chat/task execution.
- Agent prompt/tool handoff may allow filler responses to be treated as acceptable task progress.
- UI/log surfacing may not make backend execution state obvious enough when failures happen.
Suggested fixes
- Surface model/provider failures directly in the UI when task execution fails.
- Mark tasks as failed/degraded when agent runs terminate with quota/auth/timeout errors.
- Prevent filler responses like "One sec" from being treated as meaningful task progress.
- Improve webchat resilience around gateway restart/disconnect/reconnect events.
- Expose backend task logs more clearly in the main user flow so failures are visible without digging.
- Consider validating provider auth scopes and quota health before dispatching tasks.
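As an illustration of the filler-response fix, the sketch below treats a reply as progress only if it came with at least one tool call or does not match known filler phrasing. It assumes each agent reply exposes its text and a tool call count, which is a guess at the data available. The phrase list is taken from the replies quoted in this report, and `is_progress` is a hypothetical name.

```python
import re

# Filler phrases drawn from the replies quoted in this report.
FILLER = re.compile(
    r"\b(one sec|let me (actually )?(check|do|test)|still planning)\b",
    re.IGNORECASE,
)


def is_progress(reply_text, tool_call_count):
    """Return True if a reply should count as task progress.

    A reply backed by tool calls always counts. A reply with no tool
    calls counts only if it does not look like filler.
    """
    if tool_call_count > 0:
        return True
    return FILLER.search(reply_text) is None
```

A phrase list is a blunt heuristic. Requiring evidence of tool execution before advancing the task state would probably be the more robust rule.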
Additional local factor: duplicate clients and handshake timeout spam
During investigation, the gateway error log showed a continuous stream of handshake timeout / closed before connect warnings, repeating every ~11 seconds:
2026-03-08T16:36:26 [gateway/ws] handshake timeout conn=09798fa7… remote=127.0.0.1
2026-03-08T16:36:26 [gateway/ws] closed before connect conn=09798fa7… remote=127.0.0.1 … code=1000 reason=n/a
2026-03-08T16:36:37 [gateway/ws] handshake timeout conn=7001b80d… remote=127.0.0.1
...
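The ~11 second cadence can be confirmed from the log itself. This is a small sketch, assuming each line starts with an ISO-style timestamp as in the excerpt above. The function name `handshake_timeout_intervals` is hypothetical.

```python
from datetime import datetime


def handshake_timeout_intervals(log_lines):
    """Return the seconds between consecutive 'handshake timeout' lines."""
    stamps = [
        datetime.fromisoformat(line.split(" ", 1)[0])
        for line in log_lines
        if "handshake timeout" in line
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

A steady interval like this points to a single client retrying on a fixed timer rather than random connection failures.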
Process inspection revealed multiple concurrent local clients connected to ws://127.0.0.1:18789:
| Client | PID | Connection style | Status |
| --- | --- | --- | --- |
| Dashboard backend (server.js) | 1328 | Challenge/connect handshake | ESTABLISHED |
| Duplicate dashboard backend (server.js) | 2040 | Challenge/connect handshake | ESTABLISHED |
| Chrome / webchat UI | 1436 | Webchat protocol | ESTABLISHED |
| Cursor extension host | 2427 | Extension bridge | ESTABLISHED |
| Legacy Python bridge (openclaw-bridge-simple.py) | 600 | Old ws?agent=...&token=... direct style | Running |
| Apple WebKit networking | 5484 | Unknown | ESTABLISHED |
Why this matters
- The duplicate dashboard dev stack means two server.js instances both try to connect to the gateway, dispatch tasks, and poll for results, potentially causing duplicate or conflicting task execution.
- The legacy Python bridge uses an older direct WebSocket URL format (/ws?agent=...&token=...) instead of the newer challenge/connect handshake protocol. This mismatch likely causes the gateway to accept the TCP connection but fail the handshake, producing the timeout spam.
- Multiple clients competing for the same agent sessions can cause:
- Race conditions in task dispatch
- Duplicate orchestration attempts
- One client consuming a response meant for another
- Gateway resource contention
Mitigation
Stopping the duplicate dashboard processes and the legacy Python bridge reduced the handshake timeout noise:
kill 1924 1942 1944 1998 2036 2040 600
Suggestion for OpenClaw
- The gateway should log which client identity is failing the handshake (client name, auth method attempted) so users can identify the offending process.
- Consider rejecting incompatible/legacy handshake attempts with a clear error message rather than silently timing out.
- Document that only one dashboard backend should connect to the gateway at a time.
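To illustrate the second suggestion, a gateway could recognize the legacy URL style up front and reject it with an explicit message. The sketch below only covers the detection step and assumes the legacy form is a /ws path carrying agent and token query parameters, as described above. `is_legacy_ws_url` is a hypothetical helper, not an OpenClaw API.

```python
from urllib.parse import parse_qs, urlsplit


def is_legacy_ws_url(url):
    """Return True for the old direct style: /ws?agent=...&token=...

    Assumption: the presence of both query parameters on a /ws path is
    enough to tell the legacy style from the challenge/connect handshake.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return parts.path.endswith("/ws") and "agent" in query and "token" in query
```

A connection flagged this way could be closed immediately with a descriptive reason instead of being left to hit the handshake timeout.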
Impact
- Core task execution appears unreliable.
- Users lose trust because OpenClaw sounds like it is working while not actually completing work.
- Lack of clear surfaced failure state makes debugging harder.
- Duplicate local clients can amplify failures and make debugging significantly harder.
Steps to reproduce
- Open the OpenClaw gateway.
- Open the webchat.
- Ask it to do a particular task.
- It says it’s doing the task, but nothing actually happens: the pulsing red border blinks for a second, then there is no further progress.
Expected behavior
- The agent should start doing the task, with the red border pulsing as it responds.
- Or, if I refresh the page, I should see the logs of what it’s doing to complete the task.
- Right now neither of those happens.
Actual behavior
It just says it’s looking at the task, and nothing happens.
OpenClaw version
Version 2026.3.7
Operating system
macOS (Darwin 25.3.0) / Tahoe 26.3.1
Install method
No response
Logs, screenshots, and evidence
Impact and severity
No response
Additional information
No response