Summary
Tool-calling is unreliable in embedded agent runs across provider failover.
In my local setup on OpenClaw 2026.3.7, I can reproduce all of the following within the same session:
- volcengine-plan/ark-code-latest does produce real toolCall + toolResult
- the same run later gets aborted / times out before a stable final assistant response
- OpenClaw then attempts to continue via fallback / retry paths
- kimi-coding/k2p5 may either:
- reply TOOL_UNAVAILABLE without attempting a tool call, or
- answer from prior context instead of issuing a fresh tool call, or
- hit provider rate limits / timeouts during continuation
So this does not look like the older "tools parameter is never sent" bug from #8923.
It looks more like a session / embedded-run / failover reliability problem after tool execution has already started or completed.
Environment
- OpenClaw version: 2026.3.7
- OS: macOS arm64
- Primary model during repro: volcengine-plan/ark-code-latest
- Fallback model during repro: kimi-coding/k2p5
- Provider API types involved:
  - volcengine-plan: openai-completions
  - kimi-coding: anthropic-messages
Minimal Prompt Used
Use exactly one available tool to inspect the current working directory. Do not simulate a tool call or reuse prior results. If tool invocation is unavailable, reply with TOOL_UNAVAILABLE.
What I Observed
1. Volcengine first fails with connection error
Session file recorded:
{"stopReason":"error","errorMessage":"Connection error."}
2. Volcengine then successfully emits a real tool call
Recorded in session JSONL:
- assistant emits toolCall with name:"exec"
- tool result is recorded immediately after
Excerpt:
{"type":"toolCall","name":"exec","arguments":{"command":"pwd && ls -la"}}
followed by:
{"role":"toolResult","toolName":"exec","isError":false}
3. Later tagged repro ([RUN V1]) also succeeds at toolCall + toolResult
Again, Volcengine emitted a real exec call and OpenClaw recorded the corresponding toolResult.
4. But the embedded run still ends as aborted / timed out
After the successful tool result, the same run was still marked aborted / timed out:
{"customType":"openclaw:prompt-error","data":{"error":"aborted"}}
and gateway logs showed:
[agent/embedded] embedded run timeout ... timeoutMs=45000
FailoverError: LLM request timed out.
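For triage, the configured timeout can be pulled straight out of such gateway lines. A small sketch; only the timeoutMs=<n> token is assumed stable, and the elided "..." part of the line is left as-is:

```python
import re

def extract_timeout_ms(log_line):
    """Pull the configured embedded-run timeout out of a gateway log line."""
    m = re.search(r"\btimeoutMs=(\d+)\b", log_line)
    return int(m.group(1)) if m else None
```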
5. Kimi fallback / continuation is inconsistent
In the same session history, after switching to / continuing with kimi-coding/k2p5, I observed:
- one response that returned TOOL_UNAVAILABLE, with no fresh toolCall recorded for that turn
- another response that answered from prior context instead of clearly issuing a fresh tool call
- a provider-side 429 during continuation:
{"errorMessage":"429 {\"error\":{\"type\":\"rate_limit_error\",...}}"}
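The provider payload nested inside errorMessage can be split out for inspection. A sketch assuming the "<status> <json>" shape seen in the excerpt above:

```python
import json
import re

def parse_provider_error(error_message):
    """Split an errorMessage like '429 {...json...}' into (status, payload)."""
    m = re.match(r"^(\d{3})\s+(\{.*\})$", error_message, re.DOTALL)
    if not m:
        return None, None  # not a '<status> <json>' style message
    status = int(m.group(1))
    try:
        payload = json.loads(m.group(2))
    except json.JSONDecodeError:
        payload = None
    return status, payload
```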
Why This Seems Distinct From Existing Issues
- This is not the "tools parameter is never sent" failure: tools clearly reach volcengine-plan/ark-code-latest, because real toolCall / toolResult entries exist in the session file
- The breakage shows up after tool execution: aborted runs, then TOOL_UNAVAILABLE or stale/context-derived answers instead of fresh tool use
Expected Behavior
If a model successfully emits a tool call and OpenClaw records a valid tool result, then one of the following should happen deterministically:
- the same run completes cleanly with a final assistant response, or
- the run fails in a way that preserves coherent session state for the next retry / fallback attempt
Fallback / continuation should not degrade into:
- aborted runs after successful tool execution
- stale-context answers instead of fresh tool calls
- TOOL_UNAVAILABLE from the fallback model when tools are in fact available in the session
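The expectation above can be phrased as a checkable invariant over the session events. A sketch using the event shapes from the excerpts in this report; the final assistant-message shape (role:"assistant" with content) is my assumption, not something confirmed from the session file:

```python
def violates_invariant(events):
    """True if a successful toolResult is later followed by an 'aborted'
    prompt-error with no final assistant response in between -- the
    failure mode reported here."""
    tool_ok = False
    for event in events:
        if event.get("role") == "toolResult" and not event.get("isError", False):
            tool_ok = True
        elif event.get("role") == "assistant" and event.get("content"):
            tool_ok = False  # run produced a final response; invariant satisfied
        elif (tool_ok
              and event.get("customType") == "openclaw:prompt-error"
              and event.get("data", {}).get("error") == "aborted"):
            return True
    return False
```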
Actual Behavior
Successful tool execution can still be followed by:
- aborted
- LLM request timed out
- FailoverError: LLM request timed out
- fallback continuation that no longer behaves consistently with available tools
Related Issues
- #8923 — "tools parameter is never sent"; related symptom area but, per the evidence above, not the same failure
Suggested Areas To Inspect
- embedded run timeout behavior after a successful toolResult
- failover / continuation serialization between provider adapters (openai-completions -> anthropic-messages)
- whether tool availability / tool schema state is preserved correctly across aborted runs
- whether continuation prompts after timeout are causing models to infer from context instead of issuing tool calls
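On the serialization point: the two adapter formats represent tool traffic very differently, so failover has to translate it faithfully. A sketch of the mapping that needs to survive, based on the public openai-completions and anthropic-messages wire shapes — this is illustrative, not OpenClaw's actual adapter code:

```python
import json

def openai_to_anthropic(msg):
    """Convert one openai-completions chat message carrying tool traffic
    into the equivalent anthropic-messages shape."""
    if msg["role"] == "assistant" and msg.get("tool_calls"):
        return {
            "role": "assistant",
            "content": [
                {
                    "type": "tool_use",
                    "id": call["id"],
                    "name": call["function"]["name"],
                    # OpenAI sends arguments as a JSON string; Anthropic wants an object
                    "input": json.loads(call["function"]["arguments"]),
                }
                for call in msg["tool_calls"]
            ],
        }
    if msg["role"] == "tool":
        return {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": msg["tool_call_id"],
                    "content": msg["content"],
                }
            ],
        }
    return {"role": msg["role"], "content": msg.get("content", "")}
```

If tool_calls or tool-role messages get dropped or flattened to plain text somewhere in this translation, the fallback model would plausibly answer from context or report TOOL_UNAVAILABLE, matching the behavior described above.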
Local Evidence
I can provide the exact session JSONL / timestamps if helpful, but the key repro facts are already visible locally:
- real toolCall + toolResult for volcengine-plan/ark-code-latest
- later aborted for the same run
- subsequent fallback / continuation instability with kimi-coding/k2p5
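For anyone reproducing locally, a short scanner that pulls the toolCall/toolResult pairs out of a session JSONL file. Field names are taken from the excerpts above; the session file path is whatever your local setup uses:

```python
import json

def find_tool_pairs(jsonl_path):
    """Yield (toolCall, following toolResult) event pairs from a session JSONL file."""
    pending = None  # most recent toolCall still awaiting its result
    with open(jsonl_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            if event.get("type") == "toolCall":
                pending = event
            elif event.get("role") == "toolResult" and pending is not None:
                yield pending, event
                pending = None
```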