
fix(agent): detect truncated streaming tool calls before execution #6776

Closed
AIandI0x1 wants to merge 1 commit into NousResearch:main from AIandI0x1:fix/issue-6638-truncated-tool-calls


Conversation

@AIandI0x1

Contributor

Summary

  • detect truncated streamed tool-call arguments during mock response reconstruction
  • upgrade truncated streamed tool-call responses to finish_reason="length"
  • refuse to execute truncated tool-call responses and return a partial truncation error instead
  • add regression coverage for both streaming reconstruction and run_conversation() behavior

Root cause

The OpenAI-compatible streaming response builder reconstructed a mock chat-completions response and defaulted missing finish reasons to "stop".

If the stream died mid-tool-call, the accumulated function.arguments could be invalid/incomplete JSON, but the rebuilt response still looked like a normal completed assistant turn.

That meant truncated tool-call responses could be misclassified as successful completions instead of surfaced as truncation failures.
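To make the failure mode concrete, here is a minimal sketch of what the accumulated arguments look like when a stream dies mid-tool-call. The fragment values and the `is_complete_json` helper are hypothetical and are not taken from `run_agent.py`.

```python
import json

# Hypothetical argument fragments, as they might arrive across stream deltas.
# The stream is assumed to die partway through the second fragment.
fragments = ['{"path": "/tmp/', 'notes.txt", "con']

# Streaming clients typically concatenate fragments into function.arguments.
accumulated = "".join(fragments)


def is_complete_json(text: str) -> bool:
    """Return True only if text parses as a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

A builder that never runs a check like this has no way to tell the partial string apart from a finished one, which is how a default of "stop" ends up attached to a broken tool call.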

Fix

  • validate accumulated streamed tool-call arguments while rebuilding the mock response
  • if any tool-call arguments are invalid JSON, set the reconstructed finish reason to "length"
  • in the main loop, when a length response contains tool calls, return a partial truncation error immediately and do not execute any tools
  • keep the existing continuation path for normal truncated text-only responses
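The first two bullets could be sketched roughly as follows. This is an illustration under assumptions, not the code in `run_agent.py`: the function names and the plain-dict tool-call shape are hypothetical, and treating an empty arguments string as invalid is a choice of this sketch (because `json.loads("")` fails), which the real implementation may or may not share.

```python
import json
from typing import Any, Dict, List, Optional


def has_invalid_tool_args(tool_calls: List[Dict[str, Any]]) -> bool:
    """Return True if any accumulated function.arguments is not valid JSON."""
    for call in tool_calls:
        arguments = call.get("function", {}).get("arguments", "")
        try:
            json.loads(arguments)
        except (json.JSONDecodeError, TypeError):
            # Partial, empty, or non-string arguments all count as truncated here.
            return True
    return False


def resolve_finish_reason(
    finish_reason: Optional[str], tool_calls: List[Dict[str, Any]]
) -> str:
    """Pick the finish reason for the reconstructed (mock) response."""
    if has_invalid_tool_args(tool_calls):
        # Broken arguments override whatever the stream reported.
        return "length"
    # Keep the old default only when the arguments are intact.
    return finish_reason or "stop"
```

Running the check at reconstruction time means every downstream consumer sees a single, consistent signal (`"length"`) instead of each one having to re-validate the arguments.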

Validation

  • python -m pytest tests/run_agent/test_run_agent.py::TestStreamingApiCall::test_truncated_tool_call_args_upgrade_finish_reason_to_length tests/run_agent/test_run_agent.py::TestRunConversation::test_length_with_tool_calls_returns_partial_without_executing_tools tests/run_agent/test_run_agent.py::TestRunConversation::test_length_finish_reason_requests_continuation -q
  • python -m pytest tests/run_agent/test_run_agent.py -q
  • python -m py_compile run_agent.py tests/run_agent/test_run_agent.py

All of the above passed locally.

Closes #6638

teknium1 pushed a commit that referenced this pull request Apr 9, 2026
When a streaming response is cut mid-tool-call (connection drop, timeout),
the accumulated function.arguments is invalid JSON. The mock response
builder defaulted finish_reason to 'stop', so the agent loop treated it
as a valid completed turn and tried to execute tools with broken args.

Fix: validate tool call arguments with json.loads() during mock response
reconstruction. If any are invalid JSON, override finish_reason to
'length'. In the main loop's length handler, if tool calls are present,
refuse to execute and return partial=True with a clear error instead of
silently failing or wasting retries.

Also fixes _thinking_exhausted to not short-circuit when tool calls are
present — truncated tool calls are not thinking exhaustion.

Original cherry-picked from PR #6776 by AIandI0x1.
Closes #6638.
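The length-handler behavior described in this commit message might look something like the sketch below. The `decide_next_step` function, the action names, and the returned dict shape are all hypothetical. Only the `partial=True` outcome and the "do not execute tools" rule come from the description above.

```python
from typing import Any, Dict, List, Optional


def decide_next_step(
    finish_reason: str, tool_calls: Optional[List[Dict[str, Any]]]
) -> Dict[str, Any]:
    """Decide what the agent loop should do with one reconstructed response."""
    has_tools = bool(tool_calls)
    if finish_reason == "length":
        if has_tools:
            # Truncated tool call: never execute, surface a partial failure.
            return {
                "action": "abort",
                "partial": True,
                "error": "Response truncated mid tool call; no tools were executed.",
            }
        # Truncated text-only response: keep the existing continuation path.
        return {"action": "continue", "partial": False, "error": None}
    if has_tools:
        return {"action": "execute_tools", "partial": False, "error": None}
    return {"action": "finish", "partial": False, "error": None}
```

Checking `finish_reason` before looking at tool calls is the point of the design: a `"length"` response with tool calls is handled as a failure first, so broken arguments never reach tool execution.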
teknium1 pushed a commit that referenced this pull request Apr 10, 2026
@teknium1

Contributor

Merged via PR #6847. Your commit was cherry-picked onto current main with authorship preserved. We added a single retry before refusing — if the first attempt has truncated tool args, the agent retries the API call once before giving up. Thanks for the fix!
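The single retry mentioned here could be sketched as below. The actual change lives in PR #6847 and is not shown in this thread, so `call_with_single_retry`, `is_truncated_tool_call`, and the dict-shaped response are assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, Tuple


def is_truncated_tool_call(response: Dict[str, Any]) -> bool:
    """True when a response was cut off while it carried tool calls."""
    return response.get("finish_reason") == "length" and bool(
        response.get("tool_calls")
    )


def call_with_single_retry(
    call_api: Callable[[], Dict[str, Any]],
) -> Tuple[Dict[str, Any], bool]:
    """Call the API, retrying exactly once if tool args came back truncated.

    Returns (response, refused); refused is True when the retry was also
    truncated and the caller should give up without executing tools.
    """
    response = call_api()
    if is_truncated_tool_call(response):
        response = call_api()  # one retry, then stop trying
    return response, is_truncated_tool_call(response)
```

Capping the retry at one attempt covers a transient connection drop while still failing fast when the truncation is reproducible, for example when the model's output limit is the real cause.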

@teknium1 teknium1 closed this Apr 10, 2026
@AIandI0x1 AIandI0x1 deleted the fix/issue-6638-truncated-tool-calls branch April 10, 2026 22:45
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request Apr 12, 2026
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
Development

Successfully merging this pull request may close these issues.

[Bug] Truncated streaming tool calls silently treated as valid — tool executes with broken/empty args

2 participants