fix(server): return 400 (not 500) for truncated tool-call args in input (#19) by marksverdhei · Pull Request #109 · heiervang-technologies/ht-llama.cpp

marksverdhei · 2026-06-13T19:21:34Z

Summary

Closes #19 (Part 2). Truncated tool-call arguments in the input message history now produce an actionable HTTP 400 instead of a misleading 500.

The bug

When a /v1/responses or /v1/chat/completions conversation hits the --ctx-size limit, the model can emit a tool call whose arguments JSON is cut off mid-string. The client echoes that truncated tool call back in the next request's history. func_args_not_string() (common/chat.cpp) then failed to json::parse it and threw std::runtime_error, which the server's ex_wrapper (server.cpp) maps to a generic 500 with no cause. Agentic clients (Codex CLI, etc.) cannot diagnose this and retry forever: the history only grows, so the request never recovers.
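To illustrate why the status code matters here, below is a minimal sketch of a client retry policy. It is an assumption for illustration only, not Codex CLI's actual logic: `Action`, `next_action` and `attempts_until_stop` are hypothetical names, and the policy simply treats 5xx as transient and 4xx as a request the client has to change.

```cpp
enum class Action { Retry, GiveUp, Done };

// Hypothetical client policy (an assumption, not any real client's code):
// 2xx is success, 5xx is treated as transient, everything else is final.
Action next_action(int http_status) {
    if (http_status >= 200 && http_status < 300) return Action::Done;
    if (http_status >= 500) return Action::Retry;
    return Action::GiveUp;
}

// Counts how many times an unchanged request is sent before the client stops,
// given a server that always answers with `status`. `max_attempts` caps the
// loop so the sketch terminates even when the policy would retry forever.
int attempts_until_stop(int status, int max_attempts) {
    int attempts = 0;
    while (attempts < max_attempts) {
        ++attempts;
        if (next_action(status) != Action::Retry) break;
    }
    return attempts;
}
```

Under this policy, `attempts_until_stop(500, 5)` uses all five attempts, while `attempts_until_stop(400, 5)` stops after the first, which is the behaviour the 400 is meant to encourage.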

The fix

Throw std::invalid_argument instead. ex_wrapper already maps that to 400 (ERROR_TYPE_INVALID_REQUEST), and the message now names the likely root cause and the remedy:

Invalid tool call arguments in input messages: <json parse error w/ column>.
This usually means a previous tool call was truncated because the conversation
reached the context-size limit; reduce the conversation history or increase
--ctx-size, then retry.

This matches the existing idiom in the file — OAI message-shape validation already throws std::invalid_argument for malformed input.
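For readers unfamiliar with the pattern, here is a rough sketch of how an exception type can select the HTTP status. It is not the code from server.cpp or common/chat.cpp: `status_for`, `parse_args_before` and `parse_args_after` are hypothetical stand-ins, and the "ends in `}`" check only imitates a failed json::parse on a truncated string.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the server's exception wrapper. Only the mapping
// described in this PR is modeled: std::invalid_argument -> 400, any other
// std::exception -> 500. The real wrapper also builds an error body.
template <typename Handler>
int status_for(Handler && handler) {
    try {
        handler();
        return 200;
    } catch (const std::invalid_argument &) {
        return 400; // malformed request: the client has to change its input
    } catch (const std::exception &) {
        return 500; // everything else surfaces as a generic server error
    }
}

// Imitates the old behaviour: a parse failure throws std::runtime_error.
// The trailing-brace check is a placeholder for a real JSON parse.
void parse_args_before(const std::string & args) {
    if (args.empty() || args.back() != '}') {
        throw std::runtime_error("parse error");
    }
}

// Imitates the new behaviour: the same failure throws std::invalid_argument
// with a message that points at context-size truncation.
void parse_args_after(const std::string & args) {
    if (args.empty() || args.back() != '}') {
        throw std::invalid_argument(
            "Invalid tool call arguments in input messages: parse error. "
            "This usually means a previous tool call was truncated.");
    }
}
```

With a truncated string such as `{"path": "/tmp/fi`, wrapping `parse_args_before` in `status_for` yields 500 and wrapping `parse_args_after` yields 400; a well-formed `{}` yields 200 in both cases. The catch order matters: the more specific `std::invalid_argument` handler has to come before the `std::exception` one.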

Part 1 was already done

The other half of #19 (status:"incomplete" + incomplete_details:{reason:"max_output_tokens"} when stop == STOP_TYPE_LIMIT) is already implemented in server-task.cpp (to_json_oaicompat_resp / _stream). This PR completes the remaining Part 2.

Test

Adds test_input_with_truncated_tool_call_arguments_returns_400 to test_tool_call.py. Notable: the truncated call must sit mid-history (followed by a tool result + user turn) — a trailing assistant tool_call is stripped by the continue-final-message path before the parse runs, so it wouldn't exercise the bug. Uses the Hermes-2-Pro tool_use template, which advertises object-arguments support (verified via llama-debug-template-parser).

unit/test_tool_call.py::test_input_with_truncated_tool_call_arguments_returns_400 PASSED
7 passed, 196 deselected   # full non-slow tool-call suite, no regressions

Backend rebuilt + smoke-tested live (llama-server + Hermes-2-Pro template): malformed mid-history args → HTTP 400; valid history unaffected.

When a tool_call in the input message history carries an `arguments` string that is invalid JSON (which happens when a prior /v1/responses or /v1/chat/completions reply was truncated because the conversation hit the context-size limit), func_args_not_string() threw std::runtime_error, which the server's ex_wrapper maps to a generic HTTP 500. Agentic clients (Codex CLI etc.) cannot diagnose the cause and retry endlessly because the history only grows.

Throw std::invalid_argument instead, which ex_wrapper already maps to a 400 with an actionable message pointing at context-size truncation. This matches the existing idiom in this file (oai message validation already throws std::invalid_argument for malformed input).

Adds a server regression test on the Hermes-2-Pro tool_use template (which advertises object-arguments support). The truncated call must sit mid-history followed by a tool result + user turn: a trailing assistant tool_call is stripped by the continue-final-message path before the parse runs, so it would not exercise the bug.

Part 1 of #19 (status:"incomplete" + incomplete_details on STOP_TYPE_LIMIT) already shipped in server-task.cpp; this completes Part 2. Closes #19.

marksverdhei wants to merge 1 commit into ht from fix/responses-truncated-toolcall-400
