feat: add Gemma 4 tool call parser #7449
Add a client-side parser for Gemma 4's text-based tool call format:

<|tool_call>call:func_name(arg1: "value1", arg2: "value2")<tool_call|>

Gemma 4 outputs tool calls in Python function-call-like syntax wrapped in <|tool_call> ... <tool_call|> tags. The native OpenAI-style `tools` parameter via OpenRouter often produces incomplete/truncated tool calls, making a dedicated text parser necessary.

Supported formats:
- Parenthesis syntax: call:name(key: "value", key2: 'value2')
- Brace syntax (issue NousResearch#6626): call:name{key: "value"}
- Special quote tokens: <|"|> delimiters
- Python literals: strings, ints, floats, nested dicts/lists
- Unicode/Korean content in arguments
- Multiple tool calls in one response
- Truncated/unclosed tags (graceful handling)

Tested against real Gemma 4 31B output via Nous Research API with 6 different prompt patterns (single arg, multi arg, complex JSON, multi tool call, nested args, Korean).

Closes NousResearch#6626

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
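To make the format in the commit message above concrete, here is a minimal sketch of how such a text parser could work. It is not the code in `gemma4_parser.py`: the names `parse_gemma4_tool_calls`, `parse_kwargs`, `parse_value` and `split_top_level` are hypothetical, and the regex assumes the closing tag (or the end of the text) directly follows the closing `)` or `}`.

```python
import ast
import re

# Assumed wire format (from the commit message):
#   <|tool_call>call:name(key: "value")<tool_call|>  or  call:name{key: "value"}
_CALL_RE = re.compile(
    r"<\|tool_call>\s*call:(?P<name>[\w.-]+)\s*[({](?P<args>.*?)[)}]\s*(?:<tool_call\|>|\Z)",
    re.DOTALL,
)
# One argument: optional quotes around the key, then ':' or '=', then the value.
_KEY_RE = re.compile(r"\s*['\"]?([A-Za-z_][\w-]*)['\"]?\s*[:=]\s*(.*)", re.DOTALL)


def split_top_level(text):
    """Split on commas that sit outside quotes and brackets."""
    parts, buf, depth, quote, escaped = [], [], 0, None, False
    for ch in text:  # a plain for-loop always advances, so it cannot hang
        if quote:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append("".join(buf))
            buf = []
            continue
        buf.append(ch)
    parts.append("".join(buf))
    return [p.strip() for p in parts if p.strip()]


def parse_value(raw):
    """Try a Python literal first, then fall back to bare-key dicts and lists."""
    raw = raw.strip()
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        pass
    if raw.startswith("{") and raw.endswith("}"):
        return parse_kwargs(raw[1:-1])
    if raw.startswith("[") and raw.endswith("]"):
        return [parse_value(item) for item in split_top_level(raw[1:-1])]
    return raw.strip("'\"")


def parse_kwargs(body):
    """Turn 'key: value, key2=value2' into a dict."""
    out = {}
    for part in split_top_level(body):
        m = _KEY_RE.match(part)
        if m:
            out[m.group(1)] = parse_value(m.group(2))
    return out


def parse_gemma4_tool_calls(text):
    """Return a list of {'name': ..., 'arguments': {...}} dicts."""
    text = text.replace('<|"|>', '"')  # special quote tokens become plain quotes
    return [
        {"name": m.group("name"), "arguments": parse_kwargs(m.group("args"))}
        for m in _CALL_RE.finditer(text)
    ]


calls = parse_gemma4_tool_calls('<|tool_call>call:search(query: "blockchain news")<tool_call|>')
# calls == [{'name': 'search', 'arguments': {'query': 'blockchain news'}}]
```

A call that is cut off before its closing `)` or `}` simply does not match and is skipped. The sketch shares one limitation with any quote-substitution approach: a `<|"|>`-delimited string that itself contains a `"` will be split at the wrong place.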
The fallback parser in agent_loop.py was hardcoded to only detect Hermes-format tool calls (<tool_call>) and always use the "hermes" parser. This meant other model families (Gemma 4, Kimi K2, DeepSeek, Mistral) whose tool calls went unparsed by the server would silently fail in the fallback path.

Changes:
- Auto-detect tool call markers for all registered parser formats
- Use server-configured parser (tool_parser) when available
- Fall back to marker-based auto-detection when no parser is configured
- Covers: Hermes/Qwen, Gemma 4, Kimi K2, DeepSeek V3, Mistral

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
+1 |
|
this does not fix local tool calls for gemma-4-E4B-it over vllm. @0xarkstar please see my PR to your branch 0xarkstar#1 |
|
Tested this PR locally against

1. Fallback parser isn't invoked from the production CLI path

The new

Reproduced with a worker session that did 16 successful native tool calls, then on the 17th call vLLM dropped the raw markup into

2. Parser stores nested unquoted-key dicts/lists as raw strings

When the model emits brace-syntax args containing nested objects with bare keys, e.g.:
Fix is small: replace the bare

```python
def _parse_value(value_str: str):
    value_str = value_str.strip()
    if not value_str:
        return ""
    try:
        return ast.literal_eval(value_str)
    except (ValueError, SyntaxError):
        pass
    if value_str.startswith("{") and value_str.endswith("}"):
        return _parse_kwargs_to_dict(value_str[1:-1])
    if value_str.startswith("[") and value_str.endswith("]"):
        return [_parse_value(x) for x in _split_top_level(value_str[1:-1])]
    return value_str.strip("'\"")
```

3. String parser is naïve about embedded quotes (causes a hard infinite loop)
After substitution the parser sees:
Fix: capture each

All three issues are reliably hit by tool calls with rich payloads (long shell commands, |
|
Following up on this thread: running the previous patch + fixes 1–3 against

4. Markers leak inside structured
|
| Marker (raw) | After JSON-escape inside arguments | Fix |
|---|---|---|
| `<\|"\|>` opening (full) | `"<\|\\"\|"` | replace with `"` |
| `<\|"\|>` closing (full) | `<\|\\"` | replace with `"` |
| `<\|"\|>` opening (truncated, no inner `"`) | `"<\|` | replace with `"` |
| `<\|"\|>` closing (truncated, no inner `\`) | `<\|"` | replace with `"` |
A small companion to the existing block in `agent/transports/chat_completions.py::normalize_response` handles all four, with a guard to never silently rewrite already-valid JSON:
```python
if tool_calls:
    import json as _json_for_validation

    def _strip_gemma4_quote_markers(s: str) -> str:
        if not s or "<|" not in s:
            return s
        return (s
                .replace('"<|\\\\"|"', '"')  # full open:  "<|\\"|" → "
                .replace('<|\\\\"', '"')     # full close: <|\\" → "
                .replace('"<|', '"')         # bare open:  "<| → "
                .replace('<|"', '"'))        # bare close: <|" → "

    for tc in tool_calls:
        args_str = tc.arguments
        if not args_str or "<|" not in args_str:
            continue
        try:
            _json_for_validation.loads(args_str)
            continue  # already valid; never mutate
        except (ValueError, TypeError):
            pass
        cleaned = _strip_gemma4_quote_markers(args_str)
        if cleaned == args_str:
            continue
        try:
            _json_for_validation.loads(cleaned)
        except (ValueError, TypeError):
            continue  # cleaned still broken; leave original for downstream error
        tc.arguments = cleaned
```
Validated against 14 tool_calls captured from a failing gemma4 + NaN Builders session: 12 already-valid args untouched, 2 broken args (both `kanban_complete` payloads with nested `metadata.findings`) parse cleanly post-sanitization, 0 collateral damage.
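For anyone who wants to poke at the guard outside the transport code, here is a standalone sketch of the same validate, clean, re-validate logic. The wrapper name `sanitize_arguments` and the three sample payloads are made up for illustration; they are not taken from the captured session, and the exact shape of real broken payloads may differ.

```python
import json


def strip_gemma4_quote_markers(s):
    # Same four replacements as the snippet above, longest patterns first.
    return (s
            .replace('"<|\\\\"|"', '"')
            .replace('<|\\\\"', '"')
            .replace('"<|', '"')
            .replace('<|"', '"'))


def sanitize_arguments(args_str):
    """Return cleaned arguments only if the original is invalid JSON
    and the cleaned version is valid JSON; otherwise return the input."""
    if not args_str or "<|" not in args_str:
        return args_str
    try:
        json.loads(args_str)
        return args_str  # already valid; never mutate
    except (ValueError, TypeError):
        pass
    cleaned = strip_gemma4_quote_markers(args_str)
    try:
        json.loads(cleaned)
    except (ValueError, TypeError):
        return args_str  # still broken; leave it for the downstream error
    return cleaned


# Hypothetical payloads, for illustration only.
valid_with_marker = '{"note": "a <| b"}'         # valid JSON that merely contains "<|"
broken = r'{"note": "<|\\"|"hello<|\\"}'        # leaked full open/close markers
hopeless = '{"note": "<|oops'                    # truncated; cleaning cannot fix it

print(sanitize_arguments(valid_with_marker))  # unchanged
print(sanitize_arguments(broken))             # {"note": "hello"}
print(sanitize_arguments(hopeless))           # unchanged
```

The first case is the reason for the guard: a bare `"<|` can legitimately appear inside a valid string, so the replacements only run once `json.loads` has already rejected the payload.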
|
Closing this PR. Two material things have changed since 2026-04:
Thanks @spprod35, @raritytiks, @Naroh091 for testing and the detailed bug reports. The work is consolidating around #19887; please direct future fixes there. |
Summary
Adds a client-side text parser for Gemma 4's tool call format, enabling reliable tool calling for Gemma 4 models (31B, 26B) in Hermes Agent.
Problem
Gemma 4 outputs tool calls in a unique text format:

<|tool_call>call:func_name(arg1: "value1", arg2: "value2")<tool_call|>
Currently, Hermes has no parser for this format. When using Gemma 4 via Nous API / OpenRouter:
- The `tool_calls` field is always `None` (native tool calling doesn't work reliably through the proxy)
- The tool call is returned as plain text in `message.content`

This was reported in #6626.
Solution
New `gemma4` parser (`environments/tool_call_parsers/gemma4_parser.py`) that extracts structured `ChatCompletionMessageToolCall` objects from Gemma 4's text output.

Supported formats:
- Parenthesis syntax: `call:name(key: "value", key2: 'value2')`
- Brace syntax: `call:name{key: "value"}`
- Special quote tokens: `<|"|>` delimiters (as reported in Gemma 4 tool calling support (parser availability & required configuration) #6626)

Testing
Real-world validation against Gemma 4 31B via Nous Research API:
- `call:search(query: "blockchain news")`
- `call:search(query: 'blockchain news')`
- `call:send_message(target='#ann', message='Hello')`
- `call:add_item(database='M', name='Alice', priority=1)`
- `call:create_db(title='M', props={'name': 'text'})`
- `call:send(target='research_team', message='hello world')`
- `call:status()`
- `<|tool_call>` blocks
- `call:sf{pattern: "/var", target: "f"}`

Unit tests: 17 test cases added to `tests/tools/test_tool_call_parsers.py`

Changes
- `environments/tool_call_parsers/gemma4_parser.py`: new parser (234 lines)
- `environments/tool_call_parsers/__init__.py`: register `Gemma4ToolCallParser`
- `tests/tools/test_tool_call_parsers.py`: 17 test cases for `TestGemma4Parser`

Context
Discovered while building a Discord bot using Hermes + Gemma 4 31B via Nous API. The bot would describe tool calls in text instead of executing them, causing repeated retry loops. Switching to Qwen (which has a parser) worked immediately, confirming the parser gap.
Closes #6626
🤖 Generated with Claude Code
Additional fix: model-aware fallback parser in agent_loop.py
The fallback parser in `agent_loop.py:268-289` was hardcoded to only detect `<tool_call>` markers and always use the `"hermes"` parser. This meant all other model families (Gemma 4, Kimi K2, DeepSeek V3, Mistral) whose tool calls went unparsed by the server would silently fail in the fallback path.

Changes:
- Auto-detect tool call markers for all registered parser formats
- Use server-configured parser (`tool_parser`) when available
- Fall back to marker-based auto-detection when no parser is configured
- Covers: Hermes/Qwen (`<tool_call>`), Gemma 4 (`<|tool_call>`), Kimi K2 (`<|tool_calls_section_begin|>`), DeepSeek V3 (`<|tool▁calls▁begin|>`), Mistral (`[TOOL_CALLS]`)

This benefits not only Gemma 4 but all model families when the server-side parser fails or is unavailable.
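The selection logic described above can be sketched roughly as follows. This is an illustration, not the code in `agent_loop.py`: the function name `pick_parser` and the table `MARKER_TO_PARSER` are hypothetical, and of the parser names only `"hermes"` and `"gemma4"` appear in this PR; the other three are placeholders.

```python
from typing import Optional

# Marker strings are the ones listed in the PR description.
# Parser names other than "hermes" and "gemma4" are assumed for illustration.
MARKER_TO_PARSER = [
    ("<|tool_calls_section_begin|>", "kimi_k2"),
    ("<|tool▁calls▁begin|>", "deepseek_v3"),
    ("<|tool_call>", "gemma4"),
    ("[TOOL_CALLS]", "mistral"),
    ("<tool_call>", "hermes"),
]


def pick_parser(content: str, configured: Optional[str] = None) -> Optional[str]:
    """Prefer the server-configured parser; otherwise detect by marker."""
    if configured:
        return configured
    for marker, parser_name in MARKER_TO_PARSER:
        if marker in content:
            return parser_name
    return None  # no known marker: treat the message as plain text


print(pick_parser('<|tool_call>call:status()<tool_call|>'))        # gemma4
print(pick_parser('<tool_call>{"name": "status"}</tool_call>'))    # hermes
print(pick_parser('<tool_call>{}</tool_call>', configured="gemma4"))  # gemma4
print(pick_parser("no tool call here"))                            # None
```

The Hermes marker `<tool_call>` is not a substring of Gemma 4's `<|tool_call>` or `<tool_call|>`, so the two cannot be confused by a plain substring check; the more specific markers are still listed first as a precaution.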