Skip to content

feat: validate-then-repair layer for semantic tool argument issues#19652

Closed
NikolayGusev-astra wants to merge 1 commit into
NousResearch:mainfrom
NikolayGusev-astra:validate-then-repair-tool-args
Closed

feat: validate-then-repair layer for semantic tool argument issues#19652
NikolayGusev-astra wants to merge 1 commit into
NousResearch:mainfrom
NikolayGusev-astra:validate-then-repair-tool-args

Conversation

@NikolayGusev-astra

Copy link
Copy Markdown

Problem

LLMs (especially open-weight models like DeepSeek, Qwen, GLM) systematically produce tool call arguments with 5 classes of semantic malformations that pass JSON syntax validation but violate the tool argument schema:

  1. null on optional fields — instead of omitting the field
  2. JSON-encoded string where array expected — causes silent data loss when json.loads fails and repair layer returns empty object
  3. Bare scalar where array expected — single value not wrapped in array
  4. Markdown links in path fields[notes.md](http://notes.md) instead of notes.md
  5. Relational invariant violations — offset without limit (or vice versa)

All 5 patterns currently pass through without feedback to the model, causing silent data loss, false successes, or confusing error messages.

Solution

A validate-then-repair layer in handle_function_call() that:

  • Checks each argument against the tool's registered JSON Schema
  • Only touches fields that violate the schema (valid data passes through unchanged)
  • Applies one of 5 targeted repairs
  • Prepends a [validate-then-repair: ...] note to the tool result so the model learns from its mistakes
  • Logs every repair at INFO level for telemetry by (tool_name, repair_type)

Key design choices

  • validate-then-repair, not preprocessing: the schema itself locates the problem, repair budget is spent only where needed
  • Repairs are ordered: null->omit runs before string->array / bare->array to avoid wrapping None in arrays
  • The existing _repair_tool_call_arguments in run_agent.py (JSON syntax: trailing commas, unclosed brackets, control chars) is complementary — this layer handles semantic issues

Testing

  • Syntax-verified with ast.parse
  • All 5 repair functions independently tested conceptually against known failure modes from production use with DeepSeek V4 Flash

Related

Blog post detailing the problem analysis: https://telegra.ph/Cursor-vsyo-slomal-no-chinit-nado-ne-Cursor-validate-then-repair-protiv-razryva-svyazej-v-AI-agentah-05-04

Adds a validate-then-repair layer in handle_function_call that catches
and fixes 5 classes of semantic tool argument malformations before they
reach tool handlers:

1. null_on_optional - removes None values from optional fields
2. string_as_array - parses JSON-encoded strings as arrays
3. bare_as_array - wraps single scalars in single-element arrays
4. markdown_link - strips markdown link syntax from path fields
5. missing_with_offset / missing_with_limit - defaults for relational pairs

Design: validate-then-repair (not preprocess) - only schema-violating
fields are touched. Repair notes are prepended to tool results so the
model learns from its mistakes. INFO-level logging for telemetry.

Reference: https://telegra.ph/Cursor-vsyo-slomal-no-chinit-nado-ne-Cursor-validate-then-repair-protiv-razryva-svyazej-v-AI-agentah-05-04
@alt-glitch alt-glitch added type/feature New feature or request P2 Medium — degraded but workaround exists comp/tools Tool registry, model_tools, toolsets labels May 4, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Note: This PR adds a standalone model_tools.py file rather than integrating into the existing _repair_tool_call_arguments flow in run_agent.py. Consider whether this should be a method in the existing repair pipeline or a separate module.

teknium1 pushed a commit that referenced this pull request May 4, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from #19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
@teknium1

teknium1 commented May 4, 2026

Copy link
Copy Markdown
Contributor

Thanks for the contribution and the detailed analysis in your blog post — the bare-scalar-as-array pattern is real and worth fixing.

Salvaged the genuinely new capability (wrapping bare scalars into single-element lists when the schema declares type: array) into #19717, which just merged as commit fdf9343 with your authorship preserved.

The broader validate-then-repair layer had a few issues that kept us from merging the full PR:

  • String→array / null→coerce — already handled by the existing coerce_tool_args + _coerce_value path in model_tools.py (_coerce_json(value, list) parses JSON strings, _schema_allows_null handles nullable "null").
  • Markdown-in-pathold_string is the patch tool's text-to-find argument, not a path. Stripping markdown link syntax from it would silently corrupt any patch that edits markdown files.
  • Relational offset/limit defaults — hardcoding offset=1 / limit=500 matches read_file but is incorrect for any other tool that happens to share those parameter names.
  • Prefix-injected repair notes on the tool result — many tools return JSON strings; prepending [validate-then-repair: ...]\n turns the result into non-JSON and breaks downstream consumers (compression, plugins, telemetry) that do json.loads(result).
  • Bare→array wrapping on None — if the field is required, _repair_null_to_omit leaves None in place and _repair_bare_to_array then wraps it to [None], turning a clear error into a confusing one.

The surviving capability lives directly in the existing coercion path — ~24 LOC + 7 tests — and you're the commit author. Thanks again!

@teknium1 teknium1 closed this May 4, 2026
nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from NousResearch#19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
rmulligan pushed a commit to rmulligan/hermes-agent that referenced this pull request May 11, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from NousResearch#19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from NousResearch#19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from NousResearch#19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from NousResearch#19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…args

Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from NousResearch#19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tools Tool registry, model_tools, toolsets P2 Medium — degraded but workaround exists type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants