fix(patch-tool): per-mode required-param hints + per-turn file-mutation verifier footer by teknium1 · Pull Request #22149 · NousResearch/hermes-agent

teknium1 · 2026-05-09T00:17:04Z

Models no longer silently over-claim file edits. Two layers, one PR:

Layer 1 — salvaged from #15673 (@briandevans): Advertises per-mode required params in the patch schema's descriptions so strict tool callers (kimi-k2.x in #15524, grok-4.1-fast reported separately) stop omitting old_string/new_string/patch. JSON Schema's required array stays ["mode"] because anyOf/oneOf at parameters level breaks Anthropic, Fireworks, and Kimi — description-level REQUIRED when mode='…' hints are the only cross-provider-compatible mechanism.

Layer 2 (new): Per-turn verifier that catches over-claiming after the fact. Every write_file / patch call's outcome is tracked per resolved path; failures that are never superseded by a later successful write to the same path are surfaced in a footer appended to the assistant's final response:

⚠️ File-mutation verifier: 3 file(s) were NOT modified this turn despite any wording above that may suggest otherwise. Run `git status` or `read_file` to confirm.
  • concepts/automatic-organization.md — [patch] Could not find match for old_string
  • concepts/lora.md — [patch] Could not find match for old_string
  • concepts/rag-pipeline.md — [patch] Could not find match for old_string

Regression target: Ben Eng's llm-wiki session where grok-4.1-fast batched parallel patches, half failed with 'Could not find old_string', and the model summarised the turn claiming every file was edited — forcing the user to manually run git status every turn.

Changes

tools/file_tools.py — PATCH_SCHEMA description + per-property REQUIRED when mode='…' hints (cherry-pick from fix(patch-tool): advertise per-mode required params in schema descriptions #15673)
run_agent.py — _FILE_MUTATING_TOOLS const, _extract_file_mutation_targets (handles write_file, patch-replace, and patch-v4a single + multi-file), _extract_error_preview, AIAgent._record_file_mutation_result, AIAgent._file_mutation_verifier_enabled, AIAgent._format_file_mutation_failure_footer. Recording wired into both _execute_tool_calls_concurrent and _execute_tool_calls_sequential so parallel and sequential batches are both covered. Footer emission runs at turn end, before transform_llm_output / post_llm_call plugin hooks, so plugins still see (and can modify) the augmented text.
hermes_cli/config.py — display.file_mutation_verifier default True
tests/run_agent/test_file_mutation_verifier.py — 31 unit tests (target extraction, error-preview extraction, state transitions, footer rendering, env/config precedence)
tests/tools/test_file_tools.py — 9 TestPatchSchemaShape assertions from fix(patch-tool): advertise per-mode required params in schema descriptions #15673
website/docs/user-guide/configuration.md + website/docs/reference/environment-variables.md — display.file_mutation_verifier + HERMES_FILE_MUTATION_VERIFIER

Validation

	Before	After
Strict tool caller omits `old_string`	Schema allows it (`required: ["mode"]`), tool returns `old_string required`, model loops	Description says `REQUIRED when mode='replace'`, model includes the field
Batch of parallel patches, half fail	Model summarises 'patched 5 files', user has to run `git status` to find the lie	Footer lists every file that did NOT change; user sees the truth on every turn
Model retries failed patch and recovers	Footer would have listed it	Success removes the path from the state dict; no false positive
Tests: `tests/tools/test_file_tools.py` + `tests/run_agent/test_file_mutation_verifier.py` + `test_steer.py` + `test_tool_call_args_sanitizer.py`	N/A (verifier didn't exist)	95 / 95 passing

Notes

Zero cost: no LLM calls, no tool schema changes (schema layer is just descriptions).
Prompt-cache safe: no messages injected into history, footer is appended to the returned string only.
Message-alternation safe: no synthetic user/tool messages introduced.
Opt-out: set display.file_mutation_verifier: false in config.yaml or HERMES_FILE_MUTATION_VERIFIER=0.
V4A multi-file patches: parsed for *** Update File: / *** Add File: / *** Delete File: headers so each file in a multi-file patch is tracked separately.

Closes #15524
Cherry-picks and credits @briandevans (PR #15673).

…tions Models that enforce required-only constraints (e.g. kimi-k2.x) were omitting old_string/new_string for replace mode and patch for patch mode because the schema only declared required: ["mode"]. Add explicit "REQUIRED when mode='X'" markers to each conditionally-required property description and a top-level "REQUIRED PARAMETERS: ..." summary for each mode. Avoids anyOf/oneOf which break Anthropic, Fireworks, and Kimi/Moonshot providers. Add TestPatchSchemaShape to lock the shape. Fixes #15524 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Detect when write_file / patch calls fail during a turn and are never superseded by a successful write to the same path. When the final text response is delivered, append an advisory footer listing the files that did NOT change — so models that over-claim 'patched 5 files' after 4 silent failures can't hide the lie. Catches the failure mode reported in Ben Eng's llm-wiki session: grok-4.1-fast issued batches of parallel patches, half failed with 'Could not find old_string', and the agent summarised the turn claiming every file was edited. The user had to manually run 'git status' each turn to catch it. The verifier is a pure post-hoc check on tool results — no new LLM calls, no synthetic messages injected into history (prompt cache preserved), no changes to tool argument dispatch. Per-turn state is keyed by path; a later successful write to the same path clears the failure entry so single-file retry recovery is not flagged. Wired into both _execute_tool_calls_concurrent and _execute_tool_calls_sequential, so batched parallel patches and one-at- a-time edits are both covered. Footer emission happens after the agent loop exits, before transform_llm_output / post_llm_call plugin hooks run, so plugins still see (and can modify) the augmented text. Config: display.file_mutation_verifier (bool, default true) + HERMES_FILE_MUTATION_VERIFIER env override. 31 unit tests in tests/run_agent/test_file_mutation_verifier.py cover target extraction (write_file, patch-replace, patch-v4a single and multi-file), error-preview extraction (JSON .error field and plain string), per-turn state transitions (first-error-wins on repeated failure, success supersedes failure), footer rendering (truncation at 10 entries, user-actionable hint), and env/config precedence. Companion docs updated: user-guide/configuration.md + reference/environment-variables.md.

teknium1 · 2026-05-12T18:49:51Z

Superseded by #24498 — same Layer 2 (verifier) commit, rebuilt onto current main with the docs conflict resolved.

Layer 1 of this PR (the patch-tool schema-description fix cherry-picked from #15673) was independently salvaged and merged as 3adcc64 on Apr 25, which is why this branch went stale. #24498 carries only the substantive verifier commit.

briandevans and others added 2 commits May 8, 2026 16:57

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder comp/tools Tool registry, model_tools, toolsets tool/file File tools (read, write, patch, search) labels May 9, 2026

teknium1 mentioned this pull request May 12, 2026

feat(agent): per-turn file-mutation verifier footer #24498

Merged

teknium1 closed this May 12, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(patch-tool): per-mode required-param hints + per-turn file-mutation verifier footer#22149

fix(patch-tool): per-mode required-param hints + per-turn file-mutation verifier footer#22149
teknium1 wants to merge 2 commits into
mainfrom
hermes/hermes-3b90958e

teknium1 commented May 9, 2026

Uh oh!

teknium1 commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

teknium1 commented May 9, 2026

Changes

Validation

Notes

Uh oh!

teknium1 commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants