feat(patch): indent preservation, CRLF preservation, per-file failure escalation#32273

Merged: teknium1 merged 1 commit into main from hermes/hermes-be45c600 on May 25, 2026

Conversation

@teknium1 (Contributor) commented May 25, 2026

Three granular patch-tool refinements from the Roo Code deep-dive (#507). Items 1, 2d, 2e, 3, 4 of the issue were already shipped in earlier work; this picks up the three that hadn't been done yet and that still survived an audit.

What this PR makes true

  • Patching with a non-exact fuzzy strategy no longer corrupts file indentation when the LLM's tool args don't match the file's indent base.
  • write_file and patch preserve Windows-line-ending (CRLF) files instead of silently normalizing them to LF (or producing mixed endings).
  • After 3 consecutive failed patches to the same file, the agent gets an escalating hint instead of the same 'old_string not found' message every time.

Changes

Indentation preservation (tools/fuzzy_match.py)

fuzzy_find_and_replace now passes the original old_string through to _apply_replacements whenever the matched strategy is non-exact. A new _reindent_replacement helper computes the indent delta between old_string's first meaningful line and the matched region's first meaningful line, then shifts every line of new_string by that delta.

  • Exact-strategy matches: untouched (passthrough)
  • Blank lines in new_string: untouched
  • Lines less-indented than the LLM's base (dedent at start of new_string): anchored to the file's base indent
  • Same approach as Roo Code's multi-search-replace.ts:466-500
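
The rules above can be sketched roughly as follows. This is a minimal illustration, not the code from tools/fuzzy_match.py: the names `reindent_replacement`, `_leading_ws`, and `_first_meaningful_indent` are hypothetical, and it assumes the "delta" is applied by swapping the LLM's base indent prefix for the file's base indent prefix.

```python
def _leading_ws(line: str) -> str:
    """Return the leading spaces/tabs of a line."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def _first_meaningful_indent(text: str) -> str:
    """Indent of the first non-blank line, or '' if every line is blank."""
    for line in text.splitlines():
        if line.strip():
            return _leading_ws(line)
    return ""


def reindent_replacement(old_string: str, matched_region: str, new_string: str) -> str:
    """Shift new_string from the LLM's indent base to the file's indent base.

    Hypothetical helper: only meant for non-exact matches, where old_string
    and the matched region of the file can disagree on indentation.
    """
    llm_base = _first_meaningful_indent(old_string)
    file_base = _first_meaningful_indent(matched_region)
    if llm_base == file_base:
        return new_string  # nothing to shift

    out = []
    for line in new_string.splitlines(keepends=True):
        if not line.strip():
            out.append(line)  # blank lines are left untouched
            continue
        indent = _leading_ws(line)
        body = line[len(indent):]
        if indent.startswith(llm_base):
            # Keep whatever indent the line has beyond the LLM's base.
            out.append(file_base + indent[len(llm_base):] + body)
        else:
            # Less indented than the LLM's base: anchor to the file's base.
            out.append(file_base + body)
    return "".join(out)
```

For the zero-indent case, `reindent_replacement("def f(self):\n    return 1\n", "        def f(self):\n            return 1\n", "def f(self):\n    return 2\n")` lands the `def` at 8 spaces and the body at 12.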

CRLF preservation (tools/file_operations.py)

Two new pure helpers (_detect_line_ending, _normalize_line_endings) plus a _detect_file_line_ending method on ShellFileOperations that uses already-read pre_content when available (zero extra exec) or falls back to head -c 4096.

  • write_file: if the file existed with CRLF, convert content to CRLF before write
  • patch_replace: normalize new_content to the original file's detected line ending before write (also fixes the unified diff output, which would otherwise show spurious line-ending changes)
  • New files (no pre-existing endings): write content verbatim
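
A rough sketch of the detect-then-normalize flow described above. The function names here are hypothetical stand-ins for the PR's helpers, and the detection policy (majority vote between CRLF and bare LF over the first 4096 characters) is an assumption; the PR text does not spell out how mixed-ending samples are classified.

```python
from typing import Optional


def detect_line_ending(sample: str) -> Optional[str]:
    """Return '\\r\\n' or '\\n' for the sample, or None if it has no line break.

    Assumption: the more frequent ending wins; ties fall back to LF.
    """
    crlf = sample.count("\r\n")
    bare_lf = sample.count("\n") - crlf
    if crlf == 0 and bare_lf == 0:
        return None
    return "\r\n" if crlf > bare_lf else "\n"


def normalize_line_endings(content: str, ending: str) -> str:
    """Rewrite every CRLF/LF line break in content to the given ending."""
    unified = content.replace("\r\n", "\n")  # collapse to LF first
    if ending == "\r\n":
        return unified.replace("\n", "\r\n")
    return unified


def prepare_write(new_content: str, existing_content: Optional[str]) -> str:
    """Pick the content to write, preserving the existing file's endings.

    existing_content is None for a new file, which is written verbatim.
    """
    if existing_content is None:
        return new_content
    ending = detect_line_ending(existing_content[:4096])
    if ending is None:
        return new_content  # nothing to preserve
    return normalize_line_endings(new_content, ending)
```

Collapsing to LF before expanding avoids turning an existing `\r\n` into `\r\r\n` when the incoming content already contains some CRLF breaks.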

Per-file failure escalation (tools/file_tools.py)

New _patch_failure_tracker dict (with helpers _record_patch_failure, _reset_patch_failures). After 3 consecutive 'Could not find' failures on the same resolved path, the existing _hint is replaced with:

'This is failure #N patching X. Stop retrying with variations of the same old_string. Either: (1) re-read the file fresh, (2) use a longer / more unique old_string with surrounding context lines, or (3) use write_file to replace the entire file if the targeted region is hard to anchor.'

Counter resets on a successful patch. Per-path and per-task isolation. Capped at 64 distinct failing paths per task to avoid unbounded growth in long sessions.

Validation

| Scenario | Before | After |
| --- | --- | --- |
| Patch with 2-space LLM input on 8-space-indented file | Silently produces broken indent (def at 2-space, body at 4-space) | def at 8-space, body at 12-space, valid Python |
| Patch a CRLF .ini with LF tool args | Mixed CRLF/LF endings on disk | All CRLF preserved |
| write_file overwriting a CRLF file with LF content | Silently normalized to LF | Converted to CRLF before write |
| 3rd failed patch on same path | Same 'old_string not found' hint as attempt 1 | Escalating hint with failure count and three concrete recovery options |

  • 22 new tests across test_fuzzy_match.py (5), test_line_ending_preservation.py (12), test_patch_failure_tracking.py (5) — all pass
  • All existing tests in the touched files still pass (165/165)
  • E2E verified with real _handle_patch / _handle_write_file calls against real CRLF files and real failure loops

Out of scope (declined after audit)

  • 2b (start_line hint for patch): schema bloat for a problem the existing 'multiple matches' contract already handles
  • 5 (behavioral rules): conflicts with the personality system; the issue itself flags this concern

Items in #507 that were already shipped

| # | Item | Status |
| --- | --- | --- |
| 1 | head/tail truncation | tools/terminal_tool.py:2131-2139 does 40/60 split; sandbox-persist for tool results in tool_result_storage.py |
| 2d | unicode normalization | fuzzy_match.py:36-47 UNICODE_MAP + strategy 7 |
| 2e | detailed error messages | format_no_match_hint 'Did you mean one of these sections?' snippet |
| 3 | anti-hallucination | prompt_builder.py:286-344 `<tool_persistence>`/`<mandatory_tool_use>`//`<missing_context>` |
| 4 | task methodology | same block: `<prerequisite_checks>`, |

Closes part of #507.

Infographic

[Image: PR #32273 patch tool refinements]

…ilure escalation (#507)

Three granular patch-tool refinements from the Roo Code deep-dive (#507).

## Indentation preservation (fuzzy_match.py)

When fuzzy_find_and_replace matches via a non-exact strategy, the file's
indentation may differ from what the LLM sent in old_string/new_string
(common case: model sends zero-indent old/new for a method body that
lives inside an 8-space-indented class). Before this commit the
replacement was spliced in verbatim, producing a file with a broken
indent level that may still parse but is logically wrong.

The fix computes the indent delta between old_string's first meaningful
line and the matched region's first meaningful line, then re-indents
every line of new_string by that delta. Exact-strategy matches are
untouched (passthrough). Same approach as Roo Code's
multi-search-replace.ts:466-500.

## CRLF preservation (file_operations.py)

Models nearly always send tool args with bare LF endings (JSON-encoded),
but the file on disk may have CRLF (Windows-line-ending configs, .bat,
.cmd, .ini files). Before this commit:

- write_file silently normalized CRLF to LF on every overwrite
- patch produced mixed-ending files: the substituted region had LF,
  the surrounding context kept CRLF

The fix detects the file's existing line endings (via pre_content if
already read for lint/LSP, otherwise a tiny head -c 4096 probe), and
normalizes the entire write to that ending. New files are written
verbatim (no detection possible).

## Per-file failure escalation (file_tools.py)

When the agent fails to patch the same file 3+ times in a row, the
existing 'old_string not found' hint isn't strong enough — the model
keeps retrying with variations against a stale view of the file.

The fix tracks consecutive failures per (task_id, resolved_path) and
injects an escalating hint after 3 failures: 'This is failure #N
patching X. Stop retrying. Either re-read fresh, use longer context,
or fall back to write_file.' Counter resets on a successful patch to
the same path.

## Validation

- 22 new tests across tests/tools/test_fuzzy_match.py (5),
  test_line_ending_preservation.py (12), test_patch_failure_tracking.py (5)
- All existing tests pass (165/165 in the touched files)
- E2E verified with real _handle_patch / _handle_write_file calls
  against real CRLF files and real failure loops

Closes part of #507. The remaining open items in #507 (2b start_line
hint, behavioral rules) were declined after audit:
- 2b adds schema bloat for a problem the existing 'multiple matches'
  contract already handles
- Behavioral rules conflict with the personality system

Items 1, 2d, 2e, 3, 4 of #507 were already landed in earlier work.

@github-actions (Contributor) commented

🔎 Lint report: hermes/hermes-be45c600 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9352 on HEAD, 9350 on base (🆕 +2)

🆕 New issues (2):

| Rule | Count |
| --- | --- |
| unresolved-import | 2 |

First entries:

- tests/tools/test_line_ending_preservation.py:15: [unresolved-import] Cannot resolve imported module `pytest`
- tests/tools/test_patch_failure_tracking.py:14: [unresolved-import] Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 4949 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch added the labels type/feature (New feature or request), P2 (Medium: degraded but workaround exists), and tool/file (File tools: read, write, patch, search) on May 25, 2026
@teknium1 merged commit 6bd0be3 into main on May 25, 2026 (26 checks passed)
@teknium1 deleted the hermes/hermes-be45c600 branch on May 25, 2026 22:18
bridge25 pushed a commit to bridge25/hermes-agent that referenced this pull request May 27, 2026
…ilure escalation (NousResearch#507) (NousResearch#32273)

mathias3 pushed a commit to mathias3/hermes-agent that referenced this pull request May 28, 2026
…ilure escalation (NousResearch#507) (NousResearch#32273)

Bryce-huang pushed a commit to wbkunlun/hermes-agent that referenced this pull request May 29, 2026
…ilure escalation (NousResearch#507) (NousResearch#32273)

#AI commit#
mosaiq-systems pushed a commit to mosaiq-systems/hermes-agent that referenced this pull request May 29, 2026
…ilure escalation (NousResearch#507) (NousResearch#32273)

teddyjfpender added a commit to teddyjfpender/superforecasting-agent that referenced this pull request May 30, 2026
…tion (NousResearch#32273)

Port three patch-tool reliability refinements from upstream's Roo Code
deep-dive (NousResearch#507/NousResearch#32273), adapted to our diverged file tooling.

Indentation preservation (tools/fuzzy_match.py): when fuzzy_find_and_replace
matches via a non-exact strategy, re-indent new_string by the delta between
old_string's first meaningful line and the matched region's, so a zero-indent
old/new from the model lands at the file's actual indent depth instead of
splicing a broken indent level. Exact matches pass through untouched.

CRLF preservation (tools/file_operations.py): detect the file's existing line
endings (via pre_content if already read for lint/LSP, else a head -c 4096
probe) and normalize the whole write to that ending. Stops write_file from
silently flipping a CRLF file to LF, and stops patch from producing mixed
endings when only a substituted region changed. New files written verbatim.

Per-file failure escalation (tools/file_tools.py): track consecutive patch
failures per (task_id, resolved_path); after 3 in a row on the same path,
inject an escalating hint telling the model to stop retrying variations and
re-read / widen old_string / fall back to write_file. Counter resets on a
successful patch. Bounded to 64 distinct paths per task.

22 new tests (test_fuzzy_match +6, test_line_ending_preservation +12,
test_patch_failure_tracking +5); existing file_operations/file_tools/
patch suites stay green (143 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
teddyjfpender added a commit to teddyjfpender/superforecasting-agent that referenced this pull request May 30, 2026
…ted (Wave 4 patch reliability)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…ilure escalation (NousResearch#507) (NousResearch#32273)
