Skip to content

fix(tools): unescape common sequences in new_string when escape_normalized matches#33737

Closed
liuhao1024 wants to merge 1 commit into
NousResearch:mainfrom
liuhao1024:fix/patch-tool-unescape-new-string
Closed

fix(tools): unescape common sequences in new_string when escape_normalized matches#33737
liuhao1024 wants to merge 1 commit into
NousResearch:mainfrom
liuhao1024:fix/patch-tool-unescape-new-string

Conversation

@liuhao1024

Copy link
Copy Markdown
Contributor

Summary

When the patch tool's escape_normalized matching strategy succeeds, old_string contained literal \\t, \\n, \\r sequences that were unescaped to match real control characters in the file. However, new_string was written as-is, leaving literal backslash sequences (two characters: \ + t) in the patched file instead of real tab/newline bytes.

Root cause: The _strategy_escape_normalized function unescapes old_string for matching but new_string passes through to _apply_replacements without the same treatment.

Fix: Add _unescape_common_sequences() helper and apply it to new_string only when the matching strategy is escape_normalized. Exact matches are unaffected — literal backslashes are preserved when the file genuinely contains them.

Fixes #33733

Changes

  • tools/fuzzy_match.py: Added _unescape_common_sequences() helper; modified replacement block to unescape new_string when strategy_name == "escape_normalized"
  • tests/tools/test_fuzzy_match.py: Added TestEscapeNormalizedNewString class with 5 regression tests (tab, newline, CR, mixed sequences, exact-match preservation)

Testing

45 passed in 0.81s

All existing tests continue to pass. 5 new regression tests verify:

  • Tab (\\t) in new_string is unescaped to real tab byte
  • Newline (\\n) in new_string is unescaped to real newline
  • Carriage return (\\r) in new_string is unescaped
  • Mixed escape sequences are all unescaped together
  • Exact-match strategy preserves literal backslashes (no unescaping)

Code Intelligence

  • Analyzed: tools/fuzzy_match.py:fuzzy_find_and_replace (callers: tools/patch_tool.py, tests/tools/test_fuzzy_match.py)
  • Blast radius: LOW — only affects new_string processing when escape_normalized strategy matches; exact matches and all other strategies are unaffected
  • Related patterns: _detect_escape_drift (guards \\'/\\" drift), _strategy_escape_normalized (unescapes old_string for matching)

…lized matches

When the patch tool matches via the escape_normalized strategy, old_string
contains literal \t, \n, \r sequences that get unescaped to match real
control characters in the file. However, new_string was written as-is,
leaving literal backslash sequences in the output.

Add _unescape_common_sequences() helper and apply it to new_string when
the matching strategy is escape_normalized. This ensures LLM-generated
tab/newline sequences become real bytes in the patched file.

Fixes NousResearch#33733
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/tools Tool registry, model_tools, toolsets tool/file File tools (read, write, patch, search) labels May 28, 2026
@teknium1

Copy link
Copy Markdown
Contributor

Merged via PR #33799. Your commit was cherry-picked onto current main with your authorship preserved in git log (commit e9f3f2b).

The follow-up commit widens the fix so it also catches the issue's headline reproduction — when old_string carries a real tab byte and matches under the exact strategy rather than escape_normalized. Instead of gating on the match strategy, the helper now decides per-sequence by checking whether the matched region of the file actually contains the corresponding control byte. That mirrors the region-based heuristic already in _detect_escape_drift and preserves legitimate writes of the literal two-character string "\t" (e.g. patching sep = "\t" in Python source).

Thanks for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tools Tool registry, model_tools, toolsets P2 Medium — degraded but workaround exists tool/file File tools (read, write, patch, search) type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

patch tool: new_string escape sequences (\t) get written literally

3 participants