Overview
A deep architectural comparison between Roo Code (Apache 2.0, TypeScript VS Code AI agent) and Hermes Agent reveals several subtle but high-impact patterns we should adopt. These are not new features — they are refinements to existing systems that improve reliability, quality, and developer experience.
Each section below is independently implementable. They're grouped in one issue because they share a common origin (the Roo Code comparison), but could be split into separate PRs.
Related issues with Roo Code context added via comments: #499 (Context Compaction — condensation prompt), #481 (Loop Guard — repetition detection), #476 (Agent Modes), #344 (Multi-Agent — orchestrator pattern), #452 (Checkpoints — shadow git approach)
1. Tool Output Truncation: Head/Tail Split (HIGH IMPACT)
Current state: run_agent.py:2606-2613 — Global 100K char limit, HEAD-ONLY truncation:
function_result = function_result[:MAX_TOOL_RESULT_CHARS]
Problem: Build errors, test failures, and stack traces usually appear at the END of output. Head-only truncation loses exactly the information the agent needs most. A make build with 120K chars of success output followed by a 5K error message will show only success output and lose the error.
Roo Code's approach: Head/tail buffer with 50/50 split. Full output spills to disk; model can retrieve via read_command_output tool with byte-offset pagination and regex search.
Proposed fix:
MAX_TOOL_RESULT_CHARS = 100_000
if len(function_result) > MAX_TOOL_RESULT_CHARS:
    head_size = MAX_TOOL_RESULT_CHARS * 2 // 5  # 40K head
    tail_size = MAX_TOOL_RESULT_CHARS * 3 // 5  # 60K tail
    function_result = (
        function_result[:head_size]
        + f"\n\n[... {len(function_result) - head_size - tail_size:,} chars truncated ...]\n\n"
        + function_result[-tail_size:]
    )
Effort: ~10 lines changed in run_agent.py. Zero risk.
2. Patch Tool Refinements (MEDIUM-HIGH IMPACT)
2a. Indentation Preservation on Fuzzy Match
Current state: When fuzzy_match.py matches via strategies 2-8 (non-exact), it maps positions back to the original content but replaces with the LLM's new_string literally. If the LLM's indentation differs from the file's, the replacement has wrong indentation.
Roo Code's approach (multi-search-replace.ts:466-500): After fuzzy matching, calculates the relative indentation between the search block's first line and the file's matched region, then adjusts all lines of the replacement to maintain the file's indentation pattern.
Example:
File has (4-space indent):
def foo():
    return bar
LLM sends (2-space indent):
def foo():
  return baz
With indentation preservation:
def foo():
    return baz  # 4-space indent preserved
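One way to get the result in the example above is to pair each line of the search block with the file line it matched and reuse the file's indentation for every indentation level seen there. This is a minimal sketch under that assumption, not Roo Code's algorithm and not existing Hermes code; `preserve_indentation` and its parameters are hypothetical names, and it assumes the fuzzy matcher returns the matched file text with the same number of lines as the search block.

```python
def _indent_of(line: str) -> str:
    """Return the leading spaces/tabs of a line."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def preserve_indentation(matched: str, search: str, replacement: str) -> str:
    """Re-indent `replacement` using the file's indentation instead of the LLM's.

    matched:     the text in the file that the fuzzy matcher selected
    search:      the LLM's old_string
    replacement: the LLM's new_string
    """
    # Pair each search line with the file line it matched and record how the
    # LLM's indentation maps onto the file's indentation.
    indent_map = {}
    for search_line, file_line in zip(search.split("\n"), matched.split("\n")):
        if search_line.strip():  # blank lines carry no indentation signal
            indent_map.setdefault(_indent_of(search_line), _indent_of(file_line))

    fixed = []
    for line in replacement.split("\n"):
        indent = _indent_of(line)
        # Indentation levels never seen in the search block are left untouched.
        fixed.append(indent_map.get(indent, indent) + line[len(indent):])
    return "\n".join(fixed)
```

Levels that appear only in the replacement (for example a newly added nested block) are not adjusted here; a fuller version would need a fallback such as scaling by the detected indent unit.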
2b. Line Number Hint for Replace Mode
Current state: patch(mode="replace") searches the entire file for old_string. In large files with repeated patterns, this can match the wrong location.
Roo Code's approach: apply_diff accepts :start_line:N which narrows the search to a ±40 line buffer around line N. Dramatically reduces false matches in large files.
Proposed: Add optional start_line parameter to patch tool. When provided, search only within ±50 lines of that line number first, falling back to full-file search if not found. Zero breaking change — parameter is optional.
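The windowed search with full-file fallback could look roughly like this. It is a sketch only: `find_with_line_hint` is a hypothetical helper, it uses a plain substring search rather than the strategies in fuzzy_match.py, and it returns the first hit in the window instead of enforcing uniqueness.

```python
from typing import Optional


def find_with_line_hint(
    content: str,
    old_string: str,
    start_line: Optional[int] = None,
    window: int = 50,
) -> int:
    """Return the character offset of old_string in content, or -1 if absent.

    start_line is 1-based. When given, the region within +/- `window` lines
    of it is searched first; the whole file is searched only as a fallback.
    """
    if start_line is not None:
        lines = content.splitlines(keepends=True)
        first = max(start_line - 1 - window, 0)        # 0-based, inclusive
        last = min(start_line + window, len(lines))    # 0-based, exclusive
        offset = sum(len(line) for line in lines[:first])
        pos = "".join(lines[first:last]).find(old_string)
        if pos != -1:
            return offset + pos
    # No hint, or nothing found near the hint: search the entire file.
    return content.find(old_string)
```

Because the hint only changes the search order, a call without start_line behaves exactly as a full-file search.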
2c. CRLF Line Ending Preservation
Current state: No explicit CRLF handling. If a file uses \r\n line endings (Windows), edits may silently convert to \n.
Roo Code's approach (EditFileTool.ts): Detects original line endings before edit, normalizes for matching, then restores original endings after replacement.
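The detect, normalize, restore sequence can be sketched in a few lines. `replace_preserving_line_endings` is a hypothetical name, not an existing Hermes function, and the sketch assumes the file was read without newline translation (for example `open(path, newline="")`), otherwise Python has already converted `\r\n` to `\n` before this code sees it.

```python
def _to_lf(text: str) -> str:
    """Normalize CRLF line endings to LF."""
    return text.replace("\r\n", "\n")


def replace_preserving_line_endings(content: str, old_string: str, new_string: str) -> str:
    """Replace the first occurrence of old_string, keeping the file's line endings."""
    uses_crlf = "\r\n" in content  # detect before anything is modified

    # Match and replace on LF-normalized text so the LLM's "\n" still matches.
    result = _to_lf(content).replace(_to_lf(old_string), _to_lf(new_string), 1)

    # Restore the original convention after the edit.
    return result.replace("\n", "\r\n") if uses_crlf else result
```

One caveat of this simple version: a file with mixed line endings comes out as all CRLF.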
2d. Unicode Normalization Strategy
Current state: fuzzy_match.py strategy 5 handles \\n/\\t/\\r escape sequences but not Unicode variants.
Roo Code's approach (seek-sequence.ts): Normalizes smart quotes (curly quotes to straight), em-dashes to hyphens, and non-breaking spaces to regular spaces. LLMs frequently generate these Unicode variants instead of ASCII equivalents.
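A possible extra matching strategy for fuzzy_match.py, sketched with a translation table. The function name is hypothetical and the character set covers only the variants named above; every mapping is one character to one character, so match positions in the normalized text map straight back to the original.

```python
# Unicode variants that LLMs commonly emit, mapped to their ASCII equivalents.
_ASCII_EQUIVALENTS = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2014": "-",   # em dash
    "\u00a0": " ",   # non-breaking space
})


def normalize_unicode_variants(text: str) -> str:
    """Replace smart quotes, em dashes, and non-breaking spaces with ASCII."""
    return text.translate(_ASCII_EQUIVALENTS)
```

Applying it to both the file content and old_string before comparison would let a search string containing curly quotes match a file that uses straight ones.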
2e. Detailed Error Messages with Recovery Suggestions
Current state: Patch failures return simple strings like "Could not find unique match for old_string."
Roo Code's approach (EditFileTool.ts): Returns structured error details with numbered recovery steps:
Error: Could not find match.
1. Verify the file exists and the content is current (use read_file)
2. Check that the search string matches exactly
3. The file may have been modified since you last read it
4. Try using a more unique search string with more context
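A message in this shape could be assembled by a small helper in the patch tool. The sketch below is illustrative: `format_no_match_error` and the preview line are assumptions, and the recovery steps are copied from the example above.

```python
RECOVERY_STEPS = [
    "Verify the file exists and the content is current (use read_file)",
    "Check that the search string matches exactly",
    "The file may have been modified since you last read it",
    "Try using a more unique search string with more context",
]


def format_no_match_error(path: str, old_string: str, preview_chars: int = 60) -> str:
    """Build a patch-failure message with numbered recovery steps."""
    preview = old_string[:preview_chars]
    if len(old_string) > preview_chars:
        preview += "..."
    lines = [
        f"Error: Could not find match in {path}.",
        f"Search string begins with: {preview!r}",
    ]
    # Number the steps so the model can follow them in order.
    lines += [f"{i}. {step}" for i, step in enumerate(RECOVERY_STEPS, start=1)]
    return "\n".join(lines)
```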
2f. Consecutive Failure Tracking Per File
Current state: No per-file error tracking. The agent can fail to edit a file 10 times in a row with no escalating guidance.
Roo Code's approach: Maintains consecutiveMistakeCountForApplyDiff map keyed by file path. After 2+ consecutive failures on the same file, shows escalating warnings.
Proposed: Track per-file failure counts in the agent loop. After 3 consecutive failed edits to the same file, inject a system hint: "You've failed to edit {filename} 3 times. Consider: read the file first to verify current content, use a more unique search string, or try write_file to replace the entire file."
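The per-file counter could be a small object owned by the agent loop. This is a sketch of the idea, not existing code: `EditFailureTracker` and `record` are hypothetical names, and the hint text follows the wording proposed above.

```python
from collections import defaultdict
from typing import Optional


class EditFailureTracker:
    """Counts consecutive failed edits per file and emits a hint at a threshold."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._failures = defaultdict(int)

    def record(self, path: str, success: bool) -> Optional[str]:
        """Record one edit attempt; return a hint string once the threshold is hit."""
        if success:
            # A successful edit ends the streak for this file.
            self._failures.pop(path, None)
            return None
        self._failures[path] += 1
        count = self._failures[path]
        if count < self.threshold:
            return None
        return (
            f"You've failed to edit {path} {count} times. Consider: read the file "
            "first to verify current content, use a more unique search string, "
            "or try write_file to replace the entire file."
        )
```

The agent loop would call `record` after each patch result and, when a string comes back, append it to the tool result or inject it as a system hint.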
3. Anti-Hallucination Prompt Instructions (MEDIUM-HIGH IMPACT)
Current state: prompt_builder.py has NO anti-hallucination instructions. The system prompt describes the agent's identity and tool-specific guidance but never tells the model to verify tool outcomes or avoid fabrication.
Roo Code's approach: System prompt includes explicit rules:
- "Do not assume the outcome of any tool use. Each step must be informed by the previous step's result."
- "If one of the values for a required parameter is missing, DO NOT invoke the tool (not even with fillers for the missing params)."
- "Do not use the ~ character or $HOME to refer to the home directory."
Proposed addition to prompt_builder.py:
TOOL_USE_RULES = """
Important rules for tool usage:
- Do not assume the outcome of any tool call. Each step must be informed by the previous step's result.
- Do not invoke tools with placeholder or filler values for required parameters.
- After editing a file, verify the edit succeeded before making dependent changes.
- Prefer calling multiple independent tools in a single response when possible.
"""
Effort: ~5 lines added to prompt_builder.py. These are proven instructions from a production agent used by millions.
4. Lightweight Task Methodology Prompt (MEDIUM IMPACT)
Current state: Hermes relies on the model's native reasoning capabilities. No explicit task methodology in the system prompt.
Roo Code's approach: OBJECTIVE section gives a 5-step methodology:
- Analyze the task and set clear goals
- Work through goals sequentially, using available tools
- Before each tool call, assess what information you have and what you need
- Verify each step's results before proceeding
- Present results when complete
Why this matters: Non-reasoning models (no CoT) and smaller models benefit significantly from explicit methodology guidance. Reasoning models can ignore it (they already do this internally), but it provides a floor for all models.
Proposed: Add a lightweight methodology section to the system prompt, conditional on whether the model has native reasoning support.
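A minimal sketch of the conditional inclusion, assuming the caller already knows whether the model reasons natively. `TASK_METHODOLOGY`, `build_system_prompt`, and the `has_native_reasoning` flag are hypothetical; how prompt_builder.py would detect reasoning support is not covered here.

```python
TASK_METHODOLOGY = """\
How to approach a task:
1. Analyze the task and set clear goals.
2. Work through goals sequentially, using available tools.
3. Before each tool call, assess what information you have and what you need.
4. Verify each step's results before proceeding.
5. Present results when complete.
"""


def build_system_prompt(base_prompt: str, has_native_reasoning: bool) -> str:
    """Append the methodology section only for models without native reasoning."""
    if has_native_reasoning:
        return base_prompt
    return base_prompt.rstrip() + "\n\n" + TASK_METHODOLOGY
```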
5. Behavioral Rules (LOW-MEDIUM IMPACT)
Current state: No explicit behavioral constraints. The model can start responses with filler words, end with unnecessary questions, or be overly conversational.
Roo Code's approach: Explicit rules like "NEVER start messages with affirmative phrases like 'Great', 'Certainly', 'Of course'" and "Your goal is to accomplish the user's task, NOT engage in back and forth conversation."
Consideration: These are opinionated and may conflict with Hermes's personality system. Could be added as defaults that personalities override, or as opt-in behavioral presets.
Implementation Priority
| # | Improvement | Impact | Effort | Risk |
|---|---|---|---|---|
| 1 | Head/tail truncation | HIGH | Very Low (~10 lines) | None |
| 3 | Anti-hallucination instructions | HIGH | Very Low (~5 lines) | None |
| 2f | Per-file failure tracking | MEDIUM | Low (~30 lines) | None |
| 2e | Detailed error messages | MEDIUM | Low | None |
| 2a | Indentation preservation | MEDIUM-HIGH | Medium | Low |
| 2d | Unicode normalization | MEDIUM | Low | None |
| 4 | Task methodology prompt | MEDIUM | Low | Low |
| 2b | Line number hint | MEDIUM | Medium | None |
| 2c | CRLF preservation | LOW-MEDIUM | Low | Low |
| 5 | Behavioral rules | LOW | Very Low | Opinionated |
References
- Roo Code GitHub (Apache 2.0)
- Roo Code source: src/core/diff/strategies/multi-search-replace.ts (indentation preservation), src/core/tools/EditFileTool.ts (error messages, CRLF), src/core/tools/apply-patch/seek-sequence.ts (Unicode normalization), src/core/prompts/sections/rules.ts (behavioral rules), src/core/prompts/sections/tool-use-guidelines.ts (anti-hallucination), src/core/prompts/sections/objective.ts (task methodology)
- Hermes source: run_agent.py:2606-2613 (truncation), tools/fuzzy_match.py (fuzzy matching), tools/patch_parser.py (V4A parser), prompt_builder.py (system prompt), agent/context_compressor.py (compression prompt)