Skip to content

fix: detect, warn, and block file re-read/search loops after context compression#705

Merged
teknium1 merged 5 commits into
NousResearch:mainfrom
0xbyt4:fix/reading-loop-detection
Mar 10, 2026
Merged

fix: detect, warn, and block file re-read/search loops after context compression#705
teknium1 merged 5 commits into
NousResearch:mainfrom
0xbyt4:fix/reading-loop-detection

Conversation

@0xbyt4

@0xbyt4 0xbyt4 commented Mar 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes the issue where the agent gets stuck in an infinite reading loop after context compression, re-reading the same files endlessly without writing or responding.

Root cause: Context compression summarizes conversation history but loses track of which files were already read. After compression, the model thinks it hasn't examined the files yet and reads them again. Combined with todo re-injection of completed items, this creates an infinite loop.

Fix (multi-layered):

  1. Read tracking with escalation (tools/file_tools.py): Track file reads per task. 2nd read returns a soft warning with content. 3rd+ read blocks — returns error with no content, forcing the model to stop.

  2. Search tracking (tools/file_tools.py): Same mechanism for search_files — identical searches are warned then blocked after 3 repeats.

  3. File history injection (run_agent.py): After context compression, inject a structured message listing all files already read with "do NOT re-read" instruction.

  4. Todo re-injection filtering (tools/todo_tool.py): format_for_injection() now filters out completed/cancelled todos. Only pending/in_progress items are re-injected after compression, preventing the model from re-doing finished work.

Design decisions:

  • Escalating response: warn (2nd) → block (3rd+) — gives the model one chance before hard-stopping
  • Thread-safe tracking with _read_tracker_lock
  • Task-isolated — different tasks have separate trackers
  • Pagination-aware — different offsets of the same file don't trigger false warnings
  • Search tracking keyed on (pattern, target, path, file_glob) — different queries are independent

Test plan

  • 26 unit tests: read warning/blocking, search warning/blocking, task isolation, different file/region/pattern, summary accuracy, tracker cleanup, compression history injection, todo filtering
  • Manual bash verification of all 3 fixes with real function calls

0xbyt4 added 2 commits March 8, 2026 20:44
When context compression summarizes conversation history, the agent
loses track of which files it already read and re-reads them in a loop.
Users report the agent reading the same files endlessly without writing.

Root cause: context compression is lossy — file contents and read history
are lost in the summary. After compression, the model thinks it hasn't
examined the files yet and reads them again.

Fix (two-part):
1. Track file reads per task in file_tools.py. When the same file region
   is read again, include a _warning in the response telling the model
   to stop re-reading and use existing information.
2. After context compression, inject a structured message listing all
   files already read in the session with explicit "do NOT re-read"
   instruction, preserving read history across compression boundaries.

Adds 16 tests covering warning detection, task isolation, summary
accuracy, tracker cleanup, and compression history injection.
…ted todos

- Block file reads after 3+ re-reads of same region (no content returned)
- Track search_files calls and block repeated identical searches
- Filter completed/cancelled todos from post-compression injection
  to prevent agent from re-doing finished work
- Add 10 new tests covering all three fixes
@0xbyt4 0xbyt4 changed the title fix: detect and warn on file re-read loops after context compression fix: detect, warn, and block file re-read/search loops after context compression Mar 8, 2026
0xbyt4 added 3 commits March 8, 2026 23:07
Completed/cancelled items are now filtered from format_for_injection()
output. Update the existing test to verify active items appear and
completed items are excluded.
Combine read/search loop detection with main's redact_sensitive_text
and truncation hint features. Add tracker reset to TestSearchHints
to prevent cross-test state leakage.
_FakeReadResult and _FakeSearchResult now expose the attributes
that read_file_tool/search_tool access after the redact_sensitive_text
integration from main.
@teknium1 teknium1 merged commit b53d5da into NousResearch:main Mar 10, 2026
1 check passed
teknium1 added a commit that referenced this pull request Mar 10, 2026
…ds, fix bugs

Follow-up to PR #705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
   warn/block on truly consecutive identical calls. Any other tool call
   in between (write, patch, terminal, etc.) resets the counter via
   notify_other_tool_call(), called from handle_function_call() in
   model_tools.py. This prevents false blocks in read→edit→verify flows.

2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
   4th+ consecutive (was 3rd+). Gives the model more room before
   intervening.

3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
   search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
   separate read_history set that only tracks file reads.

4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
   web_extract return docs in code_execution_tool.py — the field IS
   returned by web_tools.py.

5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
   consecutive-only behavior, notify_other_tool_call, interleaved
   read/search, and summary-unaffected-by-searches.
@teknium1

Copy link
Copy Markdown
Contributor

Merged! Thanks for the contribution @0xbyt4 — the read-loop detection and todo injection filtering are great additions.

I pushed a follow-up commit (a458b53) on top with several improvements:

  1. Consecutive-only tracking — The counter now resets whenever any other tool is called in between (write, patch, terminal, etc.), so only truly back-to-back identical reads/searches trigger warnings. This prevents false blocks in legitimate read→edit→verify workflows.

  2. Adjusted thresholds — Warn on 3rd consecutive (was 2nd), block on 4th+ (was 3rd+).

  3. Fixed tuple unpacking bugget_read_files_summary() was crashing on search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a separate read_history set that only tracks file reads, so searches don't corrupt the summary.

  4. Reverted web_extract docstring — The title field IS returned by web_tools.py, so restored it in the docs.

  5. Tests updated — 35 tests covering the new consecutive-only behavior, notify_other_tool_call, interleaved operations, etc.

Sarge-Reaper added a commit to Sarge-Reaper/hermes-agent that referenced this pull request Mar 24, 2026
Applied Karpathy's autoresearch pattern to autonomously optimize the
context compressor. 50 experiments run, 8 improvements kept.

- _SUMMARY_RATIO 0.20 → 0.30 (more budget for summaries)
- _MIN_SUMMARY_TOKENS 2000 → 500 (no inflation on short conversations)
- _MAX_SUMMARY_TOKENS 8000 → 4000 (tighter cap)
- _DEFAULT_TAIL_TOKEN_BUDGET 20000 → 8000 (more aggressive compression)
- Truncation 3000 → 4500 chars (retains more tool output)
- Regex file path pre-extraction with "MUST appear in summary"
- Template restructured: Relevant Files + Critical Context moved up
- MANDATORY PRESERVATION RULES added to both prompts

Addresses NousResearch#705, NousResearch#1273, and context drift from lossy summarization.
Score improved 3.6% (0.6346 → 0.6572).
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…search loops after context compression

Authored by 0xbyt4. Adds read/search loop detection, file history injection after compression, and todo filtering for active items only.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…ds, fix bugs

Follow-up to PR NousResearch#705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
   warn/block on truly consecutive identical calls. Any other tool call
   in between (write, patch, terminal, etc.) resets the counter via
   notify_other_tool_call(), called from handle_function_call() in
   model_tools.py. This prevents false blocks in read→edit→verify flows.

2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
   4th+ consecutive (was 3rd+). Gives the model more room before
   intervening.

3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
   search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
   separate read_history set that only tracks file reads.

4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
   web_extract return docs in code_execution_tool.py — the field IS
   returned by web_tools.py.

5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
   consecutive-only behavior, notify_other_tool_call, interleaved
   read/search, and summary-unaffected-by-searches.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…search loops after context compression

Authored by 0xbyt4. Adds read/search loop detection, file history injection after compression, and todo filtering for active items only.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…ds, fix bugs

Follow-up to PR NousResearch#705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
   warn/block on truly consecutive identical calls. Any other tool call
   in between (write, patch, terminal, etc.) resets the counter via
   notify_other_tool_call(), called from handle_function_call() in
   model_tools.py. This prevents false blocks in read→edit→verify flows.

2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
   4th+ consecutive (was 3rd+). Gives the model more room before
   intervening.

3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
   search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
   separate read_history set that only tracks file reads.

4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
   web_extract return docs in code_execution_tool.py — the field IS
   returned by web_tools.py.

5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
   consecutive-only behavior, notify_other_tool_call, interleaved
   read/search, and summary-unaffected-by-searches.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…search loops after context compression

Authored by 0xbyt4. Adds read/search loop detection, file history injection after compression, and todo filtering for active items only.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…ds, fix bugs

Follow-up to PR NousResearch#705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
   warn/block on truly consecutive identical calls. Any other tool call
   in between (write, patch, terminal, etc.) resets the counter via
   notify_other_tool_call(), called from handle_function_call() in
   model_tools.py. This prevents false blocks in read→edit→verify flows.

2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
   4th+ consecutive (was 3rd+). Gives the model more room before
   intervening.

3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
   search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
   separate read_history set that only tracks file reads.

4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
   web_extract return docs in code_execution_tool.py — the field IS
   returned by web_tools.py.

5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
   consecutive-only behavior, notify_other_tool_call, interleaved
   read/search, and summary-unaffected-by-searches.
nnnet added a commit to nnnet/hermes-agent that referenced this pull request May 20, 2026
Wraps `mc_task_update {status: assigned, retry_count++}` with policy
guards so an mc-pm-chief can retry a failed/blocked sub-task without
looping forever:

  - Capped by HERMES_MC_RETRY_COUNT env (default 3); explicit
    `max_retries` arg overrides per-call.
  - Refuses to retry tasks not in {failed, blocked}.
  - Optional `expected_assignee` guard refuses retry if operator
    reassigned the task to a different agent during failure handling.
  - Returns `alert: true` when budget is exhausted so the chief can
    surface a kanban_block or TG escalation instead of silently giving
    up.

Combined with the Phase 3 webhook filter and Phase 4 cost summary, a
mc-pm-chief can now run a complete retry-then-escalate loop:

    if task_status == 'failed':
        out = mc_task_retry(task_id=X, expected_assignee=A)
        if out.get('alert'):
            mc_task_comment(task_id=X, content='retry budget exhausted')
            kanban_block(reason='max retries hit, needs operator')

Tests: 86/86 (76 + 10 new) — validates id/max_retries/assignee,
budget-exhausted alert shape, env default vs explicit override,
reassignment refusal, dispatch-kwargs regression.

Closes Phase 5 of plans/hermes-mc-combo.md. All 5 phases of the combo
plan now implemented; only upstream PR NousResearch#705 (pnpm migration in
builderz-labs/mission-control) remains pending review.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…search loops after context compression

Authored by 0xbyt4. Adds read/search loop detection, file history injection after compression, and todo filtering for active items only.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…ds, fix bugs

Follow-up to PR NousResearch#705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
   warn/block on truly consecutive identical calls. Any other tool call
   in between (write, patch, terminal, etc.) resets the counter via
   notify_other_tool_call(), called from handle_function_call() in
   model_tools.py. This prevents false blocks in read→edit→verify flows.

2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
   4th+ consecutive (was 3rd+). Gives the model more room before
   intervening.

3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
   search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
   separate read_history set that only tracks file reads.

4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
   web_extract return docs in code_execution_tool.py — the field IS
   returned by web_tools.py.

5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
   consecutive-only behavior, notify_other_tool_call, interleaved
   read/search, and summary-unaffected-by-searches.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants