fix: detect, warn, and block file re-read/search loops after context compression by 0xbyt4 · Pull Request #705 · NousResearch/hermes-agent

0xbyt4 · 2026-03-08T18:06:43Z

Summary

Fixes the issue where the agent gets stuck in an infinite reading loop after context compression, re-reading the same files endlessly without writing or responding.

Root cause: Context compression summarizes conversation history but loses track of which files were already read. After compression, the model thinks it hasn't examined the files yet and reads them again. Combined with todo re-injection of completed items, this creates an infinite loop.

Fix (multi-layered):

Read tracking with escalation (tools/file_tools.py): Track file reads per task. 2nd read returns a soft warning with content. 3rd+ read blocks — returns error with no content, forcing the model to stop.
Search tracking (tools/file_tools.py): Same mechanism for search_files — identical searches are warned then blocked after 3 repeats.
File history injection (run_agent.py): After context compression, inject a structured message listing all files already read with "do NOT re-read" instruction.
Todo re-injection filtering (tools/todo_tool.py): format_for_injection() now filters out completed/cancelled todos. Only pending/in_progress items are re-injected after compression, preventing the model from re-doing finished work.
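A minimal sketch of the todo filtering idea, assuming each todo is a dict with `content` and `status` keys. The helper name `format_active_todos` and the output layout are illustrative assumptions, not the actual `format_for_injection()` implementation in tools/todo_tool.py.

```python
from typing import Dict, List

# Statuses that should survive re-injection after compression.
ACTIVE_STATUSES = {"pending", "in_progress"}


def format_active_todos(todos: List[Dict[str, str]]) -> str:
    """Render only pending/in_progress todos (hypothetical helper)."""
    active = [t for t in todos if t.get("status") in ACTIVE_STATUSES]
    if not active:
        # Nothing left to do, so inject nothing at all.
        return ""
    lines = ["Active todos:"]
    for todo in active:
        lines.append(f"- [{todo['status']}] {todo['content']}")
    return "\n".join(lines)
```

Returning an empty string when every item is finished keeps the post-compression context free of stale work items.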

Design decisions:

Escalating response: warn (2nd) → block (3rd+) — gives the model one chance before hard-stopping
Thread-safe tracking with _read_tracker_lock
Task-isolated — different tasks have separate trackers
Pagination-aware — different offsets of the same file don't trigger false warnings
Search tracking keyed on (pattern, target, path, file_glob) — different queries are independent
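The design points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in tools/file_tools.py: the names `record_read`, `clear_task`, and `_read_counts`, the default `limit`, and the string return values are all hypothetical. It uses the thresholds from this description (warn on the 2nd read, block on the 3rd and later).

```python
import threading
from collections import defaultdict
from typing import Dict, Tuple

_lock = threading.Lock()
# task_id -> {(path, offset, limit): number of reads so far}
_read_counts: Dict[str, Dict[Tuple[str, int, int], int]] = defaultdict(dict)


def record_read(task_id: str, path: str, offset: int = 0, limit: int = 500) -> str:
    """Count a read and return 'ok', 'warn', or 'block' (hypothetical API)."""
    key = (path, offset, limit)  # pagination-aware: other offsets are other keys
    with _lock:  # thread-safe update of the shared tracker
        counts = _read_counts[task_id]  # task-isolated: one dict per task
        counts[key] = counts.get(key, 0) + 1
        n = counts[key]
    if n >= 3:
        return "block"  # caller would return an error with no content
    if n == 2:
        return "warn"  # caller would return content plus a warning
    return "ok"


def clear_task(task_id: str) -> None:
    """Drop all tracking state for a finished task."""
    with _lock:
        _read_counts.pop(task_id, None)
```

A search tracker could reuse the same shape with a key built from (pattern, target, path, file_glob).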

Test plan

26 unit tests: read warning/blocking, search warning/blocking, task isolation, different file/region/pattern, summary accuracy, tracker cleanup, compression history injection, todo filtering
Manual bash verification of all 3 fixes with real function calls

When context compression summarizes conversation history, the agent loses track of which files it already read and re-reads them in a loop. Users report the agent reading the same files endlessly without writing.

Root cause: context compression is lossy, so file contents and read history are lost in the summary. After compression, the model thinks it hasn't examined the files yet and reads them again.

Fix (two-part):

1. Track file reads per task in file_tools.py. When the same file region is read again, include a _warning in the response telling the model to stop re-reading and use existing information.
2. After context compression, inject a structured message listing all files already read in the session with explicit "do NOT re-read" instruction, preserving read history across compression boundaries.

Adds 16 tests covering warning detection, task isolation, summary accuracy, tracker cleanup, and compression history injection.
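As a rough illustration of the second part of the fix, the note injected after compression might be assembled like this. The function name `build_read_history_note` and the exact wording are assumptions made for the sketch; the real message format in run_agent.py is not shown in this PR text.

```python
from typing import Iterable


def build_read_history_note(paths: Iterable[str]) -> str:
    """Build a post-compression reminder of files already read (hypothetical)."""
    unique = sorted(set(paths))  # de-duplicate and keep a stable order
    if not unique:
        return ""  # no reads recorded, so nothing to inject
    listing = "\n".join(f"- {p}" for p in unique)
    return (
        "Files already read in this session (do NOT re-read them; "
        "use the information you already have):\n" + listing
    )
```

The caller would append the returned text to the conversation right after the compressed summary, and skip the injection when the string is empty.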

…ted todos

- Block file reads after 3+ re-reads of same region (no content returned)
- Track search_files calls and block repeated identical searches
- Filter completed/cancelled todos from post-compression injection to prevent agent from re-doing finished work
- Add 10 new tests covering all three fixes

…ds, fix bugs

Follow-up to PR #705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only warn/block on truly consecutive identical calls. Any other tool call in between (write, patch, terminal, etc.) resets the counter via notify_other_tool_call(), called from handle_function_call() in model_tools.py. This prevents false blocks in read→edit→verify flows.
2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on 4th+ consecutive (was 3rd+). Gives the model more room before intervening.
3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a separate read_history set that only tracks file reads.
4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from web_extract return docs in code_execution_tool.py; the field IS returned by web_tools.py.
5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover consecutive-only behavior, notify_other_tool_call, interleaved read/search, and summary-unaffected-by-searches.
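The tuple unpacking bug in item 3 can be reproduced in miniature. The key shapes below are assumptions chosen for illustration: read keys as (path, offset, limit) and search keys as ("search", pattern, target, path, file_glob). The function names are hypothetical stand-ins, not the real get_read_files_summary().

```python
from typing import Dict, List, Set, Tuple


def summarize_mixed(tracker: Dict[tuple, int]) -> List[str]:
    """Buggy variant: assumes every key in the shared tracker is a 3-tuple."""
    # Raises ValueError as soon as a 5-tuple search key is encountered.
    return sorted({path for (path, _offset, _limit) in tracker})


def summarize_reads(read_history: Set[Tuple[str, int, int]]) -> List[str]:
    """Fixed variant: reads live in their own set, so every key is a 3-tuple."""
    return sorted({path for (path, _offset, _limit) in read_history})
```

Keeping reads in a dedicated set means search activity can never change what the summary reports.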

teknium1 · 2026-03-10T23:25:54Z

Merged! Thanks for the contribution @0xbyt4 — the read-loop detection and todo injection filtering are great additions.

I pushed a follow-up commit (a458b53) on top with several improvements:

Consecutive-only tracking — The counter now resets whenever any other tool is called in between (write, patch, terminal, etc.), so only truly back-to-back identical reads/searches trigger warnings. This prevents false blocks in legitimate read→edit→verify workflows.
Adjusted thresholds — Warn on 3rd consecutive (was 2nd), block on 4th+ (was 3rd+).
Fixed tuple unpacking bug — get_read_files_summary() was crashing on search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a separate read_history set that only tracks file reads, so searches don't corrupt the summary.
Reverted web_extract docstring — The title field IS returned by web_tools.py, so restored it in the docs.
Tests updated — 35 tests covering the new consecutive-only behavior, notify_other_tool_call, interleaved operations, etc.
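A rough sketch of the consecutive-only behavior described in this comment, using the adjusted thresholds (warn on the 3rd consecutive call, block on the 4th and later). Only the name notify_other_tool_call comes from the PR; its body here, along with `check_read`, `_bump`, and the state layout, is an assumption for illustration.

```python
import threading
from typing import Dict

_lock = threading.Lock()
# task_id -> {"last_key": tuple or None, "consecutive": int}
_state: Dict[str, dict] = {}


def _bump(task_id: str, key: tuple) -> int:
    """Return how many times `key` has now been seen back-to-back."""
    with _lock:
        st = _state.setdefault(task_id, {"last_key": None, "consecutive": 0})
        if st["last_key"] == key:
            st["consecutive"] += 1
        else:  # a different read or search starts a fresh run
            st["last_key"] = key
            st["consecutive"] = 1
        return st["consecutive"]


def check_read(task_id: str, path: str, offset: int = 0, limit: int = 500) -> str:
    """Classify a read as 'ok', 'warn', or 'block' (hypothetical API)."""
    n = _bump(task_id, ("read", path, offset, limit))
    if n >= 4:
        return "block"
    if n == 3:
        return "warn"
    return "ok"


def notify_other_tool_call(task_id: str) -> None:
    """Reset the run when any non-read, non-search tool is called."""
    with _lock:
        st = _state.get(task_id)
        if st is not None:
            st["last_key"] = None
            st["consecutive"] = 0
```

With this shape, a read→edit→verify flow never accumulates a count, because the edit step calls notify_other_tool_call() before the verifying read.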

Applied Karpathy's autoresearch pattern to autonomously optimize the context compressor. 50 experiments run, 8 improvements kept.

- _SUMMARY_RATIO 0.20 → 0.30 (more budget for summaries)
- _MIN_SUMMARY_TOKENS 2000 → 500 (no inflation on short conversations)
- _MAX_SUMMARY_TOKENS 8000 → 4000 (tighter cap)
- _DEFAULT_TAIL_TOKEN_BUDGET 20000 → 8000 (more aggressive compression)
- Truncation 3000 → 4500 chars (retains more tool output)
- Regex file path pre-extraction with "MUST appear in summary"
- Template restructured: Relevant Files + Critical Context moved up
- MANDATORY PRESERVATION RULES added to both prompts

Addresses NousResearch#705, NousResearch#1273, and context drift from lossy summarization. Score improved 3.6% (0.6346 → 0.6572).
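One plausible reading of how the three summary constants interact is a ratio-based budget clamped between a floor and a cap. The sketch below assumes exactly that; the function `summary_budget` and the clamping rule are guesses for illustration, and only the constant names and their new values come from the commit message.

```python
# Values after the change described above.
_SUMMARY_RATIO = 0.30
_MIN_SUMMARY_TOKENS = 500
_MAX_SUMMARY_TOKENS = 4000


def summary_budget(compressed_tokens: int) -> int:
    """Token budget for a summary of `compressed_tokens` tokens (assumed rule)."""
    proportional = int(compressed_tokens * _SUMMARY_RATIO)
    # Clamp: never below the floor, never above the cap.
    return max(_MIN_SUMMARY_TOKENS, min(_MAX_SUMMARY_TOKENS, proportional))
```

Under this assumption, lowering the floor from 2000 to 500 matters mainly for short conversations, where the old floor would have allowed a summary longer than the text it replaces.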

…search loops after context compression Authored by 0xbyt4. Adds read/search loop detection, file history injection after compression, and todo filtering for active items only.

Wraps `mc_task_update {status: assigned, retry_count++}` with policy guards so an mc-pm-chief can retry a failed/blocked sub-task without looping forever:

- Capped by HERMES_MC_RETRY_COUNT env (default 3); explicit `max_retries` arg overrides per-call.
- Refuses to retry tasks not in {failed, blocked}.
- Optional `expected_assignee` guard refuses retry if operator reassigned the task to a different agent during failure handling.
- Returns `alert: true` when budget is exhausted so the chief can surface a kanban_block or TG escalation instead of silently giving up.

Combined with the Phase 3 webhook filter and Phase 4 cost summary, a mc-pm-chief can now run a complete retry-then-escalate loop:

    if task_status == 'failed':
        out = mc_task_retry(task_id=X, expected_assignee=A)
        if out.get('alert'):
            mc_task_comment(task_id=X, content='retry budget exhausted')
            kanban_block(reason='max retries hit, needs operator')

Tests: 86/86 (76 + 10 new). Validates id/max_retries/assignee, budget-exhausted alert shape, env default vs explicit override, reassignment refusal, dispatch-kwargs regression.

Closes Phase 5 of plans/hermes-mc-combo.md. All 5 phases of the combo plan now implemented; only upstream PR NousResearch#705 (pnpm migration in builderz-labs/mission-control) remains pending review.

0xbyt4 added 2 commits March 8, 2026 20:44

0xbyt4 changed the title ~~fix: detect and warn on file re-read loops after context compression~~ fix: detect, warn, and block file re-read/search loops after context compression Mar 8, 2026

0xbyt4 added 3 commits March 8, 2026 23:07

fix: update test_non_empty_has_markers to match todo filtering behavior

67421ed

Completed/cancelled items are now filtered from format_for_injection() output. Update the existing test to verify active items appear and completed items are excluded.

merge: resolve file_tools.py conflict with origin/main

4684aaf

Combine read/search loop detection with main's redact_sensitive_text and truncation hint features. Add tracker reset to TestSearchHints to prevent cross-test state leakage.

fix(tests): add content attribute to fake result objects

912efe1

_FakeReadResult and _FakeSearchResult now expose the attributes that read_file_tool/search_tool access after the redact_sensitive_text integration from main.

teknium1 merged commit b53d5da into NousResearch:main Mar 10, 2026
1 check passed

teknium1 mentioned this pull request Mar 20, 2026

bug: post-compression file-read history injected as role=user, breaking message semantics #2224

Closed

Sarge-Reaper mentioned this pull request Mar 24, 2026

feat(compression): optimize context compressor via autoresearch #2866

Closed

4 tasks

SHL0MS mentioned this pull request Mar 31, 2026

Feature: Context Compaction Quality Overhaul — Handoff-Oriented Prompts, User Message Preservation, and Configurable Compaction (inspired by Codex CLI) #499

Closed

ganzercode mentioned this pull request May 27, 2026

[Bug]: Error: 'NoneType' object is not iterable #32892

Closed

1 task

teknium1 merged 5 commits into NousResearch:main from 0xbyt4:fix/reading-loop-detection