fix(agent): frame compaction handoff sections as historical context #44345

Closed
konsisumer wants to merge 1 commit into NousResearch:main from konsisumer:fix/historical-compaction-handoff
@konsisumer

Contributor

What does this PR do?

Reframes persisted context-compaction summaries so earlier work is serialized under explicitly historical section headings instead of live-sounding ## Active Task / ## Pending User Asks / ## Remaining Work labels. This reduces the chance that a resumed or long-idle conversation treats stale compaction handoff text as the current instruction.

Related Issue

Refs #42812

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • agent/context_compressor.py: introduced historical handoff heading constants and updated the compaction prefix, fallback summary body, and LLM summary template to use historical section names.
  • tests/agent/test_summary_prefix_semantics.py: added coverage that the prefix references historical headings and no longer names active-sounding sections.
  • tests/agent/test_resume_stale_active_task.py: updated resumed-handoff regression fixtures to the new historical task heading.
  • tests/agent/test_context_compressor.py: updated fallback-summary assertions to match the historical heading.
  • tests/agent/test_context_compressor_temporal_anchoring.py: updated structured-template assertion to match the historical heading.
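The shape of the change listed above can be pictured with a minimal sketch. The three heading strings are quoted from this PR, and HISTORICAL_TASK_HEADING is the constant name mentioned in the review; the other two constant names, LEGACY_HEADINGS, build_fallback_summary, and has_legacy_heading are hypothetical and are not taken from agent/context_compressor.py. The review mentions four constants, but only the three headings named in this PR are shown here.

```python
# Heading strings are quoted from the PR description. HISTORICAL_TASK_HEADING is
# named in the review; the other two constant names are assumptions.
HISTORICAL_TASK_HEADING = "## Historical Task Snapshot"
HISTORICAL_PENDING_ASKS_HEADING = "## Historical Pending User Asks"  # hypothetical name
HISTORICAL_REMAINING_WORK_HEADING = "## Historical Remaining Work"  # hypothetical name

# The old, live-sounding headings this PR moves away from.
LEGACY_HEADINGS = ("## Active Task", "## Pending User Asks", "## Remaining Work")


def build_fallback_summary(task: str, pending: list[str], remaining: list[str]) -> str:
    """Hypothetical helper: render a handoff body under historical headings."""

    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items) if items else "- (none)"

    return "\n\n".join(
        [
            f"{HISTORICAL_TASK_HEADING}\n{task}",
            f"{HISTORICAL_PENDING_ASKS_HEADING}\n{bullets(pending)}",
            f"{HISTORICAL_REMAINING_WORK_HEADING}\n{bullets(remaining)}",
        ]
    )


def has_legacy_heading(text: str) -> bool:
    """Return True if any line is exactly one of the old headings."""
    lines = {line.strip() for line in text.splitlines()}
    return any(heading in lines for heading in LEGACY_HEADINGS)
```

Keeping the headings in one set of constants means the prefix, the fallback body, and the LLM template cannot drift apart, which is the consistency the tests in this PR assert.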

How to Test

  1. Run pytest tests/agent/test_summary_prefix_semantics.py tests/agent/test_resume_stale_active_task.py tests/agent/test_context_compressor.py tests/agent/test_context_compressor_temporal_anchoring.py -q.
  2. Confirm the compaction handoff text now uses ## Historical Task Snapshot, ## Historical Pending User Asks, and ## Historical Remaining Work instead of the old active-sounding headings.
  3. Optionally reproduce the idle/resume scenario from issue #42812 ("Stale compaction Active Task hijacks resumed sessions after idle timeout") and verify that Hermes answers the latest user message instead of resurfacing the old compacted task.
  4. Broad-suite note: local pytest tests/ -q -x --timeout=60 currently aborts during collection on tests/hermes_cli/test_dashboard_auth_401_reauth.py because fastapi is missing in this environment.
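The check in step 2 could be expressed as a pair of pytest-style tests along the following lines. This is only an illustration: SUMMARY_PREFIX is the name used in the review, but the text assigned to it below is a stand-in written for this sketch, not the real prefix from agent/context_compressor.py, and the test names are hypothetical.

```python
# Stand-in text for illustration only; the real SUMMARY_PREFIX is defined in
# agent/context_compressor.py and is not reproduced here.
SUMMARY_PREFIX = (
    "The sections below (## Historical Task Snapshot, "
    "## Historical Pending User Asks, ## Historical Remaining Work) describe "
    "earlier work. If they conflict with the latest user message, follow the "
    "latest user message."
)

HISTORICAL_HEADINGS = (
    "## Historical Task Snapshot",
    "## Historical Pending User Asks",
    "## Historical Remaining Work",
)
LEGACY_HEADINGS = ("## Active Task", "## Pending User Asks", "## Remaining Work")


def test_prefix_names_historical_headings():
    # Every historical heading should be referenced by the prefix.
    for heading in HISTORICAL_HEADINGS:
        assert heading in SUMMARY_PREFIX


def test_prefix_drops_legacy_headings():
    # None of the old live-sounding headings should survive in the prefix.
    for heading in LEGACY_HEADINGS:
        assert heading not in SUMMARY_PREFIX
```

Saved to a file, these run with `pytest -q`; in the actual repository the equivalent coverage lives in tests/agent/test_summary_prefix_semantics.py.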

What platforms tested on

  • macOS on darwin-arm64 (local)

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS darwin-arm64

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

  • Targeted tests passed: 110 passed in 8.61s.
  • Scoped ruff checks passed on the changed Python files.
  • python scripts/check-windows-footguns.py --all passed.
  • git diff --check passed.
  • Broad suite aborts early on missing fastapi during collection of tests/hermes_cli/test_dashboard_auth_401_reauth.py in this local environment.

@alt-glitch added the type/bug, comp/agent, and P1 labels on Jun 11, 2026
@liuhao1024

Contributor

Code Review — Positive Verification

Reviewed the full diff across agent/context_compressor.py + 4 test files.

What was checked:

  • Section heading renames (## Active Task → HISTORICAL_TASK_HEADING, etc.) are consistent across all occurrences, in both the constant definitions and the f-string interpolations
  • Test updates mirror the heading constants — no leftover hardcoded "## Active Task" assertions
  • The SUMMARY_PREFIX text correctly references the new constant names in its conflict-resolution directive
  • Deterministic fallback body and _generate_summary template both use the same constants
  • No dead variables introduced; all 4 constants are consumed

Verdict: Clean. The rename from imperative headings ("Active Task", "Remaining Work") to historical framing ("Historical Task Snapshot", "Historical Remaining Work") directly addresses the stale-task-hijack failure mode documented in #35344 — models treating inherited compaction handoffs as live instructions. Well-structured refactoring with comprehensive test coverage.

@teknium1

Contributor

Merged via PR #44454 — your commit was cherry-picked onto current main with your authorship preserved in git log (d5e2fbf). It serves as the base of the consolidated fix, combined with #41650's carveout removal and a frozen-prefix backward-compat fixup. Thanks!


Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P1: High (major feature broken, no workaround)
  • type/bug: Something isn't working
