test: cover #25957 production regressions by stephenschoettler · Pull Request #26039 · NousResearch/hermes-agent

stephenschoettler · 2026-05-15T01:46:30Z

What does this PR do?

Adds the follow-up regressions requested on #25957 for the two production behavior fixes that were merged as part of the CI unblocker.

This keeps the follow-up narrow:

proves a resumed protected-head compaction handoff restores _previous_summary before the next summary update
proves bg-review action summarization receives the review agent's captured messages after review_agent is closed
moves the bg-review message snapshot ahead of teardown in the refactored background-review helper, so close-time cleanup cannot erase the user-visible self-improvement summary

Related Issue

Follow-up to #25957. Does not close a separate issue.

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)

Changes Made

agent/background_review.py
- snapshots review_agent._session_messages before memory-provider shutdown and review_agent.close() so cleanup cannot drop bg-review action reporting
tests/agent/test_context_compressor_summary_continuity.py
- adds a direct regression that protected-head handoff rehydration populates ContextCompressor._previous_summary and keeps the old handoff out of newly summarized turns
tests/run_agent/test_background_review.py
- adds a regression that summarize_background_review_actions receives captured review-agent tool messages after review_agent.close() runs
- updates the regression hook for the current agent.background_review refactor

How to Test

Run the focused related files:
./scripts/run_tests.sh tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py -- -q --tb=short
Run file sanity checks:
python -m py_compile agent/background_review.py run_agent.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
Run lint on touched files:
python -m ruff check agent/background_review.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
Check the live PR diff against current main:
git diff --check origin/main

Validation Status

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: Arch Linux, Python 3.11 venv via scripts/run_tests.sh

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) - N/A
I've updated cli-config.yaml.example if I added/changed config keys - N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows - N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide - no platform-specific code paths added
I've updated tool descriptions/schemas if I changed tool behavior - N/A

For New Skills

N/A

Screenshots / Logs

./scripts/run_tests.sh tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py -- -q --tb=short
# 8 tests passed, 0 failed in 1.5s

python -m py_compile agent/background_review.py run_agent.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
python -m ruff check agent/background_review.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
# All checks passed!

git diff --check origin/main
# passed

stephenschoettler · 2026-05-15T02:59:35Z

Heads up for review: this PR is still the #25957 follow-up with the production regression tests, but its current e2e red is inherited from shared CI state, not from this branch.

I opened #26048 as the narrow unblocker for that shared post-#25957 fallout. It fixes the Discord e2e mock surface plus the Nous provider-parity fixture issue, and it is green/clean now. Once #26048 lands, I’ll refresh/rerun this PR against main.

stephenschoettler · 2026-05-16T01:13:56Z

@ethernet8023 quick routing update: #26039 is refreshed after #26048 and the old e2e blocker is gone.

The remaining red is inherited main Tests state and now splits across existing unblockers:

fix(tests): stabilize xai env and provider parity #26565 is green/clean for the two xAI dotenv failures.
fix(gateway): treat stat-error on PATH probes as missing dir, not crash #26640 covers the two gateway PATH PermissionError failures, and is currently red only because it inherits the fix(tests): stabilize xai env and provider parity #26565 xAI failures.

Once those land on main, I’ll refresh/rerun #26039 again.

…-prod-regressions # Conflicts: # run_agent.py

stephenschoettler · 2026-05-21T20:33:49Z

@ethernet8023 bump on #26039: refreshed onto current main, conflicts resolved, CI is green, mergeState CLEAN, and the PR body was rechecked against the live diff.

This is still the narrow #25957 follow-up: one bg-review snapshot fix plus the two focused regression tests. Current diff is just agent/background_review.py, tests/agent/test_context_compressor_summary_continuity.py, and tests/run_agent/test_background_review.py.

Could you re-review/merge when you get a minute?

Snapshot review_agent._session_messages before teardown so close() can clean per-session state without dropping the user-visible self-improvement summary. Adds two regressions: - bg-review summarizer receives captured review-agent tool messages after review_agent.close() runs - context-compressor protected-head handoff rehydration populates _previous_summary and keeps the old handoff out of newly summarized turns Salvaged from PR #26039 onto current main after agent/background_review.py extraction. Original commit 63eaf60; bg-review test updated to patch the module-level summarize_background_review_actions in agent.background_review instead of the now-forwarder AIAgent._summarize_background_review_actions.

teknium1 · 2026-05-28T05:15:11Z

Merged via PR #33661. Your commit 63eaf6055 was salvaged onto current main with your authorship preserved (the cherry-pick conflicted on run_agent.py because the bg-review block was extracted to agent/background_review.py after your PR was opened — the test was adapted to patch the new module-level summarize_background_review_actions instead of the legacy AIAgent._summarize_background_review_actions forwarder, but the production fix is identical to yours and the regression catches the exact same bug). Thanks for the focused follow-up!

Snapshot review_agent._session_messages before teardown so close() can clean per-session state without dropping the user-visible self-improvement summary. Adds two regressions: - bg-review summarizer receives captured review-agent tool messages after review_agent.close() runs - context-compressor protected-head handoff rehydration populates _previous_summary and keeps the old handoff out of newly summarized turns Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py extraction. Original commit 63eaf60; bg-review test updated to patch the module-level summarize_background_review_actions in agent.background_review instead of the now-forwarder AIAgent._summarize_background_review_actions.

Snapshot review_agent._session_messages before teardown so close() can clean per-session state without dropping the user-visible self-improvement summary. Adds two regressions: - bg-review summarizer receives captured review-agent tool messages after review_agent.close() runs - context-compressor protected-head handoff rehydration populates _previous_summary and keeps the old handoff out of newly summarized turns Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py extraction. Original commit 63eaf60; bg-review test updated to patch the module-level summarize_background_review_actions in agent.background_review instead of the now-forwarder AIAgent._summarize_background_review_actions. #AI commit#

Snapshot review_agent._session_messages before teardown so close() can clean per-session state without dropping the user-visible self-improvement summary. Adds two regressions: - bg-review summarizer receives captured review-agent tool messages after review_agent.close() runs - context-compressor protected-head handoff rehydration populates _previous_summary and keeps the old handoff out of newly summarized turns Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py extraction. Original commit 63eaf60; bg-review test updated to patch the module-level summarize_background_review_actions in agent.background_review instead of the now-forwarder AIAgent._summarize_background_review_actions.

alt-glitch added type/test Test coverage or test infrastructure P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder labels May 15, 2026

ethernet8023 approved these changes May 15, 2026

View reviewed changes

test: cover ci-unblocker production regressions

63eaf60

stephenschoettler force-pushed the followup/ci-unblocker-prod-regressions branch from 462ee53 to 63eaf60 Compare May 16, 2026 00:45

Merge remote-tracking branch 'origin/main' into followup/ci-unblocker…

08967f6

…-prod-regressions # Conflicts: # run_agent.py

teknium1 mentioned this pull request May 28, 2026

test: cover #25957 production regressions (salvage of #26039) #33661

Merged

teknium1 closed this in #33661 May 28, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: cover #25957 production regressions#26039

test: cover #25957 production regressions#26039
stephenschoettler wants to merge 2 commits into
NousResearch:mainfrom
stephenschoettler:followup/ci-unblocker-prod-regressions

stephenschoettler commented May 15, 2026 •

edited

Loading

Uh oh!

stephenschoettler commented May 15, 2026

Uh oh!

stephenschoettler commented May 16, 2026

Uh oh!

stephenschoettler commented May 21, 2026

Uh oh!

teknium1 commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

stephenschoettler commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Validation Status

Checklist

Code

Documentation & Housekeeping

For New Skills

Screenshots / Logs

Uh oh!

stephenschoettler commented May 15, 2026

Uh oh!

stephenschoettler commented May 16, 2026

Uh oh!

stephenschoettler commented May 21, 2026

Uh oh!

teknium1 commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

stephenschoettler commented May 15, 2026 •

edited

Loading