Skip to content

test: cover #25957 production regressions#26039

Closed
stephenschoettler wants to merge 2 commits into
NousResearch:mainfrom
stephenschoettler:followup/ci-unblocker-prod-regressions
Closed

test: cover #25957 production regressions#26039
stephenschoettler wants to merge 2 commits into
NousResearch:mainfrom
stephenschoettler:followup/ci-unblocker-prod-regressions

Conversation

@stephenschoettler

@stephenschoettler stephenschoettler commented May 15, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Adds the follow-up regressions requested on #25957 for the two production behavior fixes that were merged as part of the CI unblocker.

This keeps the follow-up narrow:

  • proves a resumed protected-head compaction handoff restores _previous_summary before the next summary update
  • proves bg-review action summarization receives the review agent's captured messages after review_agent is closed
  • moves the bg-review message snapshot ahead of teardown in the refactored background-review helper, so close-time cleanup cannot erase the user-visible self-improvement summary

Related Issue

Follow-up to #25957. Does not close a separate issue.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • agent/background_review.py
    • snapshots review_agent._session_messages before memory-provider shutdown and review_agent.close() so cleanup cannot drop bg-review action reporting
  • tests/agent/test_context_compressor_summary_continuity.py
    • adds a direct regression that protected-head handoff rehydration populates ContextCompressor._previous_summary and keeps the old handoff out of newly summarized turns
  • tests/run_agent/test_background_review.py
    • adds a regression that summarize_background_review_actions receives captured review-agent tool messages after review_agent.close() runs
    • updates the regression hook for the current agent.background_review refactor

How to Test

  1. Run the focused related files:
    ./scripts/run_tests.sh tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py -- -q --tb=short
  2. Run file sanity checks:
    python -m py_compile agent/background_review.py run_agent.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
  3. Run lint on touched files:
    python -m ruff check agent/background_review.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
  4. Check the live PR diff against current main:
    git diff --check origin/main

Validation Status

  • Refreshed onto current main base ba9964ff0d68002d9440f6b8a64276d7c34a77a4.
  • Refreshed branch pushed at head 08967f6d5c9b259b9627debe12e01cb8feafd998.
  • PR diff rechecked against current origin/main: 3 files changed, agent/background_review.py, tests/agent/test_context_compressor_summary_continuity.py, tests/run_agent/test_background_review.py.
  • Focused related files passed: 8 passed, 0 failed in 1.5s.
  • python -m py_compile ... passed for touched files plus run_agent.py.
  • python -m ruff check ... passed for touched files.
  • git diff --check origin/main passed.
  • Refreshed PR CI passed: test, e2e, lint, Nix, Docker builds, attribution, history, and supply-chain checks are green on head 08967f6d5c9b259b9627debe12e01cb8feafd998.
  • Docker publish-side jobs merge, move-main, and move-latest skipped as expected for PR CI.
  • Full pytest tests/ -q is not claimed here. This is intentionally a narrow follow-up to fix(ci): stabilize shared test state after 21012 #25957.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: Arch Linux, Python 3.11 venv via scripts/run_tests.sh

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) - N/A
  • I've updated cli-config.yaml.example if I added/changed config keys - N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows - N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide - no platform-specific code paths added
  • I've updated tool descriptions/schemas if I changed tool behavior - N/A

For New Skills

N/A

Screenshots / Logs

./scripts/run_tests.sh tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py -- -q --tb=short
# 8 tests passed, 0 failed in 1.5s

python -m py_compile agent/background_review.py run_agent.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
python -m ruff check agent/background_review.py tests/run_agent/test_background_review.py tests/agent/test_context_compressor_summary_continuity.py
# All checks passed!

git diff --check origin/main
# passed

@alt-glitch alt-glitch added type/test Test coverage or test infrastructure P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder labels May 15, 2026
@stephenschoettler

Copy link
Copy Markdown
Contributor Author

Heads up for review: this PR is still the #25957 follow-up with the production regression tests, but its current e2e red is inherited from shared CI state, not from this branch.

I opened #26048 as the narrow unblocker for that shared post-#25957 fallout. It fixes the Discord e2e mock surface plus the Nous provider-parity fixture issue, and it is green/clean now. Once #26048 lands, I’ll refresh/rerun this PR against main.

@stephenschoettler stephenschoettler force-pushed the followup/ci-unblocker-prod-regressions branch from 462ee53 to 63eaf60 Compare May 16, 2026 00:45
@stephenschoettler

Copy link
Copy Markdown
Contributor Author

@ethernet8023 quick routing update: #26039 is refreshed after #26048 and the old e2e blocker is gone.

The remaining red is inherited main Tests state and now splits across existing unblockers:

Once those land on main, I’ll refresh/rerun #26039 again.

…-prod-regressions

# Conflicts:
#	run_agent.py
@stephenschoettler

Copy link
Copy Markdown
Contributor Author

@ethernet8023 bump on #26039: refreshed onto current main, conflicts resolved, CI is green, mergeState CLEAN, and the PR body was rechecked against the live diff.

This is still the narrow #25957 follow-up: one bg-review snapshot fix plus the two focused regression tests. Current diff is just agent/background_review.py, tests/agent/test_context_compressor_summary_continuity.py, and tests/run_agent/test_background_review.py.

Could you re-review/merge when you get a minute?

teknium1 pushed a commit that referenced this pull request May 28, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR #26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
@teknium1

Copy link
Copy Markdown
Contributor

Merged via PR #33661. Your commit 63eaf6055 was salvaged onto current main with your authorship preserved (the cherry-pick conflicted on run_agent.py because the bg-review block was extracted to agent/background_review.py after your PR was opened — the test was adapted to patch the new module-level summarize_background_review_actions instead of the legacy AIAgent._summarize_background_review_actions forwarder, but the production fix is identical to yours and the regression catches the exact same bug). Thanks for the focused follow-up!

mathias3 pushed a commit to mathias3/hermes-agent that referenced this pull request May 28, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
Bryce-huang pushed a commit to wbkunlun/hermes-agent that referenced this pull request May 29, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.

#AI commit#
zwolniony pushed a commit to zwolniony/hermes-agent that referenced this pull request May 29, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
mosaiq-systems pushed a commit to mosaiq-systems/hermes-agent that referenced this pull request May 29, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
KKT-OPT pushed a commit to KKT-OPT/hermes-agent that referenced this pull request May 31, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR NousResearch#26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf60; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P3 Low — cosmetic, nice to have type/test Test coverage or test infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants