fix(ci): stabilize shared test state after 21012 by stephenschoettler · Pull Request #25957 · NousResearch/hermes-agent

stephenschoettler · 2026-05-14T21:29:10Z

What does this PR do?

Stabilizes the shared test failures that remained after #21012 landed on main.

The fixes are intentionally small and CI-focused:

preserve background-review agent messages before closing the review agent, so the self-improvement summary can still report actions
reset auxiliary-client unhealthy-provider cache between tests, matching the existing runtime-main reset
isolate hermes update tests from developer-local lazy backend activation state
update the provider discovery count for the newly registered provider profile
make compression-feasibility tests independent from local provider config and custom-provider attrs
rehydrate persisted context summaries even when the handoff sits in the protected head after resume

Related Issue

Related to the CI gate for the hermes-lcm context-engine merge train, including stephenschoettler/hermes-lcm#133. This does not close that issue.

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)

Changes Made

run_agent.py: capture review_agent._session_messages before closing and clearing review_agent.
agent/context_compressor.py: find persisted handoff summaries from the first non-system message through the compression window, not only from the computed compress-start boundary.
tests/conftest.py: reset agent.auxiliary_client unhealthy-provider state per test.
tests/hermes_cli/test_update_autostash.py: no-op active lazy-backend refresh in update-autostash tests.
tests/providers/test_plugin_discovery.py: update the provider registry assertion from 33 profiles to 34.
tests/run_agent/test_compression_feasibility.py: mock auxiliary provider resolution and include custom_providers in expected calls.
tests/agent/test_context_compressor_summary_continuity.py: keep a regression shape that actually compresses new turns after a persisted handoff.

How to Test

Run the focused current-main failing set:
./scripts/run_tests.sh -n 0 tests/agent/test_auxiliary_client.py tests/agent/test_context_compressor.py tests/agent/test_context_compressor_summary_continuity.py tests/hermes_cli/test_update_autostash.py tests/run_agent/test_provider_parity.py tests/providers/test_plugin_discovery.py tests/run_agent/test_background_review.py tests/run_agent/test_compression_feasibility.py -q --tb=short
Run file sanity checks:
git diff --check && python -m compileall -q agent/context_compressor.py run_agent.py tests/conftest.py tests/agent/test_context_compressor_summary_continuity.py tests/hermes_cli/test_update_autostash.py tests/providers/test_plugin_discovery.py tests/run_agent/test_compression_feasibility.py
Run lint on touched files:
python -m ruff check agent/context_compressor.py run_agent.py tests/conftest.py tests/agent/test_context_compressor_summary_continuity.py tests/hermes_cli/test_update_autostash.py tests/providers/test_plugin_discovery.py tests/run_agent/test_compression_feasibility.py

Validation Status

Focused failing set passed: 378 passed in 62.78s.
git diff --check passed.
python -m compileall -q ... passed for touched files.
python -m ruff check ... passed for touched files.
Full pytest tests/ -q is not claimed here. A broader local non-integration run on this branch still showed unrelated current-main failures outside this focused unblocker scope, so this PR keeps the validation claim narrow.

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: Arch Linux, Python 3.11 venv via scripts/run_tests.sh

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — N/A
I've updated cli-config.yaml.example if I added/changed config keys — N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — no platform-specific code paths added
I've updated tool descriptions/schemas if I changed tool behavior — N/A

Screenshots / Logs

Focused validation output:

378 passed in 62.78s (0:01:02)
All checks passed!

ethernet8023 · 2026-05-15T01:26:47Z

thank you!

there's two prod bugfixes in here aren't just fixes to existing CI tests, and you haven't added any new regression testing for them. i'm going to merge this given that it will unblock CI, but could you make a follow-up PR adding two tests?

one asserting handoff-in-protected-head rehydration populates _previous_summary, and
one asserting bg-review action capture works after review_agent is closed

the continuity test adds messages to an existing fixture so compression triggers at all, but it doesn't explicitly assert "summary in protected head causes _previous_summary to be populated."
a future regression on the head-search would probably slip thru.

the fix for bg-review has no assertion that _summarize_background_review_actions actually receives the messages.

…-ci-unblocker-after-21012 fix(ci): stabilize shared test state after 21012

fix(ci): stabilize shared test state after 21012

5ce0067

alt-glitch added type/test Test coverage or test infrastructure P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels May 14, 2026

ethernet8023 merged commit 1702a94 into NousResearch:main May 15, 2026
14 checks passed

This was referenced May 15, 2026

test: cover #25957 production regressions #26039

Closed

test: unblock post-25957 shared CI #26048

Merged

Haderach-Ram mentioned this pull request May 15, 2026

Ecosystem Digest — 2026-05-15 Haderach-Ram/openclaw-radar#8

Open

briandevans mentioned this pull request May 15, 2026

fix(tirith): skip auto-install on Termux native (Bionic libc) #26281

Closed

3 tasks

github-actions Bot mentioned this pull request May 17, 2026

chore: bump NousResearch/hermes-agent version from v2026.5.7 to v2026.5.16 Docker-Hub-sirmark/docker-hermes-agent#6

Merged

This was referenced May 17, 2026

fix(tests/gateway): stabilize xai env resolver + harden PATH probe against stat errors (salvage #26565) #27562

Merged

fix(gateway): treat stat-error on PATH probes as missing dir, not crash #26640

Closed

gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026

Merge pull request NousResearch#25957 from stephenschoettler/fix/main…

7e8349e

…-ci-unblocker-after-21012 fix(ci): stabilize shared test state after 21012

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(ci): stabilize shared test state after 21012#25957

fix(ci): stabilize shared test state after 21012#25957
ethernet8023 merged 1 commit into
NousResearch:mainfrom
stephenschoettler:fix/main-ci-unblocker-after-21012

stephenschoettler commented May 14, 2026

Uh oh!

ethernet8023 commented May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

stephenschoettler commented May 14, 2026

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Validation Status

Checklist

Code

Documentation & Housekeeping

Screenshots / Logs

Uh oh!

ethernet8023 commented May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants