Skip to content

fix(memory): respect toolsets and reconnect openviking#18166

Open
flyingdoubleG wants to merge 1 commit into
NousResearch:mainfrom
flyingdoubleG:codex/fix-memory-provider-issue-regressions
Open

fix(memory): respect toolsets and reconnect openviking#18166
flyingdoubleG wants to merge 1 commit into
NousResearch:mainfrom
flyingdoubleG:codex/fix-memory-provider-issue-regressions

Conversation

@flyingdoubleG

Copy link
Copy Markdown

Summary

  • gate external memory provider tool injection on enabled_toolsets / disabled_toolsets so platform toolset filters are respected
  • lazily reconnect OpenViking on tool calls after startup health check failures instead of keeping the provider permanently disconnected
  • add regressions for both issue-derived failure modes

Related issues

Validation

  • .venv-win\Scripts\python.exe -m pytest tests\agent\test_memory_provider.py -q
  • .venv-win\Scripts\python.exe -m pytest tests\plugins\memory\test_openviking_provider.py -q

@alt-glitch alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/plugins Plugin system and bundled plugins tool/memory Memory tool and memory providers labels May 1, 2026
@flyingdoubleG

Copy link
Copy Markdown
Author

Real-agent validation (2026-05-01): PASS

Isolated HERMES_HOME: D:\工作\hermes-agent-pr-real-agent\pr18166
Command shape: python -m hermes_cli.main --yolo --provider openai-codex -m gpt-5.5 -t terminal,file -z ...
Scenario: configured memory.provider: openviking with memory disabled and an unreachable endpoint, then ran the real OpenAI Codex-backed Hermes agent with only terminal,file toolsets.
Agent result: PR18166_AGENT_OK;本轮只启用了 terminal,file toolsets,且未调用 memory 工具。

This validates that the memory provider does not leak into a turn where memory is not enabled.

SelfParody added a commit to SelfParody/hermes-agent that referenced this pull request May 2, 2026
…han subprocess leak

v5 — addresses GPT-5.5 fusion-judge SHIP_WITH_FIXES on v4.
One-line cosmetic fix: SIGTERM handler now uses cleanup-scope local
alias `_os_local.kill(_os_local.getpid(), signum)` instead of bare
`os.kill(os.getpid(), signum)`, completing the self-containment intent.
Behavior unchanged (verified: SIGTERM still exits 143, SIGINT still 130).

═══ Code review history ═══

- v1: qwen3-max single review → SHIP_WITH_FIXES
- v2: Fusion (Codex+Sonnet+Gemini, GPT-5.5 judge) → NEEDS_REWORK (7 fixes)
- v3: Fusion → SHIP_WITH_FIXES (3 required + 4 hardening fixes)
- v4: Fusion → SHIP_WITH_FIXES (1 cosmetic — use _os_local in SIGTERM)
- v5 (this commit): cosmetic fix applied; behavior verified unchanged

═══ Final fix tracking ═══

| v2 required fix                       | Final |
|---------------------------------------|-------|
| 1. Fail-open import behavior          | FIXED |
| 2. SIGTERM exit code 143              | FIXED |
| 3. Unmatched -t warning logic         | FIXED |
| 4. _stdio_pids robustness             | FIXED |
| 5. time/os imports verified           | FIXED |
| 6. Poll-loop logger.debug             | FIXED |
| 7. Idempotency guard                  | FIXED |
| 8. SIGTERM uses self-contained _os    | FIXED (v5) |

═══ Verified locally (final) ═══

  hermes -z "ACK" -t web        : 4.8s wall (was 65s — 93% reduction)
  hermes -z "ACK" -t slack      : 5.5s wall (was 67s — 92% reduction)
  hermes -z "ACK"  (no -t)      : 75s     (unchanged — backwards-compat)
  SIGINT mid-flight             : exit 130 ✓
  SIGTERM mid-flight            : exit 143 ✓
  Orphan accumulation           : 0 new PPID=1 across 10+ runs

Related issues:
- Fixes the MCP subprocess component of NousResearch#18438 (gateway memory leak)
- Supersedes the startup-drag portion of NousResearch#18523 (closed unmerged)
- Extends toolset-gating pattern from NousResearch#18166 and NousResearch#5788 (memory) to MCPs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ehz0ah ehz0ah left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OpenViking reconnect change for #5721 looks reasonable to me. I left one inline note on the #5544 toolset gate.

Comment thread run_agent.py
return False
if enabled_toolsets is None:
return True
return "memory" in enabled_toolsets

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should resolve toolsets before checking for memory. As written, enabled_toolsets=["hermes-acp"], ["hermes-cli"], or ["all"] returns false even though those toolsets resolve to include "memory", so external provider tools can disappear when memory is actually enabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have tool/memory Memory tool and memory providers type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants