perf(orchestrator): skip blocking Lore memory prefetch on parent agent by davidgut1982 · Pull Request #3 · davidgut1982/hermes-agent

davidgut1982 · 2026-06-01T14:27:03Z

What it does

The parent orchestrator calls MemoryManager.prefetch_all() → kb_search(top_k=30) to Lore (CT 121) synchronously at the start of every run_conversation() (agent/conversation_loop.py ~654), before the first LLM call. Measured ~4.3s on every request.

The parent never uses that result for routing: SOUL.md explicitly delegates ALL recall to the memory child profile. The prefetch is therefore pure latency the orchestrator pays and discards.

This PR passes the existing skip_memory=True constructor flag at every user-facing PARENT/orchestrator construction point. When set, agent_init.py leaves _memory_manager=None and the prefetch is never reached. No change to conversation_loop.py or memory_manager.py — the prefetch logic is untouched; we simply don't build a memory manager on the parent.

Why it's safe

SOUL delegates recall to the memory child — the parent never consults prefetch output for routing decisions, so dropping it changes no routing behavior.
Children are unaffected. Delegated children construct their own _memory_manager independently (the generic delegate_task path in tools/delegate_tool.py and profile-based children decide their own skip_memory). Setting the flag on the parent does not flow into the child constructor — recall still works when the parent delegates to a memory-capable child.
Explicit writes survive. memory(action="add") and kb_add via the MCP knowledge server are independent of the parent's _memory_manager prefetch path and continue to work.
The cached gateway agent retains _memory_manager=None across turns (the cache stores the constructed instance; _init_cached_agent_for_turn never rebuilds the manager), so the saving applies to every turn in a session, not just the first.

Tradeoff surfaced for review — passive memory writes

The parent's _memory_manager is also what performs the post-turn passive transcript write:

conversation_loop.py (~4261) → AIAgent._sync_external_memory_for_turn (run_agent.py ~2104) → MemoryManager.sync_all → provider.sync_turn, plus queue_prefetch_all to warm the next turn.

With skip_memory=True, these passive writes no longer fire for PARENT-handled turns. Concretely, what stops accumulating in Lore from the parent path: the automatic (user, assistant) turn-pair transcript capture and the next-turn prefetch warming.

What is not lost:

Explicit memory(action="add") writes and kb_add via the MCP knowledge server.
Passive turn sync performed by delegated children (each child that builds its own memory manager still syncs its own turns).
Curator / background-review already runs with skip_memory=True (and re-binds built-in MEMORY.md/USER.md from the parent), so it is unchanged.

Net: passive parent-transcript accumulation into Lore stops; child-driven memory and all explicit writes continue. This is the one real behavioral change — flagged for a human to accept before deploy.

Parent construction points changed

File	Function	Entry path
`gateway/platforms/api_server.py`	`_create_agent`	HTTP `/v1/chat/completions`
`gateway/run.py`	`_run_agent`	Telegram / Discord / Slack inbound
`gateway/run.py`	`_run_background_task`	gateway background tasks
`tui_gateway/server.py`	`_make_agent`	TUI / dashboard embedded chat (was conditional on `HERMES_IGNORE_RULES`; now forced `True` for the parent)
`tui_gateway/server.py`	`_background_agent_kwargs`	TUI background tasks

Construction points intentionally not changed:

tools/delegate_tool.py _build_child_agent — already skip_memory=True (children); left as-is so child memory scoping is unchanged.
gateway/platforms/feishu_comment.py, gateway _handle_compress_command, the gateway hygiene/compaction agent, background_review/curator — already skip_memory=True.

Flagged for reviewer

ACP adapter (acp_adapter/session.py ~624) is another user-facing top-level parent that currently builds a memory manager and pays the same ~4.3s prefetch. It was outside the stated scope (api_server / telegram / TUI) so it is left unchanged here — call out if you want it included.

Tests

tests/gateway/test_api_server_toolset.py — assert _create_agent passes skip_memory=True.
tests/gateway/test_orchestrator_skip_memory.py (new) — assert the TUI parent helpers pass skip_memory=True; assert a real AIAgent(skip_memory=True) yields _memory_manager is None; assert the delegated-child construction carries its own skip_memory kwarg (parent value does not leak into the child).

Targeted run: 17 passed. Full tests/gateway/ suite: 5862 passed with 9 pre-existing, unrelated failures (telegram-markdown-escaping + wecom-callback test-isolation flakes) that reproduce identically on main without this change.

🤖 Generated with Claude Code

The parent orchestrator calls MemoryManager.prefetch_all() -> kb_search(top_k=30) to Lore synchronously at the start of every run_conversation (~4.3s/request), before the first LLM call. The parent never uses the result for routing: SOUL.md delegates ALL recall to the `memory` child profile. The prefetch is therefore pure latency. Pass skip_memory=True at every user-facing PARENT/orchestrator construction point, which leaves _memory_manager=None so the prefetch is never reached. Delegated children build their own memory manager independently and are unaffected — recall still works when the parent delegates to a memory-capable child. Parent construction points changed: - gateway/platforms/api_server.py _create_agent (HTTP /v1/chat/completions) - gateway/run.py _run_agent (Telegram/Discord/Slack inbound) - gateway/run.py _run_background_task (gateway background tasks) - tui_gateway/server.py _make_agent (TUI / dashboard chat; was conditional on HERMES_IGNORE_RULES, now forced True for the parent) - tui_gateway/server.py _background_agent_kwargs (TUI background tasks) Tradeoff (passive writes): the parent's _memory_manager is also what performs the post-turn passive transcript write (_sync_external_memory_for_turn -> sync_all -> provider.sync_turn) and queue_prefetch_all. With skip_memory=True these passive writes no longer fire for PARENT-handled turns. Explicit memory(action="add") / kb_add via the MCP knowledge server are unaffected, and delegated children still sync their own turns. Surfaced for human review before deploy. Tests: - tests/gateway/test_api_server_toolset.py: assert _create_agent passes skip_memory=True. - tests/gateway/test_orchestrator_skip_memory.py: assert the TUI parent helpers pass skip_memory=True; assert a real AIAgent(skip_memory=True) yields _memory_manager is None; assert the delegated child construction carries its own skip_memory kwarg (parent value does not leak in). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-01T14:28:05Z

🔎 Lint report: `feat/orchestrator-skip-memory` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9374 on HEAD, 9373 on base (🆕 +1)

🆕 New issues (1):

Rule	Count
`unresolved-import`	1

First entries

tests/gateway/test_orchestrator_skip_memory.py:21: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 4957 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

davidgut1982 · 2026-06-01T14:58:46Z

Closing — superseded by Lore-side performance fixes (kb_get batching + Postgres GIN/random_page_cost planner fix + semantic default recall). Those dropped the parent prefetch from ~4.3s to ~115ms (measured, 40x), so skip_memory would now save only ~115ms while losing passive memory — a bad trade. Keeping this branch for reference; the change still validly decouples the router from Lore availability if ever needed, but it should not merge.

…NousResearch#34192) (NousResearch#34382) NousResearch#34192 reports Hostinger's 'Hermes WebUI' catalog crashes on startup with: /usr/bin/tini: No such file or directory The image moved from tini to s6-overlay as PID 1 (/init) earlier in 2026. Orchestration templates that still pin /usr/bin/tini as the entrypoint \u2014 like the Hostinger Hermes WebUI catalog \u2014 have no binary to exec and the container crashes immediately. Hermes has no control over the Hostinger catalog template, but we can make the image backward-compatible by symlinking /usr/bin/tini -> /init during the s6-overlay install step. External wrappers that exec /usr/bin/tini will land on the same s6-overlay reaper they would have landed on if they'd used the canonical /init entrypoint. The image's own ENTRYPOINT continues to be /init verbatim \u2014 the shim is purely for legacy external wrappers, not for the image's own runtime path. Once affected catalogs are updated, the symlink can be removed. Other issues NousResearch#34192 raises that are NOT addressed by this PR: * Problem #2 (UID 1024 vs 10000 mismatch): already fixed by NousResearch#33148 (S6_KEEP_ENV=1) and NousResearch#32412 (with-contenv shebangs). The Hostinger template likely needs to update its env-var propagation. * Problem #3 (incompatible session formats): RFC for pluggable SessionDB is tracked in NousResearch#23717. * Problem #4 (Telegram polling conflict): an operations problem on Hostinger's side, not in this codebase. This PR is scoped to the one issue that can be fixed inside Dockerfile: the missing /usr/bin/tini binary. Tests (3 in test_dockerfile_tini_compat_shim.py): - test_tini_compat_symlink_present Guard: the symlink line must exist in Dockerfile. - test_tini_compat_comment_explains_why The NousResearch#34192 anchor comment must be present so future readers know why the shim is there (avoid accidental removal). - test_entrypoint_still_init_not_tini Sanity check: ENTRYPOINT remains /init (s6-overlay). The shim is only for external wrappers. Refs: NousResearch#34192 Partial fix: addresses the immediate tini-binary crash. Catalog-side fixes still needed by Hostinger for the UID and session-format problems documented in the issue. Co-authored-by: Cursor <cursoragent@cursor.com>

davidgut1982 closed this Jun 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(orchestrator): skip blocking Lore memory prefetch on parent agent#3

perf(orchestrator): skip blocking Lore memory prefetch on parent agent#3
davidgut1982 wants to merge 1 commit into
mainfrom
feat/orchestrator-skip-memory

davidgut1982 commented Jun 1, 2026

Uh oh!

github-actions Bot commented Jun 1, 2026

Uh oh!

davidgut1982 commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

davidgut1982 commented Jun 1, 2026

What it does

Why it's safe

Tradeoff surfaced for review — passive memory writes

Parent construction points changed

Flagged for reviewer

Tests

Uh oh!

github-actions Bot commented Jun 1, 2026

🔎 Lint report: feat/orchestrator-skip-memory vs origin/main

ruff

ty (type checker)

Uh oh!

davidgut1982 commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

🔎 Lint report: `feat/orchestrator-skip-memory` vs `origin/main`