Skip to content

perf(orchestrator): skip blocking Lore memory prefetch on parent agent#3

Closed
davidgut1982 wants to merge 1 commit into
mainfrom
feat/orchestrator-skip-memory
Closed

perf(orchestrator): skip blocking Lore memory prefetch on parent agent#3
davidgut1982 wants to merge 1 commit into
mainfrom
feat/orchestrator-skip-memory

Conversation

@davidgut1982

Copy link
Copy Markdown
Owner

What it does

The parent orchestrator calls MemoryManager.prefetch_all()kb_search(top_k=30) to Lore (CT 121) synchronously at the start of every run_conversation() (agent/conversation_loop.py ~654), before the first LLM call. Measured ~4.3s on every request.

The parent never uses that result for routing: SOUL.md explicitly delegates ALL recall to the memory child profile. The prefetch is therefore pure latency the orchestrator pays and discards.

This PR passes the existing skip_memory=True constructor flag at every user-facing PARENT/orchestrator construction point. When set, agent_init.py leaves _memory_manager=None and the prefetch is never reached. No change to conversation_loop.py or memory_manager.py — the prefetch logic is untouched; we simply don't build a memory manager on the parent.

Why it's safe

  • SOUL delegates recall to the memory child — the parent never consults prefetch output for routing decisions, so dropping it changes no routing behavior.
  • Children are unaffected. Delegated children construct their own _memory_manager independently (the generic delegate_task path in tools/delegate_tool.py and profile-based children decide their own skip_memory). Setting the flag on the parent does not flow into the child constructor — recall still works when the parent delegates to a memory-capable child.
  • Explicit writes survive. memory(action="add") and kb_add via the MCP knowledge server are independent of the parent's _memory_manager prefetch path and continue to work.
  • The cached gateway agent retains _memory_manager=None across turns (the cache stores the constructed instance; _init_cached_agent_for_turn never rebuilds the manager), so the saving applies to every turn in a session, not just the first.

Tradeoff surfaced for review — passive memory writes

The parent's _memory_manager is also what performs the post-turn passive transcript write:

conversation_loop.py (~4261) → AIAgent._sync_external_memory_for_turn (run_agent.py ~2104) → MemoryManager.sync_allprovider.sync_turn, plus queue_prefetch_all to warm the next turn.

With skip_memory=True, these passive writes no longer fire for PARENT-handled turns. Concretely, what stops accumulating in Lore from the parent path: the automatic (user, assistant) turn-pair transcript capture and the next-turn prefetch warming.

What is not lost:

  • Explicit memory(action="add") writes and kb_add via the MCP knowledge server.
  • Passive turn sync performed by delegated children (each child that builds its own memory manager still syncs its own turns).
  • Curator / background-review already runs with skip_memory=True (and re-binds built-in MEMORY.md/USER.md from the parent), so it is unchanged.

Net: passive parent-transcript accumulation into Lore stops; child-driven memory and all explicit writes continue. This is the one real behavioral change — flagged for a human to accept before deploy.

Parent construction points changed

File Function Entry path
gateway/platforms/api_server.py _create_agent HTTP /v1/chat/completions
gateway/run.py _run_agent Telegram / Discord / Slack inbound
gateway/run.py _run_background_task gateway background tasks
tui_gateway/server.py _make_agent TUI / dashboard embedded chat (was conditional on HERMES_IGNORE_RULES; now forced True for the parent)
tui_gateway/server.py _background_agent_kwargs TUI background tasks

Construction points intentionally not changed:

  • tools/delegate_tool.py _build_child_agent — already skip_memory=True (children); left as-is so child memory scoping is unchanged.
  • gateway/platforms/feishu_comment.py, gateway _handle_compress_command, the gateway hygiene/compaction agent, background_review/curator — already skip_memory=True.

Flagged for reviewer

  • ACP adapter (acp_adapter/session.py ~624) is another user-facing top-level parent that currently builds a memory manager and pays the same ~4.3s prefetch. It was outside the stated scope (api_server / telegram / TUI) so it is left unchanged here — call out if you want it included.

Tests

  • tests/gateway/test_api_server_toolset.py — assert _create_agent passes skip_memory=True.
  • tests/gateway/test_orchestrator_skip_memory.py (new) — assert the TUI parent helpers pass skip_memory=True; assert a real AIAgent(skip_memory=True) yields _memory_manager is None; assert the delegated-child construction carries its own skip_memory kwarg (parent value does not leak into the child).

Targeted run: 17 passed. Full tests/gateway/ suite: 5862 passed with 9 pre-existing, unrelated failures (telegram-markdown-escaping + wecom-callback test-isolation flakes) that reproduce identically on main without this change.

🤖 Generated with Claude Code

The parent orchestrator calls MemoryManager.prefetch_all() ->
kb_search(top_k=30) to Lore synchronously at the start of every
run_conversation (~4.3s/request), before the first LLM call. The parent
never uses the result for routing: SOUL.md delegates ALL recall to the
`memory` child profile. The prefetch is therefore pure latency.

Pass skip_memory=True at every user-facing PARENT/orchestrator
construction point, which leaves _memory_manager=None so the prefetch is
never reached. Delegated children build their own memory manager
independently and are unaffected — recall still works when the parent
delegates to a memory-capable child.

Parent construction points changed:
- gateway/platforms/api_server.py _create_agent  (HTTP /v1/chat/completions)
- gateway/run.py _run_agent                       (Telegram/Discord/Slack inbound)
- gateway/run.py _run_background_task             (gateway background tasks)
- tui_gateway/server.py _make_agent               (TUI / dashboard chat; was
                                                   conditional on HERMES_IGNORE_RULES,
                                                   now forced True for the parent)
- tui_gateway/server.py _background_agent_kwargs  (TUI background tasks)

Tradeoff (passive writes): the parent's _memory_manager is also what
performs the post-turn passive transcript write
(_sync_external_memory_for_turn -> sync_all -> provider.sync_turn) and
queue_prefetch_all. With skip_memory=True these passive writes no longer
fire for PARENT-handled turns. Explicit memory(action="add") / kb_add via
the MCP knowledge server are unaffected, and delegated children still
sync their own turns. Surfaced for human review before deploy.

Tests:
- tests/gateway/test_api_server_toolset.py: assert _create_agent passes
  skip_memory=True.
- tests/gateway/test_orchestrator_skip_memory.py: assert the TUI parent
  helpers pass skip_memory=True; assert a real AIAgent(skip_memory=True)
  yields _memory_manager is None; assert the delegated child construction
  carries its own skip_memory kwarg (parent value does not leak in).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

🔎 Lint report: feat/orchestrator-skip-memory vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9374 on HEAD, 9373 on base (🆕 +1)

🆕 New issues (1):

Rule Count
unresolved-import 1
First entries
tests/gateway/test_orchestrator_skip_memory.py:21: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 4957 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@davidgut1982

Copy link
Copy Markdown
Owner Author

Closing — superseded by Lore-side performance fixes (kb_get batching + Postgres GIN/random_page_cost planner fix + semantic default recall). Those dropped the parent prefetch from ~4.3s to ~115ms (measured, 40x), so skip_memory would now save only ~115ms while losing passive memory — a bad trade. Keeping this branch for reference; the change still validly decouples the router from Lore availability if ever needed, but it should not merge.

davidgut1982 pushed a commit that referenced this pull request Jun 1, 2026
…NousResearch#34192) (NousResearch#34382)

NousResearch#34192 reports Hostinger's 'Hermes WebUI' catalog crashes on startup
with:

  /usr/bin/tini: No such file or directory

The image moved from tini to s6-overlay as PID 1 (/init) earlier in
2026. Orchestration templates that still pin /usr/bin/tini as the
entrypoint \u2014 like the Hostinger Hermes WebUI catalog \u2014 have no
binary to exec and the container crashes immediately.

Hermes has no control over the Hostinger catalog template, but we can
make the image backward-compatible by symlinking /usr/bin/tini -> /init
during the s6-overlay install step. External wrappers that exec
/usr/bin/tini will land on the same s6-overlay reaper they would have
landed on if they'd used the canonical /init entrypoint.

The image's own ENTRYPOINT continues to be /init verbatim \u2014 the shim
is purely for legacy external wrappers, not for the image's own
runtime path. Once affected catalogs are updated, the symlink can be
removed.

Other issues NousResearch#34192 raises that are NOT addressed by this PR:

  * Problem #2 (UID 1024 vs 10000 mismatch): already fixed by NousResearch#33148
    (S6_KEEP_ENV=1) and NousResearch#32412 (with-contenv shebangs). The Hostinger
    template likely needs to update its env-var propagation.

  * Problem #3 (incompatible session formats): RFC for pluggable
    SessionDB is tracked in NousResearch#23717.

  * Problem #4 (Telegram polling conflict): an operations problem on
    Hostinger's side, not in this codebase.

This PR is scoped to the one issue that can be fixed inside
Dockerfile: the missing /usr/bin/tini binary.

Tests (3 in test_dockerfile_tini_compat_shim.py):

  - test_tini_compat_symlink_present
    Guard: the symlink line must exist in Dockerfile.
  - test_tini_compat_comment_explains_why
    The NousResearch#34192 anchor comment must be present so future readers know
    why the shim is there (avoid accidental removal).
  - test_entrypoint_still_init_not_tini
    Sanity check: ENTRYPOINT remains /init (s6-overlay). The shim is
    only for external wrappers.

Refs: NousResearch#34192
Partial fix: addresses the immediate tini-binary crash. Catalog-side
fixes still needed by Hostinger for the UID and session-format
problems documented in the issue.

Co-authored-by: Cursor <cursoragent@cursor.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant