Skip to content

fix(agent): preserve prompt cache + tree-dedup across background review (reference impl for #25322 Option B)#25427

Closed
simpolism wants to merge 1 commit into
NousResearch:mainfrom
simpolism:feat/background-review-system-prompt-inheritance
Closed

fix(agent): preserve prompt cache + tree-dedup across background review (reference impl for #25322 Option B)#25427
simpolism wants to merge 1 commit into
NousResearch:mainfrom
simpolism:feat/background-review-system-prompt-inheritance

Conversation

@simpolism

Copy link
Copy Markdown
Contributor

What does this PR do?

Reference implementation of Option B from issue #25322. Fixes the bug where the background self-improvement review's freshly-spawned AIAgent generates a system prompt that bytes-differs from the parent's, busting Anthropic prompt cache + preprocessing tree-dedup.

See #25322 for the full diagnosis, reproducer, and trade-off discussion. That issue lays out three options for fixing this bug, with different safety vs cache-hit trade-offs. This PR implements Option B (the one with maximum cache hit, at the cost of moving the safety boundary from "mechanical narrow toolset" to "prompt-layer instruction"). I'm filing it for maintainer reference / evaluation — happy to rewrite as Option A (preserves mechanical safety, accepts a slightly leaky system prompt) or Option C (smaller change, partial cache hit) if maintainers prefer.

Posting this PR alongside the issue because @Teknium asked for the code to be available as a reference; treat this as the worked-out version of one of the three branches in the discussion, not as a "please merge this" PR until we've agreed on which option is right.

Related Issue

Refs #25322 (the bug + design-space discussion)

The safety motivation for the original narrow-toolset design is #15204.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

(Behavior change for the background review pass: still produces the same memory/skill side effects, but now does so with a system message that prefix-matches the parent's.)

Changes Made

  • run_agent.py: _spawn_background_review now constructs the fork inheriting
    enabled_toolsets, session_start, session_id, and _cached_system_prompt
    from the parent. The narrow-toolset call (enabled_toolsets=["memory", "skills"])
    is removed; the safety boundary moves to a prompt-layer instruction baked into
    each review prompt (_MEMORY_REVIEW_PROMPT, _SKILL_REVIEW_PROMPT,
    _COMBINED_REVIEW_PROMPT).
  • tests/run_agent/test_background_review.py:
    new test_background_review_inherits_parent_system_prompt_and_toolsets
    asserts that the spawned review agent gets the parent's _cached_system_prompt,
    session_start, session_id, and enabled_toolsets verbatim.
  • tests/run_agent/test_background_review_toolset_restriction.py:
    inverted the existing assertions:
    • was: assert sorted(captured["enabled_toolsets"]) == ["memory", "skills"]
    • now: captured["enabled_toolsets"] == parent_toolsets
    • plus new test: review prompts contain "Only use memory/skill tools" instruction
    • this is the most reviewer-controversial diff and the part that needs careful judgment

How to Test

Run the impacted test files:

pytest tests/run_agent/test_background_review.py tests/run_agent/test_background_review_toolset_restriction.py -v

6/6 pass on this branch.

To observe the cache-hit improvement manually: configure Anthropic provider, run an interactive session, trigger a background review (memory- or skill-relevant turn), and inspect cached_tokens in the response of the review's HTTP call. Before this PR: 0. After: matches the parent's cached prefix size.

The empirical reproducer for the underlying bug (showing that parent and review system messages bytes-differ) is in #25322.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits
  • I searched for existing PRs — no duplicate
  • My PR contains only changes related to this fix
  • I've run the impacted test files; all pass
  • I've added tests for my changes — new positive test for inheritance + inverted the existing restriction tests + new test for prompt-layer instruction
  • I've tested on my platform: Ubuntu 24.04

Documentation & Housekeeping

  • I've updated relevant documentation — N/A (internal behavior change; no user-facing config keys or commands added)
  • cli-config.yaml.example — N/A, no config key changes
  • CONTRIBUTING.md / AGENTS.md — N/A
  • Cross-platform impact — N/A, internal agent logic
  • Tool descriptions/schemas — N/A

Open question for maintainers

The safety relaxation is the part of this PR that I genuinely don't know is the right move. The two questions I'd want maintainer judgment on:

  1. How often do background-review passes empirically try to invoke non-memory/skill tools? If literally never, prompt-layer enforcement is fine. If sometimes, mechanical lockdown (Option A) is the better fix shape.
  2. Are there downstream invariants in run_agent.py that assume _cached_system_prompt content matches the agent's actual tool_schemas? I haven't found any, but a more careful audit by someone who knows the cache-control machinery would catch what I missed. Option A would put us in that "system prompt describes tools the model can't call" state, which is benign in API semantics (the model just discovers unavailability when it tries) but might break some sanity check.

If maintainers prefer either Option A or Option C from #25322, I'll close this PR and re-open with the alternative. The work to switch is small — most of the diff here is the test inversions, not the production change.

The background memory/skill review spawns a fresh AIAgent with a narrow
toolset, which means the review's system prompt diverges from the parent's
at three points (see issue NousResearch#25322 for the full diagnosis + reproducer):

  1. The 'Conversation started:' timestamp line (fresh _hermes_now() per spawn)
  2. The 'Session ID:' line (fresh uuid per spawn)
  3. The skills_prompt block (narrower enabled_toolsets produces a different
     list of available skills, plus tool-aware guidance blocks differ)

Consequence: the review's first HTTP request to the model bit-mismatches
the parent's system message, which (a) misses upstream prompt caches
(Anthropic / OpenRouter / Nous Portal — every skill nudge pays the full
uncached system-prompt cost), and (b) forks the preprocessing tree-dedup
at position 0, producing two separate conversation leaves for what is
logically one user session.  Recent PR NousResearch#24778 ('kill long-lived prefix
layout') fixed the intra-instance mutation mode; the bug here is the
cross-instance divergence, which survives NousResearch#24778.

Fix (Option B in NousResearch#25322's enumeration): spawn the review inheriting
parent's enabled_toolsets, session_start, session_id, and
_cached_system_prompt.  The review's outbound system message is now
bit-identical to the parent's.  Prompt-cache prefix match works; the
preprocessing pipeline collapses the review into the same leaf as the
main session.

Safety trade-off (the part that warrants careful review):

The original narrow-toolset design (issue NousResearch#15204) was a mechanical
safeguard — narrow toolset means terminal/send_message/delegate_task
schemas are literally not in the API call's 'tools' field, so the
model cannot invoke them.  This PR moves that safety boundary to the
prompt layer: each review prompt now carries an explicit 'Only use
memory/skill tools; do not invoke any other tool' instruction.
Mechanically the review *could* now call terminal etc. if it chose to
ignore the instruction.

This is the trade-off named as Option B in NousResearch#25322.  Option A (inherit
_cached_system_prompt from parent but keep enabled_toolsets=['memory',
'skills'] on the review) would preserve the mechanical safety AND get
the cache hit at the cost of a system prompt that 'describes' tools the
review can't actually call.  Option C (inherit only session_start +
session_id, rebuild the rest based on narrow toolset) preserves safety
entirely but only gets a partial cache hit.

Filing this PR as the implementation of Option B for maintainer
reference / evaluation.  Happy to rewrite as Option A or C if
maintainers prefer.  See issue NousResearch#25322 for the full trade-off discussion.

Tests:
  - tests/run_agent/test_background_review.py: new
    test_background_review_inherits_parent_system_prompt_and_toolsets
    asserts cached-prompt, session_start, session_id, and toolset copy-over
  - tests/run_agent/test_background_review_toolset_restriction.py:
    inverted the assertion (was: narrow toolset, now: inherit parent's) +
    added test that prompts carry the scoping instruction
  - _bare_agent() test fixture gains session_start, _cached_system_prompt,
    enabled_toolsets

Tested on Ubuntu 24.04.

Refs NousResearch#25322
teknium1 added a commit that referenced this pull request May 14, 2026
Belt-and-suspenders complement to the cached-system-prompt inheritance:
pin session_start and session_id to the parent's so any code path that
re-renders parts of the system prompt (compression, plugin hooks)
still produces byte-identical output. The cached-prompt assignment
already short-circuits the normal rebuild path, but these pins
guarantee parity even if a future code path bypasses the cache.

Idea from simpolism's reference PR #25427 for #25322.

Co-Authored-By: simpolism <32201324+simpolism@users.noreply.github.com>
@teknium1

Copy link
Copy Markdown
Contributor

Salvaged via PR #25434 alongside @WorldWriter's #17276 (which had the runtime-whitelist enforcement layer). Your session_start/session_id pinning idea was incorporated as a defensive complement. We kept the codex_app_server → codex_responses downgrade that your PR removed, since that path is load-bearing for review forks on codex_app_server runtime. Thanks for the thorough diagnosis in #25322 and the reference implementation — the option-by-option trade-off framing in the issue made the merge path clear.

@teknium1 teknium1 closed this May 14, 2026
@alt-glitch alt-glitch added type/perf Performance improvement or optimization P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder tool/memory Memory tool and memory providers labels May 14, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Superseded by #25434, which salvages this PR's approach together with #17276 into the final implementation. See #25322 for the tracking issue.

sunJose pushed a commit to sunJose/hermes-agent that referenced this pull request May 14, 2026
Belt-and-suspenders complement to the cached-system-prompt inheritance:
pin session_start and session_id to the parent's so any code path that
re-renders parts of the system prompt (compression, plugin hooks)
still produces byte-identical output. The cached-prompt assignment
already short-circuits the normal rebuild path, but these pins
guarantee parity even if a future code path bypasses the cache.

Idea from simpolism's reference PR NousResearch#25427 for NousResearch#25322.

Co-Authored-By: simpolism <32201324+simpolism@users.noreply.github.com>
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
Belt-and-suspenders complement to the cached-system-prompt inheritance:
pin session_start and session_id to the parent's so any code path that
re-renders parts of the system prompt (compression, plugin hooks)
still produces byte-identical output. The cached-prompt assignment
already short-circuits the normal rebuild path, but these pins
guarantee parity even if a future code path bypasses the cache.

Idea from simpolism's reference PR NousResearch#25427 for NousResearch#25322.

Co-Authored-By: simpolism <32201324+simpolism@users.noreply.github.com>
AlexFoxD pushed a commit to AlexFoxD/hermes-agent that referenced this pull request May 21, 2026
Belt-and-suspenders complement to the cached-system-prompt inheritance:
pin session_start and session_id to the parent's so any code path that
re-renders parts of the system prompt (compression, plugin hooks)
still produces byte-identical output. The cached-prompt assignment
already short-circuits the normal rebuild path, but these pins
guarantee parity even if a future code path bypasses the cache.

Idea from simpolism's reference PR NousResearch#25427 for NousResearch#25322.

Co-Authored-By: simpolism <32201324+simpolism@users.noreply.github.com>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Belt-and-suspenders complement to the cached-system-prompt inheritance:
pin session_start and session_id to the parent's so any code path that
re-renders parts of the system prompt (compression, plugin hooks)
still produces byte-identical output. The cached-prompt assignment
already short-circuits the normal rebuild path, but these pins
guarantee parity even if a future code path bypasses the cache.

Idea from simpolism's reference PR NousResearch#25427 for NousResearch#25322.

Co-Authored-By: simpolism <32201324+simpolism@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists tool/memory Memory tool and memory providers type/perf Performance improvement or optimization

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants