feat(agent): allow pre_api_request hook to inject request-scoped context by devteapot · Pull Request #9914 · NousResearch/hermes-agent

devteapot · 2026-04-14T23:19:44Z

What & Why

The pre_api_request hook currently exists as a fire-and-forget observer: in run_agent.py, its call site is wrapped in try / except: pass, so any return values are discarded. This means plugins cannot refresh the model's context between retries inside the tool loop — they have to fall back to pre_llm_call, which fires only once per turn.

That gap forces per-request context (live browser snapshots, dynamic app/state trees, progress hints, request-scoped guardrails) to go stale the moment the agent enters a multi-step tool loop. A snapshot attached at the start of the turn can easily be minutes and many tool calls out of date by the time the second or third provider request goes out.

This PR completes the hook by collecting its return values the same way pre_llm_call already does and appending the resulting context to the current turn's user message on the outbound request copy only. The injection is rebuilt from base_api_messages on every retry, so each attempt sees a fresh snapshot and nothing leaks into session persistence or trajectory logs.

The hook callback now also receives user_message, conversation_history, and is_first_turn so plugins can make injection decisions without reaching into private agent state.

Changes

run_agent.py
- Extract _collect_ephemeral_hook_context() and _append_ephemeral_context_to_user_message() helpers, and route the existing pre_llm_call injection through them.
- Split the API-messages build into a turn-scoped base_api_messages and a per-attempt api_messages = copy.deepcopy(base_api_messages) inside the retry loop.
- Invoke pre_api_request per attempt with the new kwargs, collect its return value, and append the context to the current turn's user message on that attempt's copy.
- Re-apply Anthropic cache-control, sanitization, and request-size metrics per attempt so downstream code sees consistent state after injection.
hermes_cli/plugins.py — docstring: note that pre_api_request (alongside pre_llm_call) may return a {"context": "..."} dict.
tests/run_agent/test_run_agent.py
- New test_pre_api_request_can_inject_request_scoped_context covering two requests in a single turn (first request triggers a tool call, second request gets a distinct injected string). Asserts both strings land in the right request and do not bleed across.
- Extend the existing test_pre_api_request_and_post_api_request_fire_per_request with assertions on the new user_message, conversation_history, and is_first_turn kwargs.
Docs — website/docs/user-guide/features/hooks.md, website/docs/user-guide/features/plugins.md, website/docs/guides/build-a-hermes-plugin.md:
- Add pre_api_request / post_api_request to the quick-reference tables.
- Document the full pre_api_request callback signature, parameters, fire semantics, return contract, injection site, and a "live state snapshot" example.
- Update the "all hooks are fire-and-forget" caveat to call out both pre_llm_call and pre_api_request as context-injecting.

How to test

uv run pytest tests/run_agent/ -o addopts=""

Targeted:

uv run pytest tests/run_agent/test_run_agent.py::TestRunConversation \
  -k "pre_api_request or pre_llm_call" -o addopts=""

Manual smoke test — register a plugin with:

def inject_live_state(api_call_count, conversation_history, **kwargs):
    return {"context": f"Snapshot for request {api_call_count} "
                       f"(history len={len(conversation_history)})"}

def register(ctx):
    ctx.register_hook("pre_api_request", inject_live_state)

Then run any multi-step task and confirm each outbound request's user message ends with a snapshot whose api_call_count matches that attempt.

Platforms tested

macOS 15 / Python 3.11.15 — uv run pytest tests/run_agent/ → 670 passed, 6 skipped.

No changes to file I/O, process management, terminal handling, or shell commands, so Windows/Linux behavior should be unaffected. Happy to run on another platform if requested.

Backwards compatibility

pre_api_request callbacks that return None (or any non-dict/non-string value) behave exactly as before — nothing is injected.
Existing pre_llm_call context-injection path is unchanged; the helpers it was inlining are now extracted but the semantics are identical.
run_agent.py's retry loop still rebuilds api_kwargs via _build_api_kwargs and still runs _preflight_codex_api_kwargs when api_mode == "codex_responses".

The pre_api_request hook was introduced as a fire-and-forget observer: its return values were caught and discarded, so plugins could not refresh the model's context between retries inside the tool loop. pre_llm_call fires only once per turn, which forces per-request state (live browser snapshots, dynamic app trees, progress hints) to go stale. This completes the hook by collecting its return values the same way pre_llm_call already does and appending the resulting context to the current turn's user message on the outbound request copy only. The injection is rebuilt from base_api_messages on every retry, so each attempt sees a fresh snapshot and nothing leaks into session persistence. The hook now also receives user_message, conversation_history, and is_first_turn so plugins can make injection decisions without reaching into private agent state. - Extract _collect_ephemeral_hook_context and _append_ephemeral_context_to_user_message helpers (shared with the existing pre_llm_call path). - Split the api_messages build into a turn-scoped base and a per-attempt copy; re-apply Anthropic cache control and sanitization per attempt. - Update the pre_api_request docstring, hooks.md, plugins.md, and the build-a-hermes-plugin guide to document the new contract. - Add test_pre_api_request_can_inject_request_scoped_context covering two requests in a single turn with distinct injected contexts, and extend the existing per-request hook test to assert the new kwargs. Tested on macOS 15 with pytest tests/run_agent/ (670 passed, 6 skipped).

devteapot force-pushed the feat/request-scoped-context-hook branch from a87e138 to 5e87b50 Compare April 14, 2026 23:25

alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have comp/plugins Plugin system and bundled plugins comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 26, 2026

bbednarski9 mentioned this pull request May 22, 2026

feat(observability): add observer-grade telemetry hooks #29722

Closed

19 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agent): allow pre_api_request hook to inject request-scoped context#9914

feat(agent): allow pre_api_request hook to inject request-scoped context#9914
devteapot wants to merge 1 commit into
NousResearch:mainfrom
devteapot:feat/request-scoped-context-hook

devteapot commented Apr 14, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

devteapot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What & Why

Changes

How to test

Platforms tested

Backwards compatibility

Related

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

devteapot commented Apr 14, 2026 •

edited

Loading