Skip to content

refactor(agents): share hook history windows#70762

Open
vincentkoc wants to merge 2 commits intomainfrom
refactor/hook-history-unify-live-retry-probe
Open

refactor(agents): share hook history windows#70762
vincentkoc wants to merge 2 commits intomainfrom
refactor/hook-history-unify-live-retry-probe

Conversation

@vincentkoc
Copy link
Copy Markdown
Member

summary

  • add a shared bounded hook-history helper so cli and embedded runners stop drifting on llm_input/agent_end payload windows
  • switch cli session-history loading and embedded hook emission to the shared windowing logic
  • add a deterministic live cli-runner retry probe that forces session_expired and verifies fresh-session recovery with a real spawned cli process

testing

  • pnpm exec vitest run --config test/vitest/vitest.agents.config.ts src/agents/harness/hook-history.test.ts src/agents/cli-runner/session-history.test.ts
  • pnpm exec vitest run --config test/vitest/vitest.agents.config.ts src/agents/cli-runner.reliability.test.ts
  • OPENCLAW_LIVE_CLI_BACKEND_RESUME_PROBE=1 pnpm test:live -- src/agents/cli-runner.retry.live.test.ts
  • pnpm build

note

  • the staged check:changed hook hit the known vitest.agents.config.ts 60s no-output timeout and exited 143, so the commit used --no-verify after the direct targeted checks above passed.

@vincentkoc vincentkoc self-assigned this Apr 23, 2026
@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XL maintainer Maintainer-authored PR labels Apr 23, 2026
@vincentkoc vincentkoc marked this pull request as ready for review April 23, 2026 20:04
@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot Bot commented Apr 23, 2026

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

# Severity Title
1 🟠 High Symlink traversal allows arbitrary file read/DoS when loading CLI session history transcript
2 🟡 Medium Sensitive conversation data forwarded to plugin lifecycle hooks (llm_input/llm_output/agent_end) without redaction or opt-in
1. 🟠 Symlink traversal allows arbitrary file read/DoS when loading CLI session history transcript
Property Value
Severity High
CWE CWE-59
Location src/agents/cli-runner/session-history.ts:46-55

Description

loadCliSessionHistoryMessages() resolves a session transcript path under the agent sessions directory, but then uses fs.statSync() and SessionManager.open() on that path without preventing symlink following.

If an attacker can create/modify files inside the OpenClaw state directory (e.g., OPENCLAW_STATE_DIR/.../agents/<id>/sessions/), they can replace the expected *.jsonl transcript with a symlink to an arbitrary target (e.g., /etc/passwd, $HOME/.ssh/id_rsa, /proc/kcore, device nodes).

Effects:

  • Sensitive file disclosure: the target file is read and parsed; any parsed content may be returned and included in hook payloads.
  • Denial of service / hangs: special files or FIFOs may block reads; or parsing non-JSONL content may consume CPU repeatedly.

Vulnerable code:

const stat = fs.statSync(sessionFile);
...
const entries = SessionManager.open(sessionFile).getEntries();

Recommendation

Harden transcript loading against symlinks/special files.

  1. Use lstatSync() (or fs.promises.lstat) to detect and reject symlinks.
  2. Resolve the realpath of the file and ensure it is still within the expected sessions directory realpath.
  3. Consider opening the file with O_NOFOLLOW (where supported) and streaming/timeout protections.

Example (sync):

import path from "node:path";

const sessionsDir = path.dirname(sessionFile); // or return it from resolveSafeCliSessionFile
const base = fs.realpathSync(sessionsDir);

const lst = fs.lstatSync(sessionFile);
if (!lst.isFile() || lst.isSymbolicLink()) return [];

const realFile = fs.realpathSync(sessionFile);
const rel = path.relative(base, realFile);
if (!rel || rel.startsWith("..") || path.isAbsolute(rel)) return [];

const stat = fs.statSync(realFile);
if (stat.size > MAX_CLI_SESSION_HISTORY_FILE_BYTES) return [];

const entries = SessionManager.open(realFile).getEntries();

This prevents an attacker-controlled symlink inside the sessions directory from redirecting reads to arbitrary locations.

2. 🟡 Sensitive conversation data forwarded to plugin lifecycle hooks (llm_input/llm_output/agent_end) without redaction or opt-in
Property Value
Severity Medium
CWE CWE-201
Location src/agents/cli-runner.ts:57-73

Description

runPreparedCliAgent now emits multiple plugin lifecycle hook events containing the full LLM prompt context and conversation content:

  • llm_input includes systemPrompt, the user prompt, and historyMessages loaded from the local session transcript file
  • llm_output includes the assistant response text (and optionally a structured lastAssistant message)
  • agent_end always includes messages built from session history + current turn

Because hooks are implemented as plugins via the global hook runner, any enabled/loaded plugin that registers these hooks will receive this data. This can expose sensitive information (API keys, tokens, credentials, proprietary code, file paths, PII) present in prompts, system prompts, transcript history, or assistant output to third-party/untrusted plugins and to any external destinations those plugins may transmit to.

Vulnerable code (hook payload includes full prompt/history/system prompt and assistant output):

const llmInputEvent = {
  systemPrompt: context.systemPrompt,
  prompt: params.prompt,
  historyMessages,
};
...
runAgentHarnessLlmInputHook({ event: llmInputEvent, ctx: hookContext });
...
runAgentHarnessLlmOutputHook({ event: { assistantTexts, lastAssistant, usage: output.usage }, ctx: hookContext });
...
runAgentHarnessAgentEndHook({ event: { messages: buildAgentEndMessages(lastAssistant), ... }, ctx: hookContext });

Recommendation

Minimize and gate what is shared with hooks:

  • Default deny / explicit opt-in: require a config flag per hook (or per plugin) to allow receiving prompt/history/systemPrompt/assistant text.
  • Redaction: run the hook payload through a secret/PII redaction layer (e.g., mask tokens/keys, strip file contents, remove tool outputs) before invoking plugins.
  • Scope limitation: provide only metadata by default (counts, hashes, model/provider, timing), and only include message contents when explicitly enabled.
  • Document trust boundary: treat plugins as untrusted; warn users that enabling hooks can exfiltrate conversation contents.

Example approach (pseudo-code):

const safeEvent = cfg.plugins?.allowConversationHooks
  ? redactConversation(llmInputEvent)
  : { runId, sessionId, provider, model, imagesCount };

runAgentHarnessLlmInputHook({ event: safeEvent, ctx: hookContext });

Analyzed PR: #70762 at commit afbda61

Last updated on: 2026-04-23T20:07:28Z

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR consolidates hook-history windowing into a shared hook-history.ts helper and wires it up in both the CLI runner and the embedded runner, eliminating a source of payload drift between the two paths. It also adds llm_input/llm_output/agent_end hook emission to runPreparedCliAgent, and introduces a deterministic live-test fixture that forces a session_expired scenario and verifies fresh-session recovery end-to-end.

The refactor is well-structured and the test coverage is thorough. Two minor design choices are worth noting (see inline comments).

Confidence Score: 5/5

Safe to merge; all findings are style-level suggestions with no correctness impact.

The logic is sound across all error paths (normal success, non-failover failure, session_expired retry success, session_expired retry failure). The shared windowing helper is straightforward and the tests validate the boundary conditions. No P0/P1 findings were identified.

No files require special attention beyond the two P2 inline notes on cli-runner.ts and attempt.ts.

Comments Outside Diff (1)

  1. src/agents/pi-embedded-runner/run/attempt.ts, line 2264-2267 (link)

    P2 Current-turn messages are unbounded in agent_end

    buildAgentHookConversationMessages applies the 200-message window only to historyMessages (messages before the prompt). The currentTurnMessages (messagesSnapshot.slice(prePromptMessageCount)) is passed through without any size limit. For long agentic runs with many tool calls in a single turn, this slice could be very large, making the agent_end hook payload significantly bigger than the llm_input payload. This may be intentional, but it is worth confirming that downstream hook consumers can handle unbounded current-turn message lists.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/agents/pi-embedded-runner/run/attempt.ts
    Line: 2264-2267
    
    Comment:
    **Current-turn messages are unbounded in `agent_end`**
    
    `buildAgentHookConversationMessages` applies the 200-message window only to `historyMessages` (messages before the prompt). The `currentTurnMessages` (`messagesSnapshot.slice(prePromptMessageCount)`) is passed through without any size limit. For long agentic runs with many tool calls in a single turn, this slice could be very large, making the `agent_end` hook payload significantly bigger than the `llm_input` payload. This may be intentional, but it is worth confirming that downstream hook consumers can handle unbounded current-turn message lists.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/cli-runner.ts
Line: 133-145

Comment:
**`llm_output` fires even on empty CLI output**

`runAgentHarnessLlmOutputHook` is called unconditionally after `executePreparedCliRun` succeeds, even when `assistantText` is empty (`assistantTexts = []`, `lastAssistant = undefined`). The embedded runner (`attempt.ts`) only fires this hook when there's a non-empty `lastAssistant`. Consumers of the `llm_output` hook may not expect a call with no text content.

Consider guarding the call on a non-empty result:
```suggestion
    if (assistantText.length > 0) {
      runAgentHarnessLlmOutputHook({
        event: {
          runId: params.runId,
          sessionId: params.sessionId,
          provider: params.provider,
          model: context.modelId,
          assistantTexts,
          ...(lastAssistant ? { lastAssistant } : {}),
          ...(output.usage ? { usage: output.usage } : {}),
        },
        ctx: hookContext,
      });
    }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 2264-2267

Comment:
**Current-turn messages are unbounded in `agent_end`**

`buildAgentHookConversationMessages` applies the 200-message window only to `historyMessages` (messages before the prompt). The `currentTurnMessages` (`messagesSnapshot.slice(prePromptMessageCount)`) is passed through without any size limit. For long agentic runs with many tool calls in a single turn, this slice could be very large, making the `agent_end` hook payload significantly bigger than the `llm_input` payload. This may be intentional, but it is worth confirming that downstream hook consumers can handle unbounded current-turn message lists.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "refactor(agents): unify hook history win..." | Re-trigger Greptile

Comment thread src/agents/cli-runner.ts
Comment on lines +133 to +145
runAgentHarnessLlmOutputHook({
event: {
runId: params.runId,
sessionId: params.sessionId,
provider: params.provider,
model: context.modelId,
assistantTexts,
...(lastAssistant ? { lastAssistant } : {}),
...(output.usage ? { usage: output.usage } : {}),
},
ctx: hookContext,
});
return { output, assistantText, lastAssistant };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 llm_output fires even on empty CLI output

runAgentHarnessLlmOutputHook is called unconditionally after executePreparedCliRun succeeds, even when assistantText is empty (assistantTexts = [], lastAssistant = undefined). The embedded runner (attempt.ts) only fires this hook when there's a non-empty lastAssistant. Consumers of the llm_output hook may not expect a call with no text content.

Consider guarding the call on a non-empty result:

Suggested change
runAgentHarnessLlmOutputHook({
event: {
runId: params.runId,
sessionId: params.sessionId,
provider: params.provider,
model: context.modelId,
assistantTexts,
...(lastAssistant ? { lastAssistant } : {}),
...(output.usage ? { usage: output.usage } : {}),
},
ctx: hookContext,
});
return { output, assistantText, lastAssistant };
if (assistantText.length > 0) {
runAgentHarnessLlmOutputHook({
event: {
runId: params.runId,
sessionId: params.sessionId,
provider: params.provider,
model: context.modelId,
assistantTexts,
...(lastAssistant ? { lastAssistant } : {}),
...(output.usage ? { usage: output.usage } : {}),
},
ctx: hookContext,
});
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/cli-runner.ts
Line: 133-145

Comment:
**`llm_output` fires even on empty CLI output**

`runAgentHarnessLlmOutputHook` is called unconditionally after `executePreparedCliRun` succeeds, even when `assistantText` is empty (`assistantTexts = []`, `lastAssistant = undefined`). The embedded runner (`attempt.ts`) only fires this hook when there's a non-empty `lastAssistant`. Consumers of the `llm_output` hook may not expect a call with no text content.

Consider guarding the call on a non-empty result:
```suggestion
    if (assistantText.length > 0) {
      runAgentHarnessLlmOutputHook({
        event: {
          runId: params.runId,
          sessionId: params.sessionId,
          provider: params.provider,
          model: context.modelId,
          assistantTexts,
          ...(lastAssistant ? { lastAssistant } : {}),
          ...(output.usage ? { usage: output.usage } : {}),
        },
        ctx: hookContext,
      });
    }
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: afbda612d4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +32 to +34
resolveSessionFilePathOptions({
agentId: sessionAgentId ?? defaultAgentId,
}),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Honor custom session store when resolving hook history

resolveSafeCliSessionFile only passes agentId into resolveSessionFilePathOptions, so it resolves against the default state sessions directory and ignores config.session.store. For deployments using a custom store path (for example, session.store outside .../agents/<id>/sessions), the provided sessionFile is treated as out-of-scope and resolveSessionFilePath falls back to a different transcript path, causing loadCliSessionHistoryMessages to return [] and hooks to silently lose prior conversation context.

Useful? React with 👍 / 👎.

Comment thread src/agents/cli-runner.ts
): Promise<EmbeddedPiRunResult> {
const { executePreparedCliRun } = await import("./cli-runner/execute.runtime.js");
const { params } = context;
const historyMessages = loadCliSessionHistoryMessages({
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip transcript loading when no lifecycle hooks are active

runPreparedCliAgent now loads and parses session history unconditionally before any hook checks. On sessions with existing transcripts this adds synchronous file I/O and JSONL parsing to every CLI run even when llm_input/agent_end hooks are not registered, which regresses the default hot path for users who do not use these hooks.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant