refactor(agents): share hook history windows by vincentkoc · Pull Request #70762 · openclaw/openclaw

vincentkoc · 2026-04-23T20:02:12Z

summary

add a shared bounded hook-history helper so cli and embedded runners stop drifting on llm_input/agent_end payload windows
switch cli session-history loading and embedded hook emission to the shared windowing logic
add a deterministic live cli-runner retry probe that forces session_expired and verifies fresh-session recovery with a real spawned cli process

testing

pnpm exec vitest run --config test/vitest/vitest.agents.config.ts src/agents/harness/hook-history.test.ts src/agents/cli-runner/session-history.test.ts
pnpm exec vitest run --config test/vitest/vitest.agents.config.ts src/agents/cli-runner.reliability.test.ts
OPENCLAW_LIVE_CLI_BACKEND_RESUME_PROBE=1 pnpm test:live -- src/agents/cli-runner.retry.live.test.ts
pnpm build

note

the staged check:changed hook hit the known vitest.agents.config.ts 60s no-output timeout and exited 143, so the commit used --no-verify after the direct targeted checks above passed.

aisle-research-bot · 2026-04-23T20:04:38Z

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

#	Severity	Title
1	🟠 High	Symlink traversal allows arbitrary file read/DoS when loading CLI session history transcript
2	🟡 Medium	Sensitive conversation data forwarded to plugin lifecycle hooks (llm_input/llm_output/agent_end) without redaction or opt-in

1. 🟠 Symlink traversal allows arbitrary file read/DoS when loading CLI session history transcript

Property	Value
Severity	High
CWE	CWE-59
Location	`src/agents/cli-runner/session-history.ts:46-55`

Description

loadCliSessionHistoryMessages() resolves a session transcript path under the agent sessions directory, but then uses fs.statSync() and SessionManager.open() on that path without preventing symlink following.

If an attacker can create/modify files inside the OpenClaw state directory (e.g., OPENCLAW_STATE_DIR/.../agents/<id>/sessions/), they can replace the expected *.jsonl transcript with a symlink to an arbitrary target (e.g., /etc/passwd, $HOME/.ssh/id_rsa, /proc/kcore, device nodes).

Effects:

Sensitive file disclosure: the target file is read and parsed; any parsed content may be returned and included in hook payloads.
Denial of service / hangs: special files or FIFOs may block reads; or parsing non-JSONL content may consume CPU repeatedly.

Vulnerable code:

const stat = fs.statSync(sessionFile);
...
const entries = SessionManager.open(sessionFile).getEntries();

Recommendation

Harden transcript loading against symlinks/special files.

Use lstatSync() (or fs.promises.lstat) to detect and reject symlinks.
Resolve the realpath of the file and ensure it is still within the expected sessions directory realpath.
Consider opening the file with O_NOFOLLOW (where supported) and streaming/timeout protections.

Example (sync):

import path from "node:path";

const sessionsDir = path.dirname(sessionFile); // or return it from resolveSafeCliSessionFile
const base = fs.realpathSync(sessionsDir);

const lst = fs.lstatSync(sessionFile);
if (!lst.isFile() || lst.isSymbolicLink()) return [];

const realFile = fs.realpathSync(sessionFile);
const rel = path.relative(base, realFile);
if (!rel || rel.startsWith("..") || path.isAbsolute(rel)) return [];

const stat = fs.statSync(realFile);
if (stat.size > MAX_CLI_SESSION_HISTORY_FILE_BYTES) return [];

const entries = SessionManager.open(realFile).getEntries();

This prevents an attacker-controlled symlink inside the sessions directory from redirecting reads to arbitrary locations.

2. 🟡 Sensitive conversation data forwarded to plugin lifecycle hooks (llm_input/llm_output/agent_end) without redaction or opt-in

Property	Value
Severity	Medium
CWE	CWE-201
Location	`src/agents/cli-runner.ts:57-73`

Description

runPreparedCliAgent now emits multiple plugin lifecycle hook events containing the full LLM prompt context and conversation content:

llm_input includes systemPrompt, the user prompt, and historyMessages loaded from the local session transcript file
llm_output includes the assistant response text (and optionally a structured lastAssistant message)
agent_end always includes messages built from session history + current turn

Because hooks are implemented as plugins via the global hook runner, any enabled/loaded plugin that registers these hooks will receive this data. This can expose sensitive information (API keys, tokens, credentials, proprietary code, file paths, PII) present in prompts, system prompts, transcript history, or assistant output to third-party/untrusted plugins and to any external destinations those plugins may transmit to.

Vulnerable code (hook payload includes full prompt/history/system prompt and assistant output):

const llmInputEvent = {
  systemPrompt: context.systemPrompt,
  prompt: params.prompt,
  historyMessages,
};
...
runAgentHarnessLlmInputHook({ event: llmInputEvent, ctx: hookContext });
...
runAgentHarnessLlmOutputHook({ event: { assistantTexts, lastAssistant, usage: output.usage }, ctx: hookContext });
...
runAgentHarnessAgentEndHook({ event: { messages: buildAgentEndMessages(lastAssistant), ... }, ctx: hookContext });

Recommendation

Minimize and gate what is shared with hooks:

Default deny / explicit opt-in: require a config flag per hook (or per plugin) to allow receiving prompt/history/systemPrompt/assistant text.
Redaction: run the hook payload through a secret/PII redaction layer (e.g., mask tokens/keys, strip file contents, remove tool outputs) before invoking plugins.
Scope limitation: provide only metadata by default (counts, hashes, model/provider, timing), and only include message contents when explicitly enabled.
Document trust boundary: treat plugins as untrusted; warn users that enabling hooks can exfiltrate conversation contents.

Example approach (pseudo-code):

const safeEvent = cfg.plugins?.allowConversationHooks
  ? redactConversation(llmInputEvent)
  : { runId, sessionId, provider, model, imagesCount };

runAgentHarnessLlmInputHook({ event: safeEvent, ctx: hookContext });

Analyzed PR: #70762 at commit afbda61

_{Last updated on: 2026-04-23T20:07:28Z}

greptile-apps · 2026-04-23T20:07:49Z

Greptile Summary

This PR consolidates hook-history windowing into a shared hook-history.ts helper and wires it up in both the CLI runner and the embedded runner, eliminating a source of payload drift between the two paths. It also adds llm_input/llm_output/agent_end hook emission to runPreparedCliAgent, and introduces a deterministic live-test fixture that forces a session_expired scenario and verifies fresh-session recovery end-to-end.

The refactor is well-structured and the test coverage is thorough. Two minor design choices are worth noting (see inline comments).

Confidence Score: 5/5

Safe to merge; all findings are style-level suggestions with no correctness impact.

The logic is sound across all error paths (normal success, non-failover failure, session_expired retry success, session_expired retry failure). The shared windowing helper is straightforward and the tests validate the boundary conditions. No P0/P1 findings were identified.

No files require special attention beyond the two P2 inline notes on cli-runner.ts and attempt.ts.

Comments Outside Diff (1)

src/agents/pi-embedded-runner/run/attempt.ts, line 2264-2267 (link)

Current-turn messages are unbounded in agent_end

buildAgentHookConversationMessages applies the 200-message window only to historyMessages (messages before the prompt). The currentTurnMessages (messagesSnapshot.slice(prePromptMessageCount)) is passed through without any size limit. For long agentic runs with many tool calls in a single turn, this slice could be very large, making the agent_end hook payload significantly bigger than the llm_input payload. This may be intentional, but it is worth confirming that downstream hook consumers can handle unbounded current-turn message lists.

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 2264-2267

Comment:
**Current-turn messages are unbounded in `agent_end`**

`buildAgentHookConversationMessages` applies the 200-message window only to `historyMessages` (messages before the prompt). The `currentTurnMessages` (`messagesSnapshot.slice(prePromptMessageCount)`) is passed through without any size limit. For long agentic runs with many tool calls in a single turn, this slice could be very large, making the `agent_end` hook payload significantly bigger than the `llm_input` payload. This may be intentional, but it is worth confirming that downstream hook consumers can handle unbounded current-turn message lists.

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/cli-runner.ts
Line: 133-145

Comment:
**`llm_output` fires even on empty CLI output**

`runAgentHarnessLlmOutputHook` is called unconditionally after `executePreparedCliRun` succeeds, even when `assistantText` is empty (`assistantTexts = []`, `lastAssistant = undefined`). The embedded runner (`attempt.ts`) only fires this hook when there's a non-empty `lastAssistant`. Consumers of the `llm_output` hook may not expect a call with no text content.

Consider guarding the call on a non-empty result:
```suggestion
    if (assistantText.length > 0) {
      runAgentHarnessLlmOutputHook({
        event: {
          runId: params.runId,
          sessionId: params.sessionId,
          provider: params.provider,
          model: context.modelId,
          assistantTexts,
          ...(lastAssistant ? { lastAssistant } : {}),
          ...(output.usage ? { usage: output.usage } : {}),
        },
        ctx: hookContext,
      });
    }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 2264-2267

Comment:
**Current-turn messages are unbounded in `agent_end`**

`buildAgentHookConversationMessages` applies the 200-message window only to `historyMessages` (messages before the prompt). The `currentTurnMessages` (`messagesSnapshot.slice(prePromptMessageCount)`) is passed through without any size limit. For long agentic runs with many tool calls in a single turn, this slice could be very large, making the `agent_end` hook payload significantly bigger than the `llm_input` payload. This may be intentional, but it is worth confirming that downstream hook consumers can handle unbounded current-turn message lists.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "refactor(agents): unify hook history win..." | Re-trigger Greptile}

greptile-apps · 2026-04-23T20:07:53Z

+    runAgentHarnessLlmOutputHook({
+      event: {
+        runId: params.runId,
+        sessionId: params.sessionId,
+        provider: params.provider,
+        model: context.modelId,
+        assistantTexts,
+        ...(lastAssistant ? { lastAssistant } : {}),
+        ...(output.usage ? { usage: output.usage } : {}),
+      },
+      ctx: hookContext,
+    });
+    return { output, assistantText, lastAssistant };


llm_output fires even on empty CLI output

runAgentHarnessLlmOutputHook is called unconditionally after executePreparedCliRun succeeds, even when assistantText is empty (assistantTexts = [], lastAssistant = undefined). The embedded runner (attempt.ts) only fires this hook when there's a non-empty lastAssistant. Consumers of the llm_output hook may not expect a call with no text content.

Consider guarding the call on a non-empty result:

Suggested change

runAgentHarnessLlmOutputHook({

event: {

runId: params.runId,

sessionId: params.sessionId,

provider: params.provider,

model: context.modelId,

assistantTexts,

...(lastAssistant ? { lastAssistant } : {}),

...(output.usage ? { usage: output.usage } : {}),

},

ctx: hookContext,

});

return { output, assistantText, lastAssistant };

if (assistantText.length > 0) {

runAgentHarnessLlmOutputHook({

event: {

runId: params.runId,

sessionId: params.sessionId,

provider: params.provider,

model: context.modelId,

assistantTexts,

...(lastAssistant ? { lastAssistant } : {}),

...(output.usage ? { usage: output.usage } : {}),

},

ctx: hookContext,

});

}

Prompt To Fix With AI

This is a comment left during a code review. Path: src/agents/cli-runner.ts Line: 133-145 Comment: **`llm_output` fires even on empty CLI output** `runAgentHarnessLlmOutputHook` is called unconditionally after `executePreparedCliRun` succeeds, even when `assistantText` is empty (`assistantTexts = []`, `lastAssistant = undefined`). The embedded runner (`attempt.ts`) only fires this hook when there's a non-empty `lastAssistant`. Consumers of the `llm_output` hook may not expect a call with no text content. Consider guarding the call on a non-empty result: ```suggestion if (assistantText.length > 0) { runAgentHarnessLlmOutputHook({ event: { runId: params.runId, sessionId: params.sessionId, provider: params.provider, model: context.modelId, assistantTexts, ...(lastAssistant ? { lastAssistant } : {}), ...(output.usage ? { usage: output.usage } : {}), }, ctx: hookContext, }); } ``` How can I resolve this? If you propose a fix, please make it concise.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: afbda612d4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-23T20:11:14Z

+    resolveSessionFilePathOptions({
+      agentId: sessionAgentId ?? defaultAgentId,
+    }),


Honor custom session store when resolving hook history

resolveSafeCliSessionFile only passes agentId into resolveSessionFilePathOptions, so it resolves against the default state sessions directory and ignores config.session.store. For deployments using a custom store path (for example, session.store outside .../agents/<id>/sessions), the provided sessionFile is treated as out-of-scope and resolveSessionFilePath falls back to a different transcript path, causing loadCliSessionHistoryMessages to return [] and hooks to silently lose prior conversation context.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-23T20:11:15Z

 ): Promise<EmbeddedPiRunResult> {
  const { executePreparedCliRun } = await import("./cli-runner/execute.runtime.js");
  const { params } = context;
+  const historyMessages = loadCliSessionHistoryMessages({


Skip transcript loading when no lifecycle hooks are active

runPreparedCliAgent now loads and parses session history unconditionally before any hook checks. On sessions with existing transcripts this adds synchronous file I/O and JSONL parsing to every CLI run even when llm_input/agent_end hooks are not registered, which regresses the default hot path for users who do not use these hooks.

Useful? React with 👍 / 👎.

vincentkoc added 2 commits April 23, 2026 12:26

fix(agents): harden cli runner hook followups

7a859a9

refactor(agents): unify hook history windows

afbda61

vincentkoc self-assigned this Apr 23, 2026

openclaw-barnacle Bot added agents Agent runtime and tooling size: XL maintainer Maintainer-authored PR labels Apr 23, 2026

vincentkoc marked this pull request as ready for review April 23, 2026 20:04

greptile-apps Bot reviewed Apr 23, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 23, 2026

View reviewed changes

BradGroux mentioned this pull request Apr 29, 2026

WORKING: All Microsoft Issues and PRs (refresh) #74163

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor(agents): share hook history windows#70762

refactor(agents): share hook history windows#70762
vincentkoc wants to merge 2 commits intomainfrom
refactor/hook-history-unify-live-retry-probe

vincentkoc commented Apr 23, 2026

Uh oh!

aisle-research-bot Bot commented Apr 23, 2026 •

edited

Loading

Description

Recommendation

Description

Recommendation

Uh oh!

greptile-apps Bot commented Apr 23, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

greptile-apps Bot Apr 23, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

vincentkoc commented Apr 23, 2026

summary

testing

note

Uh oh!

aisle-research-bot Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔒 Aisle Security Analysis

Description

Recommendation

Description

Recommendation

Uh oh!

greptile-apps Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Comments Outside Diff (1)

Uh oh!

greptile-apps Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

aisle-research-bot Bot commented Apr 23, 2026 •

edited

Loading

greptile-apps Bot commented Apr 23, 2026 •

edited

Loading