Skip to content

fix: clear phantom Claude CLI resumes#70317

Merged
steipete merged 1 commit into
mainfrom
fix/70177-claude-transcript-resume
Apr 22, 2026
Merged

fix: clear phantom Claude CLI resumes#70317
steipete merged 1 commit into
mainfrom
fix/70177-claude-transcript-resume

Conversation

@steipete

Copy link
Copy Markdown
Contributor

Summary

  • verify stored Claude CLI session ids have a readable ~/.claude/projects/*/<sessionId>.jsonl transcript with assistant content before reusing them
  • clear missing-transcript bindings before the CLI runner can pass --resume
  • add regression coverage for missing transcripts, valid transcripts, and path-like session ids

Fixes #70177.

Testing

  • pnpm test src/agents/command/attempt-execution.test.ts src/agents/command/attempt-execution.cli.test.ts src/agents/command/session-store.test.ts
  • pnpm check:changed
  • pnpm check
  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm check:changed --staged

Full suite note

pnpm test was also attempted. The run hit unrelated failures outside this patch:

  • src/gateway/server-startup.test.ts timed out in skips static warmup for configured CLI backends
  • extensions/openai/provider-runtime.contract.test.ts failed during OpenAI Codex OAuth refresh with Failed to refresh OpenAI Codex token
  • the run later hung in vitest.contracts-channel-surface.config.ts after that shard had already printed a pass summary, so it was stopped with SIGTERM

@aisle-research-bot

aisle-research-bot Bot commented Apr 22, 2026

Copy link
Copy Markdown

🔒 Aisle Security Analysis

We found 3 potential security issue(s) in this PR:

# Severity Title
1 🟡 Medium TOCTOU symlink race when inspecting JSONL transcripts (lstat then open)
2 🔵 Low Unbounded enumeration of ~/.claude/projects and parsing of arbitrary-size JSONL lines can cause local DoS
3 🔵 Low Log injection via unsanitized sessionKey/sessionId in warning message
1. 🟡 TOCTOU symlink race when inspecting JSONL transcripts (lstat then open)
Property Value
Severity Medium
CWE CWE-367
Location src/agents/command/attempt-execution.helpers.ts:30-36

Description

jsonlFileHasAssistantMessage() attempts to prevent symlink-following by checking lstat() and rejecting symlinks, but then later opens the path by name. This creates a time-of-check/time-of-use (TOCTOU) window where an attacker who can write to the directory containing the transcript can swap the file after the lstat() check (e.g., replace it with a symlink or different file) before open() occurs.

Impact depends on the privilege boundary:

  • If the process runs with higher privileges than the attacker (or reads in a shared directory), this can lead to unintended file reads (information disclosure).
  • Even without privilege escalation, it can be used for denial-of-service by swapping in special files or very large files.

Vulnerable flow:

  • Check: fs.lstat(filePath) and stat.isSymbolicLink()/stat.isFile()
  • Use: fs.open(filePath, "r") then stream/parse lines

Vulnerable code:

const stat = await fs.lstat(filePath);
if (stat.isSymbolicLink() || !stat.isFile()) {
  return false;
}

const fh = await fs.open(filePath, "r");

Recommendation

Eliminate the path-based TOCTOU by opening the file safely and validating the opened file descriptor, or by using OS flags that prevent symlink following.

Recommended approach (POSIX): open with O_NOFOLLOW and validate via fh.stat():

import fs from "node:fs/promises";
import { constants as FS } from "node:fs";

const fh = await fs.open(filePath, FS.O_RDONLY | FS.O_NOFOLLOW);
try {
  const st = await fh.stat();
  if (!st.isFile()) return false;// proceed to stream from fh
} finally {
  await fh.close();
}

If O_NOFOLLOW is unavailable on some platforms, use an fd-based validation pattern:

  • fh = open(filePath, "r")
  • st = fh.stat() and ensure it’s a regular file
  • optionally compare (dev, ino) from a fresh lstat() to fh.stat() to detect swaps.

This closes the race because the checks apply to the already-open file, not a name that can be swapped.

2. 🔵 Unbounded enumeration of ~/.claude/projects and parsing of arbitrary-size JSONL lines can cause local DoS
Property Value
Severity Low
CWE CWE-400
Location src/agents/command/attempt-execution.helpers.ts:84-100

Description

The new claudeCliSessionTranscriptHasContent helper can be forced to consume significant CPU/IO (and potentially memory) by scanning an unbounded number of project directories and parsing JSONL transcripts with no line-length/file-size safeguards.

  • Unbounded directory enumeration: fs.readdir(projectsDir, { withFileTypes: true }) reads all entries under ~/.claude/projects and then iterates every subdirectory, attempting to open and scan ${sessionId}.jsonl.
  • Large-line JSON parsing: jsonlFileHasAssistantMessage uses readline and JSON.parse(line) per line. Although there is a 500-record cap, a single line can be arbitrarily large; readline will buffer the entire line and JSON.parse will attempt to parse it, causing heavy memory/CPU usage.

This can be triggered whenever runAgentAttempt() runs with provider claude-cli and a stored cliSessionBinding.sessionId exists, because it calls claudeCliSessionTranscriptHasContent({ sessionId }) before deciding whether to resume. If an attacker can write to the executing user's ~/.claude/projects (or influence HOME/homeDir in the process environment), they can create many directories and/or craft very large JSONL lines to degrade performance (denial of service).

Recommendation

Add explicit resource limits for both directory scanning and JSONL parsing.

Suggested mitigations:

  1. Cap project directories scanned (and optionally short-circuit based on known workspace/project name if available):
const MAX_PROJECT_DIRS = 50;
const projectEntries = (await fs.readdir(projectsDir, { withFileTypes: true }))
  .filter((e) => e.isDirectory())
  .slice(0, MAX_PROJECT_DIRS);
  1. Skip oversized files before opening, and enforce a maximum line length / abort on very long lines:
const MAX_TRANSCRIPT_BYTES = 2 * 1024 * 1024; // 2MB
const st = await fs.stat(filePath);
if (st.size > MAX_TRANSCRIPT_BYTES) return false;

const MAX_LINE_CHARS = 64 * 1024;
for await (const line of rl) {
  if (line.length > MAX_LINE_CHARS) return false; // or break
  ...
}
  1. Consider adding a time budget / AbortSignal for the scan, so the caller can bound how long resume-checking can take.
3. 🔵 Log injection via unsanitized sessionKey/sessionId in warning message
Property Value
Severity Low
CWE CWE-117
Location src/agents/command/attempt-execution.ts:280-282

Description

A new warning log line interpolates params.sessionKey ?? params.sessionId without any sanitization/escaping.

  • sanitizeForLog() is correctly applied to providerOverride, but not to sessionKey/sessionId.
  • If an attacker can influence sessionKey or sessionId (e.g., via CLI/API inputs, stored session metadata, or crafted values containing newlines/control characters/ANSI escapes), they can:
    • forge log lines (CWE-117)
    • inject terminal escape sequences into operator terminals
    • corrupt downstream structured log parsing/pipelines
  • These identifiers may also be sensitive correlation identifiers; logging them verbatim increases the risk of identifier exposure.

Vulnerable code:

log.warn(
  `cli session reset: provider=${sanitizeForLog(params.providerOverride)} reason=transcript-missing sessionKey=${params.sessionKey ?? params.sessionId}`,
);

Recommendation

Apply sanitizeForLog() to all interpolated untrusted values and consider redacting session identifiers.

Minimal fix (sanitization):

const sessionKeyForLog = sanitizeForLog(String(params.sessionKey ?? params.sessionId ?? ""));
log.warn(
  `cli session reset: provider=${sanitizeForLog(params.providerOverride)} reason=transcript-missing sessionKey=${sessionKeyForLog}`,
);

If session IDs/keys are considered sensitive, prefer redaction/truncation:

const raw = String(params.sessionKey ?? params.sessionId ?? "");
const sessionKeyForLog = sanitizeForLog(raw).slice(0, 8) + "…";

Even better, use structured logging fields (if supported) rather than string interpolation.


Analyzed PR: #70317 at commit c216cf9

Last updated on: 2026-04-22T19:37:54Z

@openclaw-barnacle openclaw-barnacle Bot added the agents Agent runtime and tooling label Apr 22, 2026
@cursor

cursor Bot commented Apr 22, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Medium risk because it changes CLI session-resume and session-store clearing behavior, which could alter continuity of ongoing claude-cli conversations if transcript detection is wrong.

Overview
Stops claude-cli runs from resuming stored session ids unless a corresponding ~/.claude/projects/*/<sessionId>.jsonl transcript contains an assistant message; otherwise it logs reason=transcript-missing and clears the binding via clearCliSessionInStore before running without --resume.

Refactors transcript-content checks into a shared JSONL scanner with basic path/symlink/file-type safeguards, and adds regression tests covering missing transcripts, valid transcripts, and path-like session ids (plus HOME isolation in CLI attempt tests).

Reviewed by Cursor Bugbot for commit c216cf9. Bugbot is set up for automated code reviews on this repo. Configure here.

@openclaw-barnacle openclaw-barnacle Bot added size: M maintainer Maintainer-authored PR labels Apr 22, 2026
@steipete steipete force-pushed the fix/70177-claude-transcript-resume branch from 9230bd8 to c216cf9 Compare April 22, 2026 19:29
@steipete steipete merged commit f7a5257 into main Apr 22, 2026
11 of 12 checks passed
@steipete steipete deleted the fix/70177-claude-transcript-resume branch April 22, 2026 19:29
@steipete

Copy link
Copy Markdown
Contributor Author

Landed via rebase onto main.

  • Gate: pnpm test src/agents/command/attempt-execution.test.ts src/agents/command/attempt-execution.cli.test.ts src/agents/command/session-store.test.ts; pnpm check:changed; pnpm check; OPENCLAW_VITEST_MAX_WORKERS=1 pnpm check:changed --staged
  • Full suite note: pnpm test was attempted but blocked by unrelated failures: src/gateway/server-startup.test.ts timed out in skips static warmup for configured CLI backends, extensions/openai/provider-runtime.contract.test.ts failed on OpenAI Codex OAuth refresh, then vitest.contracts-channel-surface.config.ts hung after printing a pass summary and was stopped.
  • Source head: c216cf9
  • Merge commit: f7a5257

Fixes #70177.

@greptile-apps

greptile-apps Bot commented Apr 22, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes "phantom" Claude CLI resume attempts by verifying that a stored claudeCliSessionId has a readable transcript with assistant content in ~/.claude/projects/*/ before passing --resume to the CLI runner; if the transcript is missing, the binding is cleared upfront with reason=transcript-missing. The path-traversal guard in normalizeClaudeCliSessionId (rejecting /, \, and null bytes) and the lstat symlink check in the file reader are solid security hygiene. Tests cover the three key cases well.

Confidence Score: 5/5

Safe to merge; the logic is correct and well-tested, with no P0/P1 issues found.

The only finding is a P2 note about broadening the sessionFileHasContent match by removing the type === "message" guard — this is intentional to support the Claude CLI JSONL format and is unlikely to cause false positives in practice. All other changes are correct and backed by regression tests.

No files require special attention.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/command/attempt-execution.helpers.ts
Line: 54-56

Comment:
**`type` guard removed — `sessionFileHasContent` now matches more broadly**

The original check was `rec?.type === "message" && message?.role === "assistant"`. Removing the `type === "message"` guard unifies behaviour with the Claude CLI format (`type: "assistant"`), but it also changes what `sessionFileHasContent` accepts for OpenClaw's own session files. Any JSONL record that happens to carry a `message.role === "assistant"` field will now return `true`, even if `type` is something other than `"message"`.

This is intentional for the new Claude CLI path, but worth confirming that no caller of `sessionFileHasContent` relied on the stricter type filter to distinguish real assistant turns from, e.g., metadata or tool-result records.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: clear phantom Claude CLI resumes (#..." | Re-trigger Greptile

Comment on lines +54 to 56
if ((rec?.message as Record<string, unknown> | undefined)?.role === "assistant") {
return true;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 type guard removed — sessionFileHasContent now matches more broadly

The original check was rec?.type === "message" && message?.role === "assistant". Removing the type === "message" guard unifies behaviour with the Claude CLI format (type: "assistant"), but it also changes what sessionFileHasContent accepts for OpenClaw's own session files. Any JSONL record that happens to carry a message.role === "assistant" field will now return true, even if type is something other than "message".

This is intentional for the new Claude CLI path, but worth confirming that no caller of sessionFileHasContent relied on the stricter type filter to distinguish real assistant turns from, e.g., metadata or tool-result records.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/command/attempt-execution.helpers.ts
Line: 54-56

Comment:
**`type` guard removed — `sessionFileHasContent` now matches more broadly**

The original check was `rec?.type === "message" && message?.role === "assistant"`. Removing the `type === "message"` guard unifies behaviour with the Claude CLI format (`type: "assistant"`), but it also changes what `sessionFileHasContent` accepts for OpenClaw's own session files. Any JSONL record that happens to carry a `message.role === "assistant"` field will now return `true`, even if `type` is something other than `"message"`.

This is intentional for the new Claude CLI path, but worth confirming that no caller of `sessionFileHasContent` relied on the stricter type filter to distinguish real assistant turns from, e.g., metadata or tool-result records.

How can I resolve this? If you propose a fix, please make it concise.

medikoo pushed a commit to medikoo/openclaw that referenced this pull request Apr 24, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
zhonghe0615 pushed a commit to zhonghe0615/openclaw that referenced this pull request May 7, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
globalcaos pushed a commit to globalcaos/tinkerclaw that referenced this pull request May 13, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Verify Claude CLI session transcripts before reuse and clear phantom bindings with transcript-missing instead of passing stale --resume ids.\n\nFixes openclaw#70177.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR size: M

Projects

None yet

1 participant