fix(agents): seed claude-cli fallback prompts with prior-session context (#69973)#72069
Conversation
Greptile SummaryThis PR seeds fallback retry prompts with prior claude-cli session context by parsing Claude Code's local JSONL (summary + post-boundary turns) and prepending a labeled prelude to the fallback candidate's prompt, mirroring the resume-after-compaction shape Claude Code uses itself. The implementation is clean, well-tested, and the integration guard ( Confidence Score: 4/5Safe to merge; only P2 edge-case findings, no critical logic or security issues. All findings are P2. The main concern is that a src/gateway/cli-session-history.claude.ts — the summary/compact_boundary interaction in
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: add644115b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
Addressed the review on 9987e7797f:
Verified: 33 attempt-execution + 14 cli-session-history tests; broader |
|
Codex automated review: keeping this open. Keep this PR open. Current main still does not implement the PR’s intended cross-provider fallback seed from Claude CLI session history, while the live PR remains open, unmerged, focused on the linked fallback handoff bug, and now includes review-followup commits for source-provider gating and compact-boundary pairing. Best possible solution: Keep this PR open for maintainer review. The best path is to land this PR or an equivalent implementation that seeds a bounded Claude CLI summary/recent-turn prelude only when the fallback chain started on What I checked:
Remaining risk / open question:
Codex Review notes: model gpt-5.5, reasoning high; reviewed against fb40ed99a7f0. |
5510d33 to
12d8e44
Compare
26dc4ad to
e06f042
Compare
e06f042 to
23d1e50
Compare
…ext (openclaw#69973) When a claude-cli attempt failed with a fallbackable error (e.g. a 402 billing limit), the next candidate -- typically a non-CLI provider -- ran with no prior conversation context. Claude Code keeps its own JSONL session under ~/.claude/projects/, but the fallback runner only sees what OpenClaw assembles from its own transcript, which is empty for claude-cli sessions. The fallback model therefore behaved as if the conversation just started, even though Claude later resumed fine. Resolution mirrors what Claude Code itself does on resume after compaction: prefer the explicit `/compact` summary, then append the most recent post-boundary turns up to a char budget. Concretely: - `readClaudeCliFallbackSeed` (gateway): walks the Claude JSONL with awareness of `type: "summary"` and `type: "system", subtype: "compact_boundary"` entries. Pre-boundary turns are dropped (they are represented by the summary); post-boundary turns become the recent-window. Multiple compactions are handled by preferring the latest summary. Path safety reuses the existing `resolveClaudeCliSessionFilePath` validation. - `formatClaudeCliFallbackPrelude` / `buildClaudeCliFallbackContext\ Prelude` (agents helpers): format the harvested seed into a labeled prelude. Tool blocks are coalesced to compact "(tool call: name)" / "(tool result: …)" hints to keep the prompt budget honest. Newest turns are kept first when truncating; the summary is clearly labeled "(truncated)" if it overflows. - `resolveFallbackRetryPrompt`: gains an optional `priorContextPrelude` that prepends before the existing retry marker. Empty/whitespace preludes are ignored; first-attempt prompts are unchanged. - `runAgentAttempt`: builds the prelude when `isFallbackRetry === true` AND the new candidate is non-claude-cli AND a Claude-cli session binding is present. Same-provider fallbacks (claude-cli to claude-cli) are unaffected because Claude's own --resume still works. Verified the new tests (12 in cli-session-history, 12 added to attempt-execution) catch the regression: removing the prelude prepend in resolveFallbackRetryPrompt makes both new prelude cases fail, restoring the original cold-start behavior. References: - https://code.claude.com/docs/en/how-claude-code-works - "Inside Claude Code: The Session File Format" https://databunny.medium.com/inside-claude-code-the-session-file-format-and-how-to-inspect-it-b9998e66d56b
…formatter Local default oxlint did not run --type-aware so the warning was missed on the initial commit; CI surfaced it via check-lint. Hoist the heading into a named const so its length is read directly without the assertion.
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
The required-typed param introduced in 9987e7797f broke attempt-execution.cli.test.ts and auth-profile-runtime-contract.test.ts which construct runAgentAttempt params without an originalProvider field. Make it optional and explicitly require the typeof check before passing to isClaudeCliProvider so a missing field correctly skips the seed (defensive default for fallback paths that didn't plumb the original provider through, no-op for non-fallback paths).
23d1e50 to
aac8e06
Compare
obviyus
left a comment
There was a problem hiding this comment.
Verified the claude-cli fallback handoff: when the primary claude-cli attempt fails, the fallback prompt now carries Claude Code session context instead of starting cold.
Maintainer follow-up: made the source provider explicit, trimmed review comments, added contiguous-window coverage, and added the Unreleased changelog entry.
Local gate: pnpm test src/agents/command/attempt-execution.test.ts src/agents/command/attempt-execution.cli.test.ts src/gateway/cli-session-history.test.ts src/agents/auth-profile-runtime-contract.test.ts; pnpm tsgo:core; pnpm tsgo:test:src.
…ndary Addresses review on #72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
…ndary Addresses review on openclaw#72069: - Codex P1 ("Gate Claude prelude seeding by source provider"): the guard checked the *current* fallback candidate but not the failed attempt. A session that still carried a stale cliSessionBindings["claude-cli"] from an unrelated past run would inject Claude transcript context into a fallback chain that started on a different provider (e.g. openai -> openai-codex), leaking irrelevant prior conversation. Plumb `originalProvider` (the user-requested provider for the chain) through to runAgentAttempt and require `isClaudeCliProvider(originalProvider)` before reading Claude history. - Codex P2 ("Prefer latest compact boundary when summary is missing"): the resolver always preferred the most recent explicit summary, so a later compaction without its own summary entry (rare crash case) paired stale summary text with post-latest-boundary turns. Restructure readClaudeCliFallbackSeed to queue summaries into pendingSummary and flush each boundary's pair atomically. A boundary with no preceding summary now correctly falls back to the boundary's own content rather than serving an older summary alongside fresh turns. - Greptile P2 (newest-first break vs sparse coverage): the formatFallbackTurns walk intentionally stops on the first oversized turn so the prelude stays a contiguous "what was happening just before the failure" window. Document the design choice inline so a future maintainer doesn't reflexively change it to skip-and-continue. Tests: - New gateway cases for the boundary-without-summary edge case and for trailing summaries written without a paired boundary. - existing 33 attempt-execution + 14 cli-session-history tests still pass; broader src/agents/command suite stays green (63/63).
Closes #69973.
Why
When a
claude-cliattempt fails with a fallbackable error (e.g. 402 billing limit), the next candidate runs cold. Claude Code keeps full session history in~/.claude/projects/<id>/<sessionId>.jsonl, but the fallback runner only sees what OpenClaw assembles from its own transcript — and that transcript is empty for claude-cli sessions because OpenClaw doesn't mirror Claude's local JSONL turns into its own session file.@steipete's triage: "the next candidate only sees what OpenClaw assembles from its own transcript/context. If the OpenClaw transcript is truncated or missing the prior CLI-backed turns, the fallback candidate starts cold while Claude later resumes fine. Concrete fix: when a claude-cli attempt fails with a fallbackable error, seed the fallback prompt from the OpenClaw transcript/compacted summary before trying the non-CLI candidate."
Pattern
Mirrors what Claude Code itself does on resume after
/compact: prefer the explicit summary, then append the most recent post-boundary turns up to a char budget. Inferred from the upstream JSONL shape (type: "summary",type: "system" subtype: "compact_boundary"withcompactMetadata) and confirmed against the published session-management behavior. References:Last-N raw replay was rejected as the wrong shape: it diverges from how Claude Code itself replays after compaction, misses context if
/compactalready happened (the older turns are gone, only the summary remains), and chews more prompt budget than the equivalent summary.What changed
src/gateway/cli-session-history.claude.tsreadClaudeCliFallbackSeedwalks the JSONL with awareness oftype: "summary"andcompact_boundaryentries. Pre-boundary turns are dropped (they're now in the summary). Multiple compactions: latest summary wins.src/gateway/cli-session-history.tssrc/agents/command/attempt-execution.helpers.tsformatClaudeCliFallbackPrelude/buildClaudeCliFallbackContextPrelude. Tool blocks coalesce to compact(tool call: …)/(tool result: …)hints. Newest turns kept when truncating. Summary clearly labeled(truncated)if it overflows.resolveFallbackRetryPromptgains an optionalpriorContextPrelude.src/agents/command/attempt-execution.tsrunAgentAttemptbuilds the prelude whenisFallbackRetry && providerOverride !== claude-cli && cliSessionBindings.claude-cli.sessionId is set. Same-provider fallbacks (claude-cli → claude-cli) are unaffected — Claude's own--resumealready works there.Verified
src/agents/command/attempt-execution.test.ts(12 added)src/gateway/cli-session-history.test.ts(7 added)pnpm tsgo:core+pnpm tsgo:core:testcleanpnpm format+pnpm lintclean on touched filesresolveFallbackRetryPromptfails both new prelude cases, restoring the cold-start behavior — confirming the test pins the behavior (not just the helper signature).Risks flagged
augmentChatHistoryWithCliSessionImports— we already read user's Claude JSONL for display. No new ground.resolveClaudeCliSessionFilePathvalidation, which already rejects..-style escapes./compactproduced a malformed summary (see #46602), we inherit that. Mitigation: the prelude is clearly labeled "Prior session context (from claude-cli)" so the fallback model treats it as a hint, not authoritative state.…so the model never sees a mid-summary fragment.Out of scope
openai-codexas primary). Claude is the driver of the user-pain reports here. The same mechanism could be extended provider-by-provider as those CLIs expose their own structured transcripts.--resumeand a session id of its own; cross-pollination there is a separate problem.Files
src/gateway/cli-session-history.claude.ts:302+— parser extensionsrc/gateway/cli-session-history.ts:11+— re-exportsrc/agents/command/attempt-execution.helpers.ts:104+— formatter + builder + extendedresolveFallbackRetryPromptsrc/agents/command/attempt-execution.ts:265+— integration inrunAgentAttempt