fix(cli-runner): keep recent tail when reseed history exceeds maxHistoryChars (#83117)

adele-with-a-b · steipete · web-flow · commit 5c4a733912a8 · 2026-05-24T00:07:11.000+01:00
* fix(cli-runner): keep recent tail when reseed history exceeds maxHistoryChars

  `buildCliSessionHistoryPrompt` was prefix-slicing the rendered history, dropping the most recent assistant turns from the reseed prompt. After #80934 made the Claude-CLI reseed default-on, every Claude-CLI user is exposed to this on session_expired when the rendered transcript exceeds 12288 chars. The truncation marker landed mid-word in real reproductions.

  Fix:
  - Tail-slice (keep the recent suffix, drop the older prefix)
  - Pin the compaction summary as a prefix when present, only cap the post-summary transcript (loadCliSessionReseedMessages deliberately places the summary first)
  - When the summary alone exceeds maxHistoryChars, head-slice the summary itself to honor the cap; drop the post-summary tail in that case
  - Move the truncation marker to the lead since what follows is the recent tail, not what was dropped

  Closes #83157

* fix(cli-runner): retain recent tail with oversize summaries

* fix(cli-runner): cap summary block plus marker against maxHistoryChars

  ClawSweeper P2 on #83117 flagged that when `summaryRendered.length` is less than `maxHistoryChars` but `summaryBlock.length` (summary + `\n\n` separator) meets or exceeds it, the `remainingBudget <= 0` arm of `buildCliSessionHistoryPrompt` appends the truncation marker after the already-full summary block. A 199-char rendered summary under a 200-char cap produced a 257-char history block, defeating the cap that prevents reseeding fresh CLI sessions with unexpectedly huge prompts.

  Fix the budget edge by truncating the summary in this branch as well so `summary + separator + marker` stays within `maxHistoryChars`. The tail still drops (the summary alone consumes the budget) and the marker still leads its own line so the prompt announces what was discarded. Mirrors the existing oversize-summary branch's pattern of head-slicing the summary against an explicit budget that reserves marker + separator.

  Add a focused regression in `session-history.test.ts` covering exactly the gap the finding called out: `summaryRendered.length < maxHistoryChars` with a non-empty post-summary tail. Asserts the rendered history block stays within `maxHistoryChars` and the truncation marker is present.

* fix(cli-runner): keep tail for near-cap summaries

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
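
The budget arithmetic and the `slice(-0)` quirk mentioned in this change can be reproduced in isolation. The following is a minimal sketch, not the code from the diff: `keepRecentTail` and the constant names are hypothetical, and the helper mirrors only the no-summary branch, where the marker line is not counted against the cap.

```typescript
// Illustrative sketch only; `keepRecentTail` is a hypothetical helper,
// not the function changed by this commit.
const MARKER = "[OpenClaw reseed history truncated; older turns dropped]";

function keepRecentTail(rendered: string, maxChars: number): string {
  if (rendered.length <= maxChars) {
    return rendered;
  }
  // Guard the zero budget explicitly: `slice(-0)` is `slice(0)` and would
  // return the whole string instead of an empty tail.
  const tail = maxChars > 0 ? rendered.slice(-maxChars).trimStart() : "";
  // Lead with the marker, since what follows is the retained recent tail.
  return tail ? `${MARKER}\n${tail}` : MARKER;
}

// The JS quirk: a negative-zero start index keeps everything.
const sliceQuirk = "abcdef".slice(-0);

// The near-cap edge: a 199-char rendered summary under a 200-char cap.
const cap = 200;
const summaryRendered = "Compaction summary: " + "S".repeat(179);
const summaryBlock = summaryRendered + "\n\n";
const remainingBudget = cap - summaryBlock.length;
// Appending the marker without truncating the summary overshoots the cap.
const uncapped = summaryBlock + MARKER;

const recent = keepRecentTail("x".repeat(100) + "\n\nFINAL", 20);

console.log(sliceQuirk.length, remainingBudget, uncapped.length);
console.log(recent.includes("FINAL"));
```

The explicit `maxChars > 0` guard is the design point: any budget derived by subtraction can reach zero, and a negative-index slice then silently keeps the entire string.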
diff --git a/src/agents/cli-runner/session-history.test.ts b/src/agents/cli-runner/session-history.test.ts
@@ -577,13 +577,163 @@ describe("buildCliSessionHistoryPrompt", () => {
 
   it("caps rendered reseed history before adding the next user message", () => {
     const prompt = buildCliSessionHistoryPrompt({
-      messages: [{ role: "compactionSummary", summary: "x".repeat(100) }],
+      messages: [
+        { role: "user", content: "x".repeat(100) },
+        { role: "assistant", content: "y".repeat(100) },
+      ],
       prompt: "current ask must survive",
       maxHistoryChars: 20,
     });
 
-    expect(prompt).toContain("[OpenClaw reseed history truncated]");
+    expect(prompt).toContain("[OpenClaw reseed history truncated; older turns dropped]");
     expect(prompt).toContain("<next_user_message>\ncurrent ask must survive\n</next_user_message>");
+    // Older 100-char prefix must be dropped by the tail slice; the
+    // post-cap rendered tail is shorter than the dropped prefix.
     expect(prompt).not.toContain("x".repeat(80));
   });
+
+  it("keeps the most recent turns when rendered history exceeds the cap", () => {
+    // Older turns plus a final marker turn whose content is exactly what a
+    // head-slice would drop first. Asserting the marker survives in the
+    // rendered prompt locks in tail-slice semantics: a session-recovery
+    // feature must keep the latest context, not the oldest.
+    const prompt = buildCliSessionHistoryPrompt({
+      messages: [
+        { role: "user", content: "x".repeat(8000) },
+        { role: "assistant", content: "y".repeat(8000) },
+        { role: "user", content: "FINAL_USER_MARKER" },
+        { role: "assistant", content: "FINAL_ASSISTANT_MARKER" },
+      ],
+      prompt: "next ask",
+    });
+
+    expect(prompt).toBeDefined();
+    expect(prompt).toContain("FINAL_USER_MARKER");
+    expect(prompt).toContain("FINAL_ASSISTANT_MARKER");
+    expect(prompt).toContain("[OpenClaw reseed history truncated; older turns dropped]");
+    // The oldest 8000-char block must have been dropped — a head-slice
+    // would have kept it instead of the recent tail.
+    expect(prompt).not.toContain("x".repeat(8000));
+    expect(prompt).toContain("<next_user_message>\nnext ask\n</next_user_message>");
+  });
+
+  it("preserves the compaction summary when the post-summary transcript exceeds the cap", () => {
+    // loadCliSessionReseedMessages places a compactionSummary entry first
+    // so the compacted prior context survives reseed. A blind tail slice
+    // of the joined history would drop that summary whenever the
+    // post-summary tail alone exceeds the cap. The structure-aware
+    // truncation pins the summary as a prefix and caps only the tail.
+    const prompt = buildCliSessionHistoryPrompt({
+      messages: [
+        { role: "compactionSummary", summary: "COMPACTION_SUMMARY_MARKER pinned context" },
+        { role: "user", content: "z".repeat(8000) },
+        { role: "assistant", content: "w".repeat(8000) },
+        { role: "user", content: "POST_SUMMARY_FINAL_USER" },
+        { role: "assistant", content: "POST_SUMMARY_FINAL_ASSISTANT" },
+      ],
+      prompt: "next ask",
+    });
+
+    expect(prompt).toBeDefined();
+    // Compaction summary must be pinned as a prefix, not sliced away.
+    expect(prompt).toContain("Compaction summary: COMPACTION_SUMMARY_MARKER pinned context");
+    // Recent tail still preserved within the post-summary budget.
+    expect(prompt).toContain("POST_SUMMARY_FINAL_USER");
+    expect(prompt).toContain("POST_SUMMARY_FINAL_ASSISTANT");
+    expect(prompt).toContain("[OpenClaw reseed history truncated; older turns dropped]");
+    // Head of post-summary tail (oldest 8000-char `z` block) must be
+    // dropped so the cap is honored.
+    expect(prompt).not.toContain("z".repeat(8000));
+    expect(prompt).toContain("<next_user_message>\nnext ask\n</next_user_message>");
+  });
+
+  it("caps oversize compaction summary while preserving recent post-summary tail", () => {
+    // Two regressions covered here:
+    // 1. `tailRaw.slice(-0)` would return the entire tail (JS quirk:
+    //    `String.prototype.slice(-0) === slice(0)`), defeating the cap when
+    //    the summary block consumes the budget.
+    // 2. Pinning the full summary as-is when the summary itself exceeds
+    //    `maxHistoryChars` would blow past the cap that prevents
+    //    reseeding fresh CLI sessions with unexpectedly huge prompts.
+    //    The summary must itself be truncated to fit the budget while still
+    //    preserving the recent post-summary exact turns.
+    const summaryText = "OVERSIZE_SUMMARY_MARKER ".repeat(50).trim();
+    const maxHistoryChars = 200;
+    const prompt = buildCliSessionHistoryPrompt({
+      messages: [
+        { role: "compactionSummary", summary: summaryText },
+        { role: "user", content: "POST_SUMMARY_USER_DROPPED" },
+        { role: "assistant", content: "POST_SUMMARY_ASSISTANT_DROPPED" },
+      ],
+      prompt: "next ask",
+      // Cap well below the rendered summary block so the summary itself
+      // must be truncated and the tail budget would naively be 0.
+      maxHistoryChars,
+    });
+
+    expect(prompt).toBeDefined();
+    // The truncated summary still leads with recognizable load-bearing
+    // text — head-slicing preserves the orientation/intro of the summary.
+    expect(prompt).toContain("OVERSIZE_SUMMARY_MARKER");
+    expect(prompt).toContain("Compaction summary:");
+    // The leading truncation marker is present so the prompt announces
+    // what was discarded.
+    expect(prompt).toContain("[OpenClaw reseed history truncated; older turns dropped]");
+    // The cap is honored: the body of the rendered <conversation_history>
+    // block (wrapper tags excluded) must not exceed `maxHistoryChars`.
+    const historyMatch = prompt?.match(
+      /<conversation_history>\n([\s\S]*?)\n<\/conversation_history>/,
+    );
+    expect(historyMatch).not.toBeNull();
+    const renderedHistory = historyMatch?.[1] ?? "";
+    expect(renderedHistory.length).toBeLessThanOrEqual(maxHistoryChars);
+    // The full untruncated summary must NOT appear — that would defeat
+    // the cap.
+    expect(prompt).not.toContain(summaryText);
+    // Post-summary exact turns are newer than the summary and must still
+    // survive inside the reserved tail budget.
+    expect(prompt).toContain("POST_SUMMARY_USER_DROPPED");
+    expect(prompt).toContain("POST_SUMMARY_ASSISTANT_DROPPED");
+    expect(prompt).toContain("<next_user_message>\nnext ask\n</next_user_message>");
+  });
+
+  it("honors the cap when the summary block plus marker crosses it", () => {
+    // Edge case: `summaryRendered.length < maxHistoryChars` (the gate that
+    // routes to the oversize-summary branch is not taken) BUT
+    // `summaryBlock.length >= maxHistoryChars` once the `\n\n` separator
+    // is appended, making `remainingBudget <= 0`. Without summary
+    // truncation in that branch, the rendered history block is
+    // `summary + separator + marker` — well over `maxHistoryChars`. A
+    // 199-char rendered summary under a 200-char cap would otherwise
+    // produce a 257-char history block.
+    const maxHistoryChars = 200;
+    // `renderHistoryMessage` prefixes "Compaction summary: " (20 chars)
+    // before the summary text, so a 179-char summary renders to 199 chars
+    // — strictly less than the cap, but `summaryBlock = rendered + "\n\n"`
+    // is 201 chars and `remainingBudget` is negative.
+    const summaryPrefix = "Compaction summary: ";
+    const summaryText = "S".repeat(maxHistoryChars - 1 - summaryPrefix.length);
+    const prompt = buildCliSessionHistoryPrompt({
+      messages: [
+        { role: "compactionSummary", summary: summaryText },
+        { role: "user", content: "POST_SUMMARY_TAIL_USER" },
+        { role: "assistant", content: "POST_SUMMARY_TAIL_ASSISTANT" },
+      ],
+      prompt: "next ask",
+      maxHistoryChars,
+    });
+
+    expect(prompt).toBeDefined();
+    const historyMatch = prompt?.match(
+      /<conversation_history>\n([\s\S]*?)\n<\/conversation_history>/,
+    );
+    expect(historyMatch).not.toBeNull();
+    const renderedHistory = historyMatch?.[1] ?? "";
+    expect(renderedHistory.length).toBeLessThanOrEqual(maxHistoryChars);
+    // Marker is still present so the prompt announces what was discarded.
+    expect(prompt).toContain("[OpenClaw reseed history truncated; older turns dropped]");
+    // Near-cap summaries still reserve room for the newest exact turns.
+    expect(prompt).toContain("POST_SUMMARY_TAIL_USER");
+    expect(prompt).toContain("POST_SUMMARY_TAIL_ASSISTANT");
+  });
 });
diff --git a/src/agents/cli-runner/session-history.ts b/src/agents/cli-runner/session-history.ts
@@ -114,41 +114,109 @@ function loadContextEngineMessagesFromEntries(entries: unknown[]): AgentMessage[
   });
 }
 
+function renderHistoryMessage(message: unknown): string | undefined {
+  if (!message || typeof message !== "object") {
+    return undefined;
+  }
+  const entry = message as HistoryMessage;
+  const role =
+    entry.role === "assistant"
+      ? "Assistant"
+      : entry.role === "user"
+        ? "User"
+        : entry.role === "compactionSummary"
+          ? "Compaction summary"
+          : undefined;
+  if (!role) {
+    return undefined;
+  }
+  const text =
+    entry.role === "compactionSummary" && typeof entry.summary === "string"
+      ? entry.summary.trim()
+      : coerceHistoryText(entry.content);
+  return text ? `${role}: ${text}` : undefined;
+}
+
 export function buildCliSessionHistoryPrompt(params: {
   messages: unknown[];
   prompt: string;
   maxHistoryChars?: number;
 }): string | undefined {
   const maxHistoryChars = params.maxHistoryChars ?? MAX_CLI_SESSION_RESEED_HISTORY_CHARS;
-  const renderedHistoryRaw = params.messages
+
+  // loadCliSessionReseedMessages deliberately places a `compactionSummary`
+  // entry first when the session was compacted, so the compacted prior
+  // context survives reseed. Pin that summary as a prefix and only
+  // tail-truncate the post-summary transcript — a blind tail-slice of the
+  // joined history would drop the summary whenever the post-summary tail
+  // alone exceeds the cap.
+  const firstEntry = params.messages[0];
+  const firstIsCompaction =
+    !!firstEntry &&
+    typeof firstEntry === "object" &&
+    (firstEntry as HistoryMessage).role === "compactionSummary";
+  const summaryRendered = firstIsCompaction ? renderHistoryMessage(firstEntry) : undefined;
+  const tailMessages = firstIsCompaction ? params.messages.slice(1) : params.messages;
+
+  const tailRaw = tailMessages
     .flatMap((message) => {
-      if (!message || typeof message !== "object") {
-        return [];
-      }
-      const entry = message as HistoryMessage;
-      const role =
-        entry.role === "assistant"
-          ? "Assistant"
-          : entry.role === "user"
-            ? "User"
-            : entry.role === "compactionSummary"
-              ? "Compaction summary"
-              : undefined;
-      if (!role) {
-        return [];
-      }
-      const text =
-        entry.role === "compactionSummary" && typeof entry.summary === "string"
-          ? entry.summary.trim()
-          : coerceHistoryText(entry.content);
-      return text ? [`${role}: ${text}`] : [];
+      const rendered = renderHistoryMessage(message);
+      return rendered ? [rendered] : [];
     })
     .join("\n\n")
     .trim();
-  const renderedHistory =
-    renderedHistoryRaw.length > maxHistoryChars
-      ? `${renderedHistoryRaw.slice(0, maxHistoryChars).trimEnd()}\n[OpenClaw reseed history truncated]`
-      : renderedHistoryRaw;
+
+  const truncationMarker = "[OpenClaw reseed history truncated; older turns dropped]";
+  const renderTruncatedSummaryWithTail = (renderedSummary: string): string => {
+    const tailBudget =
+      tailRaw.length > 0 ? Math.min(tailRaw.length, Math.floor(maxHistoryChars / 2)) : 0;
+    const separatorBudget = tailBudget > 0 ? 2 : 1;
+    const summaryBudget = Math.max(
+      0,
+      maxHistoryChars - truncationMarker.length - separatorBudget - tailBudget,
+    );
+    const summaryTruncated = renderedSummary.slice(0, summaryBudget).trimEnd();
+    const tailTruncated = tailBudget > 0 ? tailRaw.slice(-tailBudget).trimStart() : "";
+    return [truncationMarker, summaryTruncated, tailTruncated].filter(Boolean).join("\n");
+  };
+
+  let renderedHistory: string;
+  if (summaryRendered) {
+    // Reserve the summary from the budget so the post-summary tail cap is
+    // the remaining headroom. If the summary alone meets or exceeds the
+    // cap, the summary itself must be truncated — pinning a summary that
+    // blows past `maxHistoryChars` would defeat the cap that prevents
+    // reseeding fresh CLI sessions with unexpectedly huge prompts.
+    if (summaryRendered.length >= maxHistoryChars) {
+      // Truncate the summary to fit the budget (less the marker line),
+      // keeping the head. Still reserve budget for the post-summary tail so
+      // recent exact turns survive even when the summary itself is oversize.
+      renderedHistory = renderTruncatedSummaryWithTail(summaryRendered);
+    } else if (tailRaw.length === 0) {
+      renderedHistory = summaryRendered;
+    } else {
+      const summaryBlock = `${summaryRendered}\n\n`;
+      const remainingBudget = maxHistoryChars - summaryBlock.length;
+      if (remainingBudget <= 0) {
+        // The summary plus separator already consumes the cap. Reuse the
+        // oversize-summary path so recent post-summary turns still get
+        // reserved tail budget instead of being dropped wholesale.
+        renderedHistory = renderTruncatedSummaryWithTail(summaryRendered);
+      } else if (tailRaw.length > remainingBudget) {
+        renderedHistory = `${summaryBlock}${truncationMarker}\n${tailRaw.slice(-remainingBudget).trimStart()}`;
+      } else {
+        renderedHistory = `${summaryBlock}${tailRaw}`;
+      }
+    }
+  } else {
+    // No compaction summary to pin: tail-slice the full rendered history
+    // and lead with the marker so it correctly describes what follows
+    // (older turns dropped, recent tail retained).
+    renderedHistory =
+      tailRaw.length > maxHistoryChars
+        ? `${truncationMarker}\n${tailRaw.slice(-maxHistoryChars).trimStart()}`
+        : tailRaw;
+  }
 
   if (!renderedHistory) {
     return undefined;