fix(agent): bound compaction summary and mechanically fold on failure #4138

Merged
esengine merged 1 commit into main-v2 from fix/compact-timeout-progress on Jun 12, 2026

@esengine

Problem

A user reported that /compact showed "compacting conversation…" and then nothing — no card, no error, indefinitely.

Root cause (traced through the code): after the CompactionStarted event, every fast failure already surfaces (compact failed: … for archive/stream/empty-output errors). The one gap was an unbounded wait:

  • The CLI calls ctrl.Compact(context.Background(), …) — no deadline anywhere down the chain.
  • summarize did for chunk := range ch, which only unblocks when the provider closes the channel — it never watched ctx.

So a stalled summarizer stream (open but never delivering/closing — a hung connection, or a reasoning model streaming thinking with no final text) pinned the "compacting…" placeholder forever: no CompactionDone, no error, no resolution.
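
To make the hang concrete, here is a minimal standalone sketch of the pre-fix loop shape. It is not the code from internal/agent/compact.go: drainByRange, returnsWithin, and the plain string chunk type are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// drainByRange mirrors the pre-fix loop shape: it returns only once ch is
// closed and has no way to observe a cancellation or a deadline.
func drainByRange(ch <-chan string) string {
	out := ""
	for chunk := range ch {
		out += chunk
	}
	return out
}

// returnsWithin reports whether drainByRange finished within d.
// When it does not, the draining goroutine stays blocked for the rest of
// the process, which is the same shape as the pinned placeholder.
func returnsWithin(ch <-chan string, d time.Duration) bool {
	done := make(chan struct{})
	go func() {
		drainByRange(ch)
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

// closedStreamReturns models a provider that sends one chunk and closes.
func closedStreamReturns() bool {
	ch := make(chan string, 1)
	ch <- "summary"
	close(ch)
	return returnsWithin(ch, time.Second)
}

// stalledStreamReturns models a stream that stays open and silent.
func stalledStreamReturns() bool {
	ch := make(chan string) // never written to, never closed
	return returnsWithin(ch, 100*time.Millisecond)
}

func main() {
	fmt.Println("closed stream returned:", closedStreamReturns())
	fmt.Println("stalled stream returned:", stalledStreamReturns())
}
```

The closed stream lets the loop return, while the stalled one keeps it blocked past the wait window. No amount of waiting changes that unless something other than the channel can wake the loop.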

Fix

All in internal/agent/compact.go; the fold/keep algorithm is unchanged.

  • 90s timeout per summary call (summaryTimeout).
  • select on ctx.Done() in the stream loop, so a stalled stream unblocks even if the provider never closes the channel.
  • One retry on a non-timeout failure (transient stream drop / rate blip).
  • Mechanical fold fallback when the summarizer is genuinely unreachable: the foldable region is already archived, so it is replaced with a deterministic marker. /compact therefore always frees context instead of aborting with the window still full, and auto-compaction cannot loop on a window that never shrinks. Verbatim user turns are untouched.
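
The four points above can be sketched together as follows. This is an illustration under stated assumptions, not the merged code: only summaryTimeout and its 90s value come from this PR, while streamFunc, summarizeOrFold, foldMarker, demo, the marker text, and the string chunk type are hypothetical. The sketch also assumes that cancellations, like timeouts, skip the retry.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"time"
)

// summaryTimeout is the 90s per-call bound named in this PR.
const summaryTimeout = 90 * time.Second

// streamFunc is a hypothetical stand-in for the provider's streaming call.
type streamFunc func(ctx context.Context) (<-chan string, error)

// summarize drains one stream while also watching ctx, so a stream that
// stays open but silent unblocks when the deadline fires.
func summarize(ctx context.Context, open streamFunc) (string, error) {
	ch, err := open(ctx)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	for {
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case chunk, ok := <-ch:
			if !ok {
				return sb.String(), nil
			}
			sb.WriteString(chunk)
		}
	}
}

// foldMarker is a hypothetical deterministic replacement for the folded region.
func foldMarker(folded int) string {
	return fmt.Sprintf("[mechanical fold: %d archived messages, summary unavailable]", folded)
}

// summarizeOrFold bounds each call, retries one non-timeout failure, and
// falls back to the marker instead of returning an error.
func summarizeOrFold(ctx context.Context, open streamFunc, timeout time.Duration, folded int) string {
	for attempt := 0; attempt < 2; attempt++ {
		callCtx, cancel := context.WithTimeout(ctx, timeout)
		s, err := summarize(callCtx, open)
		cancel()
		if err == nil {
			return s
		}
		if errors.Is(err, context.DeadlineExceeded) || errors.Is(err, context.Canceled) {
			break // assumption: timeouts and cancellations are not retried
		}
	}
	return foldMarker(folded)
}

// demo runs summarizeOrFold against a fake provider that fails `failures`
// times and then either streams "ok" or stalls with the channel left open.
func demo(failures int, stall bool) (result string, calls int) {
	open := func(ctx context.Context) (<-chan string, error) {
		calls++
		if calls <= failures {
			return nil, errors.New("transient stream drop")
		}
		ch := make(chan string, 1)
		if !stall {
			ch <- "ok"
			close(ch)
		}
		return ch, nil
	}
	// A short timeout keeps the demo fast; real use would pass summaryTimeout.
	result = summarizeOrFold(context.Background(), open, 50*time.Millisecond, 12)
	return result, calls
}

func main() {
	r, calls := demo(1, false)
	fmt.Printf("one transient failure: %q after %d calls\n", r, calls)
	r, calls = demo(0, true)
	fmt.Printf("stalled stream: %q after %d calls\n", r, calls)
}
```

In this sketch the fallback returns a string rather than an error, so the caller always has something to put in place of the folded region. That is the property that lets /compact resolve every time.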

Cache safety

The fold/keep logic is unchanged; the fallback produces a digest with the same structure as a normal compaction (head + kept-verbatim + digest + tail), just with deterministic content. No new prompt-history cache hazard beyond what compaction already does.

Tests

  • TestCompactFallsBackToMechanicalFoldWhenSummaryFails — summarizer errors → session still compacts, CompactionDone carries a mechanical-fold summary.
  • TestSummarizeRespectsContextCancel — a stalled (never-closing) stream returns on cancellation rather than hanging.
  • Full internal/agent suite green.
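
For readers who want to reproduce the second test's idea outside the repository, the check can be sketched as below. The names drain and checkStalledStreamUnblocks are hypothetical and this is not the body of TestSummarizeRespectsContextCancel. In a _test.go file the returned error would typically be passed to t.Fatal instead of being printed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// drain is a hypothetical stand-in for the stream loop under test: it
// consumes chunks until the channel closes or ctx is done.
func drain(ctx context.Context, ch <-chan string) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case _, ok := <-ch:
			if !ok {
				return nil
			}
		}
	}
}

// checkStalledStreamUnblocks returns nil only if drain comes back with
// context.Canceled shortly after cancel, even though the stream never closes.
func checkStalledStreamUnblocks() error {
	stalled := make(chan string) // never written to, never closed
	ctx, cancel := context.WithCancel(context.Background())
	result := make(chan error, 1)
	go func() { result <- drain(ctx, stalled) }()

	cancel()
	select {
	case err := <-result:
		if !errors.Is(err, context.Canceled) {
			return fmt.Errorf("want context.Canceled, got %v", err)
		}
		return nil
	case <-time.After(2 * time.Second):
		return errors.New("drain still blocked after cancel")
	}
}

func main() {
	if err := checkStalledStreamUnblocks(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

The outer time.After guard matters: without it, a regression back to a plain range loop would hang the test instead of failing it.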

Deferred

Live progress streaming ("thinking…" during a slow summary) touches the CLI + desktop renderers and is a separate change; the 90s bound + always-resolving fallback already removes the "infinite nothing" symptom.

A manual /compact (or auto-compaction) could show "compacting…" then hang
forever: the call passed context.Background() with no deadline and summarize
ranged over the provider channel without watching ctx, so a stalled stream
pinned the placeholder with no card and no error. Fast failures already
surfaced; an unbounded wait did not.

Bound each summary call to 90s, select on ctx.Done so a stalled stream
unblocks, retry one transient failure, and — when the summarizer is genuinely
unreachable — fold mechanically (the region is already archived) to a
deterministic marker so /compact always frees context instead of aborting on a
still-full window. Verbatim user turns stay untouched and the fold/keep
algorithm is unchanged, so cache behavior matches a normal compaction.
@esengine requested a review from SivanCola as a code owner June 12, 2026 06:54
@github-actions bot added labels on Jun 12, 2026:
  • v2: Go rewrite (1.x), main-v2 branch, active development
  • agent: Core agent loop (internal/agent, internal/control)
@esengine merged commit dc1f28c into main-v2 on Jun 12, 2026
14 checks passed
@esengine deleted the fix/compact-timeout-progress branch June 12, 2026 06:58