Summary
Context compaction currently feels too abrupt in long-running sessions: when compaction triggers, a large middle chunk can be summarized in one pass, which may drop too much in-flight task context at once.
This is especially noticeable in messaging-platform workflows (e.g., Feishu group chats), where users cannot easily reset with `/new` and expect continuity during active tasks.
Problem
When context pressure reaches the compaction threshold, a single pass can summarize and remove a broad middle range of the conversation. Even with a summary handoff, this can cause:
- sudden loss of nuanced task state
- assistant responses that feel disconnected from just-executed work
- poor UX during active multi-step execution
Expected behavior
Compaction should be incremental (rolling), so that recent in-flight context is preserved longer, rather than truncating the whole middle range in one shot.
Proposal
Introduce rolling compaction:
- cap one compaction pass to a bounded number of oldest turns (instead of summarizing the entire middle in one pass)
- allow multiple smaller compaction passes over time
- keep the most recent active-task turns intact as long as possible
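To make the proposal concrete, here is a minimal sketch of a single bounded pass. It is not the project's actual compaction code: `Turn`, `rolling_compact`, `max_turns_per_pass`, `keep_recent`, and the `summarize` callback are all hypothetical names chosen for illustration, and the defaults are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    """Hypothetical stand-in for one conversation turn."""
    role: str
    text: str
    is_summary: bool = False


def rolling_compact(
    turns: List[Turn],
    summarize: Callable[[List[Turn]], str],
    max_turns_per_pass: int = 4,  # assumed cap on turns folded per pass
    keep_recent: int = 6,         # assumed size of the protected recent window
) -> List[Turn]:
    """Run one bounded compaction pass over the oldest turns."""
    # Only turns older than the protected recent window are eligible.
    eligible = max(0, len(turns) - keep_recent)
    batch_size = min(max_turns_per_pass, eligible)
    if batch_size < 2:
        # A single eligible turn (possibly an earlier summary) is left alone,
        # since replacing one turn with one summary would not shorten history.
        return list(turns)
    batch = turns[:batch_size]
    summary = Turn(role="system", text=summarize(batch), is_summary=True)
    # The new summary takes the place of the batch; later turns are untouched.
    return [summary] + turns[batch_size:]


# Usage with a placeholder summarizer (a real one would call the model).
history = [Turn("user", f"message {i}") for i in range(12)]
fake_summarize = lambda batch: f"summary of {len(batch)} turns"
after_one = rolling_compact(history, fake_summarize)
print(len(after_one))  # 9
```

Calling this once per compaction trigger, rather than summarizing everything between the head and the tail, means each pass folds at most `max_turns_per_pass` turns. An earlier summary at the front is simply folded into the next batch, so the history shrinks gradually while the last `keep_recent` turns stay verbatim.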
Repro pattern
- Run a long multi-turn session with frequent tool calls and long outputs.
- Reach ~85-95% compaction pressure.
- Observe that one compaction pass summarizes a large middle span.
- Continue the task and compare response continuity before and after compaction.
Reference implementation
I opened a draft PR from a fork with a rolling-compaction approach + test:
If helpful, I can split that into a focused PR scoped only to compaction behavior.