Research Finding
Anthropic introduced a server-side Compaction API (launched Feb 5, 2026, beta header compact-2026-01-12) for Opus 4.6 and Sonnet 4.6. When enabled, the API automatically summarizes conversation history when input tokens approach a threshold, inserting a compaction content block in the response. The client does not need to maintain summarization logic — just append the response as usual and the API handles history pruning on the next turn.
Applicability
Zeph currently implements client-side chunked compaction in zeph-memory and zeph-core context management. The server-side API could:
- Replace the custom compaction pipeline for Claude sessions (
apply_deferred_summaries, compact_context)
- Eliminate token-counting heuristics and summarization LLM calls
- Reduce code complexity in
crates/zeph-core/src/context_manager.rs
This is a complement/alternative to #1338 (tiered context compaction).
Design Decision Needed
Evaluate whether to:
- Use server-side compaction as the primary path for Claude sessions, keeping client-side as fallback for non-Claude providers
- Or keep existing pipeline and only adopt the server-side API for Sonnet 4.6/Opus 4.6 sessions
Action
- Review the Compaction API spec at https://platform.claude.com/docs/en/build-with-claude/context-management
- Prototype adding
context_management: {auto_truncate: true} (or equivalent) to the Claude request builder
- Parse the
compaction content block type in response deserialization
- Add feature-flag or config option to opt-in per session
Source
Research session 2026-03-13. Anthropic changelog Feb 5, 2026.
Research Finding
Anthropic introduced a server-side Compaction API (launched Feb 5, 2026, beta header
compact-2026-01-12) for Opus 4.6 and Sonnet 4.6. When enabled, the API automatically summarizes conversation history when input tokens approach a threshold, inserting acompactioncontent block in the response. The client does not need to maintain summarization logic — just append the response as usual and the API handles history pruning on the next turn.Applicability
Zeph currently implements client-side chunked compaction in
zeph-memoryandzeph-corecontext management. The server-side API could:apply_deferred_summaries,compact_context)crates/zeph-core/src/context_manager.rsThis is a complement/alternative to #1338 (tiered context compaction).
Design Decision Needed
Evaluate whether to:
Action
context_management: {auto_truncate: true}(or equivalent) to the Claude request buildercompactioncontent block type in response deserializationSource
Research session 2026-03-13. Anthropic changelog Feb 5, 2026.