
Add context compaction event system and tiered compaction strategy#22

Merged
v3g42 merged 1 commit into main from claude/context-management-design-GYOC0
Mar 12, 2026

v3g42 (Contributor) commented Mar 12, 2026

Summary

This PR introduces a context management system with tiered compaction strategies to handle token budget constraints in long-running agent conversations. It adds a compaction event, configurable usage thresholds, and the foundational infrastructure for both mechanical and semantic (LLM-powered) context compression.

Key Changes

Core Infrastructure

  • New CompactionTier enum (distri-types/events.rs): Defines three compaction levels:

    • Trim: Mechanical truncation of old entries and payload compaction (triggered at >60% usage by default)
    • Summarize: LLM-powered semantic compression (recommended at >80% usage)
    • Reset: Emergency mode preserving only essentials (>95% usage)
  • New ContextCompaction event (distri-types/events.rs): Emitted when compaction occurs, includes:

    • Tier applied, token counts before/after, entries affected
    • Usage ratio and context limit for client awareness
    • Optional summary text for Tier 2 summarization
  • New CompactionSummary scratchpad entry type (distri-types/execution.rs): Stores LLM-generated summaries of compacted history with metadata (entries summarized, timestamp range, tokens saved)
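The types described above could be sketched roughly as follows. This is a minimal illustration, not the code in distri-types: the type names come from this PR, but every field name and field type is an assumption inferred from the descriptions, and the `tokens_saved()` helper is hypothetical.

```rust
/// Compaction level applied to the conversation context.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactionTier {
    /// Mechanical truncation of old entries and payload compaction.
    Trim,
    /// LLM-powered semantic compression.
    Summarize,
    /// Emergency mode preserving only essentials.
    Reset,
}

/// Payload of a ContextCompaction event (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct ContextCompaction {
    pub tier: CompactionTier,
    pub tokens_before: usize,
    pub tokens_after: usize,
    pub entries_affected: usize,
    /// Context limit and usage ratio, exposed for client awareness.
    pub context_limit: usize,
    pub usage_ratio: f64,
    /// Only expected to be set for Tier 2 (Summarize).
    pub summary: Option<String>,
}

impl ContextCompaction {
    /// Hypothetical helper: tokens reclaimed by this compaction.
    /// Uses saturating subtraction so it never underflows.
    pub fn tokens_saved(&self) -> usize {
        self.tokens_before.saturating_sub(self.tokens_after)
    }
}

/// Scratchpad entry holding an LLM-generated summary (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionSummary {
    pub summary: String,
    pub entries_summarized: usize,
    /// Timestamp range of the summarized entries (unit is an assumption).
    pub from_timestamp: i64,
    pub to_timestamp: i64,
    pub tokens_saved: usize,
}
```

Keeping `summary` optional on the event lets Trim and Reset events share the same payload shape as Summarize events.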

Context Size Manager Enhancements

  • New configuration thresholds (context_size_manager.rs):

    • trim_threshold (default 0.6): Trigger mechanical compaction
    • summarize_threshold (default 0.8): Trigger semantic compaction
    • reset_threshold (default 0.95): Trigger emergency reset
    • post_compaction_target (default 0.4): Target usage after compaction
  • CompactionResult struct: Encapsulates compaction evaluation results with tier, token counts, affected entries, and usage ratio

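  • evaluate_and_compact() method: Evaluates current context usage and selects the appropriate tier. Tier 1 trimming is applied immediately; for Tier 2 the method only signals that summarization is needed and leaves the entries unmodified (the caller is responsible for the LLM summarization)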

  • emergency_reset() method: Preserves only the user task and the last 2 entries; intended for extreme cases
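To illustrate how the thresholds map usage to a tier, here is a minimal sketch. It is not the code in context_size_manager.rs: the struct name `CompactionThresholds`, the `select_tier` function, the strict `>` comparisons, and the assumption that the user task is the first entry are all hypothetical. Only the threshold names, their defaults, and the "task plus last 2 entries" rule come from this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactionTier {
    Trim,
    Summarize,
    Reset,
}

/// Hypothetical container for the thresholds listed above.
#[derive(Debug, Clone)]
pub struct CompactionThresholds {
    pub trim_threshold: f64,
    pub summarize_threshold: f64,
    pub reset_threshold: f64,
    pub post_compaction_target: f64,
}

impl Default for CompactionThresholds {
    fn default() -> Self {
        Self {
            trim_threshold: 0.6,
            summarize_threshold: 0.8,
            reset_threshold: 0.95,
            post_compaction_target: 0.4,
        }
    }
}

impl CompactionThresholds {
    /// Pick the highest tier whose threshold the usage ratio exceeds.
    /// Strict `>` is an assumption based on the ">80%" / ">95%" wording.
    pub fn select_tier(&self, used_tokens: usize, context_limit: usize) -> Option<CompactionTier> {
        if context_limit == 0 {
            return None;
        }
        let ratio = used_tokens as f64 / context_limit as f64;
        if ratio > self.reset_threshold {
            Some(CompactionTier::Reset)
        } else if ratio > self.summarize_threshold {
            Some(CompactionTier::Summarize)
        } else if ratio > self.trim_threshold {
            Some(CompactionTier::Trim)
        } else {
            None
        }
    }
}

/// Emergency reset sketch: keep the user task plus the last 2 entries.
/// Assumption: `entries[0]` is the user task. Returns the number of dropped entries.
pub fn emergency_reset(entries: &mut Vec<String>) -> usize {
    const KEEP_LAST: usize = 2;
    if entries.len() <= KEEP_LAST + 1 {
        return 0; // nothing to drop
    }
    let dropped = entries.len() - 1 - KEEP_LAST;
    // Remove everything between the task and the last KEEP_LAST entries.
    entries.drain(1..1 + dropped);
    dropped
}
```

Checking the highest threshold first means a context at 97% usage goes straight to Reset instead of first attempting a Trim.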

Executor Context Integration

  • evaluate_compaction() method (context.rs): Evaluates context usage and emits a ContextCompaction event; should be called before each LLM call in the agent loop

  • store_compaction_summary() method (context.rs): Stores LLM-generated summaries in scratchpad after Tier 2 compaction
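One way the agent loop might use these two methods is sketched below. Everything except the method names `evaluate_compaction` and `store_compaction_summary` is hypothetical: the simplified `ExecutorContext`, the `Summarizer` trait standing in for the LLM call, the `pending_tier` field used to simulate evaluation, and the choice to keep the last 2 entries verbatim are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactionTier {
    Trim,
    Summarize,
    Reset,
}

/// Hypothetical stand-in for the LLM call that produces a summary.
pub trait Summarizer {
    fn summarize(&self, entries: &[String]) -> String;
}

/// Stub used for illustration; a real implementation would call an LLM.
pub struct StubSummarizer;

impl Summarizer for StubSummarizer {
    fn summarize(&self, entries: &[String]) -> String {
        format!("summary of {} entries", entries.len())
    }
}

/// Hypothetical, heavily simplified stand-in for the executor context.
pub struct ExecutorContext {
    pub entries: Vec<String>,
    pub summaries: Vec<String>,
    /// Simulates the outcome of context evaluation for this sketch.
    pub pending_tier: Option<CompactionTier>,
}

impl ExecutorContext {
    /// In the PR this also emits a ContextCompaction event; here it only
    /// reports which tier (if any) is needed.
    pub fn evaluate_compaction(&mut self) -> Option<CompactionTier> {
        self.pending_tier.take()
    }

    /// Store an LLM-generated summary after Tier 2 compaction.
    pub fn store_compaction_summary(&mut self, summary: String) {
        self.summaries.push(summary);
    }
}

/// Call before each LLM request: evaluate, and run summarization if signaled.
pub fn before_llm_call(ctx: &mut ExecutorContext, llm: &dyn Summarizer) -> Option<CompactionTier> {
    const KEEP_LAST: usize = 2; // assumption: recent entries stay verbatim
    let tier = ctx.evaluate_compaction();
    if tier == Some(CompactionTier::Summarize) {
        let cut = ctx.entries.len().saturating_sub(KEEP_LAST);
        let summary = llm.summarize(&ctx.entries[..cut]);
        ctx.entries.drain(..cut); // replace old entries with the summary
        ctx.store_compaction_summary(summary);
    }
    tier
}
```

Passing the summarizer as a trait object keeps the LLM call in the caller, which matches the note that Tier 2 is signaled but not auto-executed.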

CLI Support

  • Event rendering (distri-server-cli/printer.rs): Displays compaction events with tier label, token savings, entry count, and usage percentage; includes summary preview for Tier 2
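As an illustration of the kind of line the printer could produce, here is a small sketch. The function name `render_compaction`, its parameters, the output format, and the 80-character preview length are assumptions; the PR only states which pieces of information are displayed.

```rust
/// Render a one-line description of a compaction event, plus an optional
/// summary preview. The exact output format is an assumption.
pub fn render_compaction(
    tier_label: &str,
    tokens_before: usize,
    tokens_after: usize,
    entries_affected: usize,
    usage_ratio: f64,
    summary: Option<&str>,
) -> String {
    const PREVIEW_CHARS: usize = 80;
    let saved = tokens_before.saturating_sub(tokens_after);
    let mut out = format!(
        "[compaction:{}] saved {} tokens ({} -> {}), {} entries, {:.0}% used",
        tier_label,
        saved,
        tokens_before,
        tokens_after,
        entries_affected,
        usage_ratio * 100.0
    );
    if let Some(text) = summary {
        // Truncate on character boundaries so multi-byte text stays valid.
        let preview: String = text.chars().take(PREVIEW_CHARS).collect();
        let ellipsis = if text.chars().count() > PREVIEW_CHARS { "..." } else { "" };
        out.push_str(&format!("\n  summary: {}{}", preview, ellipsis));
    }
    out
}
```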

Scratchpad Formatting

  • Summary entry rendering (formatter.rs): Compaction summaries formatted as assistant messages in context
  • Summary entry filtering (scratchpad.rs): Summaries included in scratchpad context alongside tasks and plan steps
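The formatting and filtering behaviour might look roughly like this. The `ScratchpadEntry`, `Role`, and `Message` types below are simplified stand-ins, not the types in formatter.rs or scratchpad.rs, and the summary prefix text, the roles chosen for tasks and plan steps, and the dropping of other entries are assumptions.

```rust
/// Simplified stand-in for scratchpad entries (variants are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub enum ScratchpadEntry {
    Task(String),
    PlanStep(String),
    Summary { text: String, entries_summarized: usize },
    Other(String),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// Format one entry for the model context. Summaries become assistant
/// messages; entries not relevant to scratchpad context yield `None`.
pub fn format_entry(entry: &ScratchpadEntry) -> Option<Message> {
    match entry {
        ScratchpadEntry::Task(text) => Some(Message {
            role: Role::User,
            content: text.clone(),
        }),
        ScratchpadEntry::PlanStep(text) => Some(Message {
            role: Role::Assistant,
            content: text.clone(),
        }),
        ScratchpadEntry::Summary { text, entries_summarized } => Some(Message {
            role: Role::Assistant,
            content: format!("[Summary of {} earlier entries]\n{}", entries_summarized, text),
        }),
        ScratchpadEntry::Other(_) => None,
    }
}

/// Build the scratchpad context: tasks, plan steps, and summaries only.
pub fn scratchpad_context(entries: &[ScratchpadEntry]) -> Vec<Message> {
    entries.iter().filter_map(format_entry).collect()
}
```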

Client-Side Types (distrijs)

  • TypeScript types (distrijs-context-management.patch):
    • CompactionTier, ContextCompactionEvent, ContextHealth types
    • useContextHealth() hook for tracking context usage and compaction events
    • ContextIndicator React component for visual context health display with color-coded progress bar

Implementation Notes

  • Tier 2 (Summarize) is signaled but not auto-executed: The evaluate_and_compact() method returns a result indicating summarization is needed, but the actual LLM call is left to the caller (agent loop) to implement
  • Mechanical compaction (Tier 1) is applied immediately: Truncation and entry dropping happen in evaluate_and_compact()
  • Event emission happens in executor context: Allows clients to track compaction across the event stream
  • Backward compatible: New scratchpad entry type and event are additive; existing code paths unaffected
  • Design mirrors production systems: Follows patterns from Claude Code (auto-compress) and OpenAI Codex (context condensing)

Testing Considerations

  • Verify compaction thresholds trigger at correct usage ratios
  • Confirm

https://claude.ai/code/session_01BG3BRtrAJf7C7uE11X8TXK

Introduces a structured context management system inspired by Claude Code's
auto-compress and OpenAI Codex's context condensing patterns:

- Add ContextCompaction event type with CompactionTier (Trim/Summarize/Reset)
- Add CompactionSummary scratchpad entry for LLM-generated summaries
- Implement evaluate_and_compact() with configurable usage thresholds
- Add emergency_reset() for >95% context usage scenarios
- Wire compaction events into CLI printer with visual output
- Handle Summary entries in scratchpad formatter and native history
- Include distrijs patch for @distri/core types and @distri/react hooks
- Add design document covering architecture and integration points

https://claude.ai/code/session_01BG3BRtrAJf7C7uE11X8TXK
v3g42 merged commit 04185e3 into main Mar 12, 2026
v3g42 deleted the claude/context-management-design-GYOC0 branch March 12, 2026 09:07