research: evaluate agent-controlled compaction for engine hybrid loop #687

@Aureliolo

Description

Context

LangChain's Autonomous Context Compression proposes exposing compaction as an agent tool rather than triggering it at a fixed token threshold. The agent decides when to compact at semantically meaningful moments (task boundaries, before large new inputs, after extracting results).

Why This Matters

SynthOrg's compaction is currently threshold-based. Our design spec already mandates "task boundary only, never mid-execution" for auto-downgrade -- the same principle should apply to compaction. Agent-controlled compaction is architecturally cleaner and avoids mid-reasoning interruptions.

Action Items

  • Review current compaction trigger mechanism in engine/
  • Evaluate exposing a compress_context tool in the hybrid loop
  • Design how agent-triggered compaction interacts with context budget system
  • Consider fallback: system-triggered compaction if agent never invokes tool (safety net)
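The action items above can be sketched as a minimal shape for the hybrid loop: a `compress_context` tool the agent invokes at task boundaries, plus a threshold-based safety net that only fires if the agent never calls it. All names here (`ContextManager`, the tool signature, the summary placeholder) are illustrative assumptions, not the actual SynthOrg engine API; only the `CompactionConfig` field names come from the issue text.

```python
from dataclasses import dataclass


@dataclass
class CompactionConfig:
    # Field names taken from the issue; defaults match current engine values.
    fill_threshold_percent: float = 80.0  # safety-net trigger only
    preserve_recent_turns: int = 3


class ContextManager:
    """Hypothetical sketch of agent-controlled compaction with a system fallback."""

    def __init__(self, max_tokens: int, config: CompactionConfig):
        self.max_tokens = max_tokens
        self.config = config
        self.turns: list[str] = []

    def used_tokens(self) -> int:
        # Crude word-count stand-in for a real tokenizer.
        return sum(len(t.split()) for t in self.turns)

    def compress_context(self, reason: str = "agent") -> str:
        """Tool exposed to the agent; intended for task boundaries only."""
        keep = max(1, self.config.preserve_recent_turns)
        old, recent = self.turns[:-keep], self.turns[-keep:]
        summary = f"[summary of {len(old)} turns, reason={reason}]"
        self.turns = [summary] + recent
        return summary

    def maybe_autocompact(self) -> None:
        """Safety net: system-triggered only if the agent never invoked the tool."""
        fill = 100.0 * self.used_tokens() / self.max_tokens
        if fill >= self.config.fill_threshold_percent:
            self.compress_context(reason="system-fallback")
```

The design point is that the threshold path becomes a fallback rather than the primary trigger, so compaction normally happens at moments the agent judges semantically safe.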

Additional Research (2026-03-26)

Epistemic Marker Preservation

Source: Self-Distillation & Epistemic Verbalization (arXiv:2603.24472)

Compaction MUST preserve epistemic markers ("wait", "hmm", "actually", "perhaps", "alternatively", "check") -- these are functionally important for out-of-distribution reasoning. Removing them degraded AIME24 by up to 63%. Current engine/compaction/summarizer.py uses simple text concatenation with no marker detection -- this is a concrete gap.
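A minimal sketch of marker-aware compaction, closing the gap named above: sentences carrying epistemic markers are always retained, and everything else is subject to dropping. The marker list comes from the issue text; the function names and the sentence-level granularity are assumptions for illustration, not the real `summarizer.py` interface.

```python
import re

# Markers flagged as functionally important for OOD reasoning (list from the issue).
EPISTEMIC_MARKERS = {"wait", "hmm", "actually", "perhaps", "alternatively", "check"}


def split_sentences(text: str) -> list[str]:
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def compact_preserving_markers(text: str, keep_ratio: float = 0.3) -> str:
    """Drop sentences during compaction, but never those carrying epistemic markers."""
    sentences = split_sentences(text)
    keep_n = max(1, int(len(sentences) * keep_ratio))  # always keep a recent tail
    kept = []
    for i, sentence in enumerate(sentences):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        has_marker = bool(words & EPISTEMIC_MARKERS)
        is_recent = i >= len(sentences) - keep_n
        if has_marker or is_recent:
            kept.append(sentence)
    return " ".join(kept)
```

This is deliberately simple; a production version would need token-level detection and the narrow-vs-diverse task routing described in the rule below.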

Rule: Narrow/repetitive tasks can use concise reasoning; diverse/novel tasks need full uncertainty-aware style preserved.

Surprisal-Based Semantic Token Cost

Source: Reasoning as Compression / CIB (arXiv:2603.08462, ICML 2025)

Use surprisal under a frozen base model (instead of flat length penalties) to assign per-token compression cost. High-surprisal tokens carry essential reasoning; low-surprisal tokens are predictable filler. Results: 41% token reduction with <1.5% accuracy drop. The beta parameter provides a smooth control knob on the accuracy-efficiency Pareto frontier -- maps directly to quota degradation under budget pressure.
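The scheme above can be sketched as follows, under stated assumptions: `logprob_fn` stands in for a frozen base model returning log p(token | prefix), and `beta` is the paper's Pareto knob scaling per-token cost. This is a toy illustration of the idea, not the CIB implementation.

```python
import math
from typing import Callable, Sequence


def surprisal_costs(tokens: Sequence[str],
                    logprob_fn: Callable[[Sequence[str], str], float],
                    beta: float = 1.0) -> list[float]:
    """Per-token compression cost proportional to surprisal under a frozen model."""
    costs = []
    for i, tok in enumerate(tokens):
        surprisal = -logprob_fn(tokens[:i], tok)  # high surprisal = informative
        costs.append(beta * surprisal)
    return costs


def compress(tokens: Sequence[str], costs: Sequence[float],
             drop_below: float) -> list[str]:
    """Drop low-surprisal (predictable filler) tokens; keep the rest."""
    return [t for t, c in zip(tokens, costs) if c >= drop_below]
```

Under budget pressure, the quota system could raise `drop_below` (or lower `beta`) to slide along the accuracy-efficiency frontier instead of applying a flat length penalty.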

Concrete Compression Thresholds

Source: Deep Agents Context Engineering

Reference thresholds from LangChain Deep Agents:

  • 20,000 tokens: triggers offloading (large tool results replaced with file path + 10-line preview)
  • 85% of model max_input_tokens: triggers summarization (in-context LLM summary of session intent + artifacts + next steps)
  • 10% of recent tokens: always retained verbatim
  • Catches ContextOverflowError and retries with summary

Compare against current CompactionConfig.fill_threshold_percent=80.0 and preserve_recent_turns=3.
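For the comparison, a hypothetical side-by-side of the Deep Agents reference thresholds as a decision function; the dataclass field names are illustrative, while the numbers come from the list above and the current-config values from this issue.

```python
from dataclasses import dataclass, field


@dataclass
class CompactionThresholds:
    # Deep Agents reference values (field names are assumptions).
    offload_tool_result_tokens: int = 20_000  # replace result with file path + preview
    summarize_fill_percent: float = 85.0      # vs current fill_threshold_percent=80.0
    retain_recent_fraction: float = 0.10      # vs current preserve_recent_turns=3


def next_action(used_tokens: int, max_input_tokens: int,
                largest_tool_result: int,
                t: CompactionThresholds = CompactionThresholds()) -> str:
    """Decide which Deep Agents-style intervention fires first, if any."""
    if largest_tool_result >= t.offload_tool_result_tokens:
        return "offload"      # write result to file, keep path + 10-line preview
    if 100.0 * used_tokens / max_input_tokens >= t.summarize_fill_percent:
        return "summarize"    # in-context summary of intent + artifacts + next steps
    return "none"
```

Note the structural difference: the current config retains a fixed turn count, while Deep Agents retains a fraction of recent tokens, which scales with context size.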

Metadata

Assignees: no one assigned

Labels

  • prio:low - Nice to have, can defer
  • scope:medium - 1-3 days of work
  • spec:task-workflow - DESIGN_SPEC Section 6 - Task & Workflow Engine
  • type:research - Evaluate options, make tech decisions
  • v0.7 - Minor version v0.7
  • v0.7.5 - Patch release v0.7.5

Milestone: none