research(core): TACO self-evolving observation compression — auto-discover and refine terminal output compression rules (arXiv:2604.19572) #3306

@bug-ops

Description

arXiv:2604.19572 "A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression" (April 21, 2026) introduces TACO: a plug-and-play framework that automatically discovers and refines rules for compressing terminal/tool output. It targets the quadratic token growth that occurs in long-horizon agentic tasks, where every new turn re-sends all earlier tool output.

Key mechanism:

  • Global Rule Pool: reusable compression rules accumulated across tasks
  • Task-specific adaptation: online refinement during the current session
  • Rule format: each rule is a function that decides whether/how to compress a specific tool output pattern
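The mechanism above can be sketched in a few lines of Rust. This is only an illustration of the idea, not the paper's implementation: the names `Rule`, `RulePool`, and `dedupe_repeated_lines` are hypothetical, and the lookup order (task-specific rules before global ones) is an assumption.

```rust
/// Hypothetical rule shape: given the command and its raw output, return
/// `Some(compressed)` when the rule applies, or `None` to pass.
type Rule = fn(&str, &str) -> Option<String>;

/// Hypothetical pool holding both rule tiers described above.
struct RulePool {
    /// Reusable rules accumulated across tasks.
    global: Vec<Rule>,
    /// Rules refined online during the current session.
    task_local: Vec<Rule>,
}

impl RulePool {
    /// Try task-specific rules first, then global ones (assumed order).
    /// If no rule applies, the output is returned unchanged.
    fn compress(&self, command: &str, output: &str) -> String {
        self.task_local
            .iter()
            .chain(self.global.iter())
            .find_map(|rule| rule(command, output))
            .unwrap_or_else(|| output.to_string())
    }
}

/// Example rule: collapse runs of identical consecutive lines.
/// Returns `None` when there is nothing to collapse.
fn dedupe_repeated_lines(_command: &str, output: &str) -> Option<String> {
    let mut kept: Vec<&str> = Vec::new();
    for line in output.lines() {
        if kept.last() != Some(&line) {
            kept.push(line);
        }
    }
    if kept.len() == output.lines().count() {
        None
    } else {
        Some(kept.join("\n"))
    }
}
```

Returning `Option<String>` lets a rule decide both whether and how to compress, which matches the rule format described in the list.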

Results: 1-4% accuracy gain on TerminalBench, ~10% token reduction.

How It Applies to Zeph

Zeph's ShellExecutor and tool audit log accumulate raw outputs that are injected verbatim into context. Verbose shell commands (cargo build, cargo test) produce hundreds of lines, most of which are noise. TACO's pattern is directly applicable:

  1. Compression rule registry: store rules like "cargo build output: keep only error and warning lines" or "cargo test output: keep only failed test names + count"
  2. Self-evolution: after each session, analyze which output portions were actually referenced in LLM decisions → refine rules
  3. Integration point: ToolExecutor trait → post-process output before context injection
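As a rough sketch of what the two example rules in item 1 might look like, the functions below filter cargo output by line prefix. The function names are hypothetical, and the matching is deliberately simplistic: real compiler diagnostics span several lines (source location, code snippet, notes), so a production rule would likely need to keep that context as well.

```rust
/// Hypothetical rule for `cargo build`: keep only lines that begin a
/// warning or an error diagnostic (e.g. "warning: ..." or "error[E0308]: ...").
fn compress_cargo_build(output: &str) -> String {
    output
        .lines()
        .filter(|line| {
            let trimmed = line.trim_start();
            trimmed.starts_with("error") || trimmed.starts_with("warning")
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Hypothetical rule for `cargo test`: keep the per-test lines that ended in
/// FAILED, plus the "test result:" summary line that carries the counts.
fn compress_cargo_test(output: &str) -> String {
    output
        .lines()
        .filter(|line| {
            (line.starts_with("test ") && line.ends_with("FAILED"))
                || line.starts_with("test result:")
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Both functions are pure string transforms, so they could run as a post-processing step on the executor's output before it is injected into context.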

Implementation Sketch

  • Add trait to with a default identity implementation
  • Implement rule-based compressor backed by a SQLite rules table
  • Add self-evolution step in session cleanup: diff what was referenced vs full output, generate candidate rules via LLM
  • Expose config section with , , fields
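A minimal sketch of the first and third bullets, under stated assumptions: the trait and type names below (`OutputCompressor`, `Passthrough`, `unreferenced_lines`) are placeholders and not the identifiers intended by this issue, and verbatim substring matching is only a crude proxy for "referenced in LLM decisions".

```rust
/// Hypothetical compression hook. The default method is the identity,
/// so existing executors keep their current behavior unless they opt in.
trait OutputCompressor {
    fn compress(&self, _tool: &str, output: &str) -> String {
        output.to_string()
    }
}

/// An implementor that relies entirely on the default identity behavior.
struct Passthrough;

impl OutputCompressor for Passthrough {}

/// Self-evolution helper (assumed approach): return the non-empty lines of
/// `output` that never appear verbatim in any later model message. Lines
/// that are consistently unreferenced are candidates for a new drop rule.
fn unreferenced_lines<'a>(output: &'a str, later_messages: &[&str]) -> Vec<&'a str> {
    output
        .lines()
        .filter(|line| !line.trim().is_empty())
        .filter(|line| !later_messages.iter().any(|msg| msg.contains(line.trim())))
        .collect()
}
```

The list of unreferenced lines, together with the command that produced them, could then be passed to an LLM prompt that proposes candidate rules during session cleanup.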

Complexity

Medium-low. No ML training required — rule discovery is LLM-prompted from session traces.

Source

Labels

P3 (Research, medium-high complexity); research (Research-driven improvement)
