research(tools): tool result cache — avoid redundant executions within a session #1822

@bug-ops

Description

Source

Speculative Tool Calls: Overlapping Tool Execution with Generation
https://arxiv.org/abs/2512.15834 — December 2025

Summary

The paper proposes a "tool cache" that indexes results by a normalized (tool_name, canonicalized_args) key, so that identical calls in later turns reuse the stored result instead of executing again. A companion speculative-dispatch mechanism predicts tool calls before the LLM finishes decoding, but the cache is useful on its own.
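A minimal sketch of how such a key could be built, using only the standard library. The `Arg` enum is a hypothetical stand-in for a JSON value (a real implementation would more likely use an existing JSON type), and `canonicalize` and `cache_key` are illustrative names, not taken from the paper or from Zeph. Value normalization beyond key ordering (for example, path normalization) is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON value; not a Zeph type.
#[derive(Debug, Clone)]
pub enum Arg {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    List(Vec<Arg>),
    // BTreeMap iterates in key order, so object keys come out sorted.
    Object(BTreeMap<String, Arg>),
}

/// Serialize a value into a canonical string: sorted object keys, no whitespace.
/// Strings use Rust's Debug escaping, which is close to but not identical to
/// JSON escaping; that is sufficient for a cache key.
pub fn canonicalize(arg: &Arg) -> String {
    match arg {
        Arg::Null => "null".to_string(),
        Arg::Bool(b) => b.to_string(),
        Arg::Num(n) => n.to_string(),
        Arg::Str(s) => format!("{:?}", s),
        Arg::List(items) => {
            let parts: Vec<String> = items.iter().map(canonicalize).collect();
            format!("[{}]", parts.join(","))
        }
        Arg::Object(map) => {
            let parts: Vec<String> = map
                .iter()
                .map(|(k, v)| format!("{:?}:{}", k, canonicalize(v)))
                .collect();
            format!("{{{}}}", parts.join(","))
        }
    }
}

/// Build the cache key in the form `tool_name:canonical_args`.
pub fn cache_key(tool_name: &str, args: &Arg) -> String {
    format!("{}:{}", tool_name, canonicalize(args))
}
```

Two calls whose arguments differ only in key order or whitespace map to the same key, which is what lets the cache recognize them as the same call.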

Applicability to Zeph

HIGH. Within a single session it is common for the LLM to call the same deterministic tool multiple times (e.g., re-reading the same file or scraping the same URL again). Each redundant call adds latency and spends context tokens on a result the session already has.

Proposed implementation

  • Scope: in-memory, per-session only (no persistence across sessions)
  • Key: {tool_name}:{canonicalized_args_json} (sorted keys, normalized values)
  • TTL: configurable (default 5 min); the cache is cleared on /clear
  • Opt-out: non-deterministic or side-effecting tools (shell commands with side effects, memory_save) must be excluded via a cacheable = false flag in the ToolExecutor trait
  • Location: CompositeExecutor in zeph-tools wraps inner executors with a CachingExecutor layer
  • Config: [tools] result_cache = { enabled = true, ttl_secs = 300 } (new section)
  • Metrics: cache hit count tracked in MetricsSnapshot
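The list above could be sketched roughly as follows. This is a simplified illustration under several assumptions, not Zeph's actual API: the `ToolExecutor` trait here is a synchronous stand-in with a `cacheable` method in place of the proposed flag, arguments are assumed to arrive already canonicalized, results are plain strings, and `CountingExecutor` is a hypothetical inner executor used only for demonstration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Simplified, hypothetical stand-in for the ToolExecutor trait.
pub trait ToolExecutor {
    fn execute(&mut self, tool_name: &str, canonical_args: &str) -> String;
    /// Tools with side effects or non-deterministic output return false.
    fn cacheable(&self, _tool_name: &str) -> bool {
        true
    }
}

/// Wraps an inner executor with an in-memory, per-session result cache.
pub struct CachingExecutor<E: ToolExecutor> {
    inner: E,
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
    hits: u64,
}

impl<E: ToolExecutor> CachingExecutor<E> {
    pub fn new(inner: E, ttl: Duration) -> Self {
        Self { inner, ttl, entries: HashMap::new(), hits: 0 }
    }
    /// Drop all cached results, e.g. when the user runs /clear.
    pub fn clear(&mut self) {
        self.entries.clear();
    }
    /// Cache hit count, suitable for export into a metrics snapshot.
    pub fn hits(&self) -> u64 {
        self.hits
    }
    pub fn inner(&self) -> &E {
        &self.inner
    }
}

impl<E: ToolExecutor> ToolExecutor for CachingExecutor<E> {
    fn execute(&mut self, tool_name: &str, canonical_args: &str) -> String {
        // Opted-out tools always run and are never stored.
        if !self.inner.cacheable(tool_name) {
            return self.inner.execute(tool_name, canonical_args);
        }
        let key = format!("{}:{}", tool_name, canonical_args);
        if let Some((stored_at, result)) = self.entries.get(&key) {
            // Serve the stored result only while it is younger than the TTL.
            if stored_at.elapsed() < self.ttl {
                self.hits += 1;
                return result.clone();
            }
        }
        // Miss or expired entry: execute and (re)store the result.
        let result = self.inner.execute(tool_name, canonical_args);
        self.entries.insert(key, (Instant::now(), result.clone()));
        result
    }
    fn cacheable(&self, tool_name: &str) -> bool {
        self.inner.cacheable(tool_name)
    }
}

/// Hypothetical inner executor that counts real executions.
pub struct CountingExecutor {
    pub calls: u32,
}

impl ToolExecutor for CountingExecutor {
    fn execute(&mut self, tool_name: &str, _canonical_args: &str) -> String {
        self.calls += 1;
        format!("{} result #{}", tool_name, self.calls)
    }
    fn cacheable(&self, tool_name: &str) -> bool {
        tool_name != "memory_save"
    }
}
```

A production version would also need to decide whether failed executions are cached (probably not) and how expired entries are evicted; this sketch only overwrites them on the next miss.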

Expected benefit

  • Eliminates redundant file reads and identical web scrapes
  • Removes re-execution latency when a cached result is re-injected (the tool_result content, and therefore its token count, stays the same)
  • No LLM inference engine changes required — pure application-layer optimization

Non-goals

  • Speculative dispatch (requires inference engine access, not feasible at app layer)
  • Cross-session caching (stale results risk — too dangerous for a general cache)

Metadata

Labels

P1: High ROI, low complexity, do next sprint
research: Research-driven improvement
