Source
Speculative Tool Calls: Overlapping Tool Execution with Generation
https://arxiv.org/abs/2512.15834 — December 2025
Summary
The paper proposes a "tool cache" that indexes results by normalized (tool_name, canonicalized_args) key to avoid redundant executions across turns. A companion speculative-dispatch mechanism predicts tool calls before the LLM finishes decoding, but the cache is independently useful.
Applicability to Zeph
HIGH. Within a single session it is common for the LLM to call the same deterministic tool multiple times (e.g., re-reading the same file, repeated web scrape of the same URL). Each call wastes latency and tokens in the context.
Proposed implementation
- Scope: in-memory, per-session only (no persistence across sessions)
- Key: {tool_name}:{canonicalized_args_json} (sorted keys, normalized values)
- TTL: configurable (default 5 min), reset on /clear
- Opt-out: non-deterministic tools (shell commands with side effects, memory_save) must be excluded via a cacheable = false flag in the ToolExecutor trait
- Location: CompositeExecutor in zeph-tools wraps inner executors with a CachingExecutor layer
- Config: [tools] result_cache = { enabled = true, ttl_secs = 300 } (new section)
- Metrics: cache hit count tracked in MetricsSnapshot
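The key scheme and executor layering above can be sketched in Rust. This is a minimal sketch under stated assumptions: the ToolExecutor trait signature, the string-based argument representation, and the CountingExecutor are illustrative inventions, not the actual zeph-tools API (which would likely be async with proper error types).

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::{Duration, Instant};

// Canonical cache key: sorted argument keys, trimmed values. BTreeMap
// iterates in sorted key order, so the same logical call yields the same
// key regardless of the order the LLM emitted the arguments in.
fn cache_key(tool_name: &str, args: &BTreeMap<String, String>) -> String {
    let canon: Vec<String> = args
        .iter()
        .map(|(k, v)| format!("\"{k}\":\"{}\"", v.trim()))
        .collect();
    format!("{tool_name}:{{{}}}", canon.join(","))
}

// Hypothetical minimal trait; the real ToolExecutor in zeph-tools will differ.
trait ToolExecutor {
    fn execute(&mut self, tool: &str, args: &BTreeMap<String, String>) -> String;
    fn cacheable(&self, tool: &str) -> bool;
}

struct CachingExecutor<E: ToolExecutor> {
    inner: E,
    ttl: Duration,
    cache: HashMap<String, (Instant, String)>,
    hits: u64, // would be surfaced via MetricsSnapshot
}

impl<E: ToolExecutor> CachingExecutor<E> {
    fn new(inner: E, ttl: Duration) -> Self {
        Self { inner, ttl, cache: HashMap::new(), hits: 0 }
    }

    fn execute(&mut self, tool: &str, args: &BTreeMap<String, String>) -> String {
        if !self.inner.cacheable(tool) {
            return self.inner.execute(tool, args); // opt-out path
        }
        let key = cache_key(tool, args);
        if let Some((stored_at, result)) = self.cache.get(&key) {
            if stored_at.elapsed() < self.ttl {
                self.hits += 1;
                return result.clone();
            }
        }
        let result = self.inner.execute(tool, args);
        self.cache.insert(key, (Instant::now(), result.clone()));
        result
    }

    /// Called on /clear: drops all cached entries.
    fn clear(&mut self) {
        self.cache.clear();
    }
}

// Toy inner executor that counts real executions, for demonstration only.
struct CountingExecutor {
    calls: u32,
}

impl ToolExecutor for CountingExecutor {
    fn execute(&mut self, tool: &str, args: &BTreeMap<String, String>) -> String {
        self.calls += 1;
        format!("result of {tool} with {} args", args.len())
    }
    fn cacheable(&self, tool: &str) -> bool {
        tool != "shell" // side-effecting tools excluded
    }
}

fn main() {
    let ttl = Duration::from_secs(300); // mirrors ttl_secs = 300
    let mut exec = CachingExecutor::new(CountingExecutor { calls: 0 }, ttl);
    let mut args = BTreeMap::new();
    args.insert("path".to_string(), "/tmp/notes.txt".to_string());

    let first = exec.execute("file_read", &args);
    let second = exec.execute("file_read", &args); // served from cache
    assert_eq!(first, second);
    assert_eq!(exec.inner.calls, 1); // only one real execution
    assert_eq!(exec.hits, 1);

    exec.clear(); // what a /clear handler would invoke
    assert!(exec.cache.is_empty());
    println!("hits={} real_calls={}", exec.hits, exec.inner.calls);
}
```

The wrapper checks cacheable before touching the cache at all, so opted-out tools never pay even the key-construction cost; TTL is checked lazily on lookup rather than with a background sweeper, which is adequate for a small per-session map.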
Expected benefit
- Eliminates redundant file reads and identical web scrapes
- Removes re-execution latency when a cached result is re-injected (the tool_result content, and therefore its token count, is unchanged)
- No LLM inference engine changes required — pure application-layer optimization
Non-goals
- Speculative dispatch (requires inference engine access, not feasible at app layer)
- Cross-session caching (stale results risk — too dangerous for a general cache)