Skip to content

fix(cache): make tool catalog byte-stable across calls and sessions (#263)#284

Merged
Hmbown merged 1 commit into
feat/v0.8.4from
feat/cache-stability-deterministic-tool-order
May 2, 2026
Merged

fix(cache): make tool catalog byte-stable across calls and sessions (#263)#284
Hmbown merged 1 commit into
feat/v0.8.4from
feat/cache-stability-deterministic-tool-order

Conversation

@Hmbown

@Hmbown Hmbown commented May 2, 2026

Copy link
Copy Markdown
Owner

Summary

The biggest single cache-prefix-stability win surfaced by the side-by-side comparison with reference-cc (Claude Code source). Refs #263.

The DeepSeek KV cache hits on the longest matching byte prefix of the request. Two places in the tool-array path were silently busting it:

# What Before After
1 `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly. `HashMap` uses `RandomState` per process → every `deepseek` launch produced a different tool order, so cross-session prefix cache never hit. Sort by tool name before emit. Stable across calls AND across launches.
2 `build_model_tool_catalog()` took caller-supplied Vecs as-is. Sort each partition; built-ins stay as a contiguous prefix, MCP tools follow.
3 `active_tool_list_from_catalog()` filtered the catalog Vec in catalog order. ToolSearch activating a previously-deferred tool mid-conversation shifted every later tool's byte offset, busting the cache from there. Two-pass: always-loaded tools head, deferred-but-now-active tools tail.

This mirrors how reference-cc's `assembleToolPool` (`reference-cc/src/tools.ts:345-367`) handles the same problem. Their comment on it: "Sort each partition for prompt-cache stability, keeping built-ins as a contiguous prefix. … a flat sort would interleave MCP tools into built-ins and invalidate all downstream cache keys whenever an MCP tool sorts between existing built-ins."

Regression tests added

  • `to_api_tools_emits_alphabetical_order_regardless_of_registration_order` — register the same tools in two different orders, assert the API output is identical
  • `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability` — pin builtins-then-MCP-each-alphabetical
  • `active_tool_list_pushes_deferred_activations_to_the_tail` — pin that deferred-tool activation doesn't shift earlier tools

What this does NOT fix (separate follow-ups, in priority order)

  1. Working-set summary block churn (working_set::summary_block churns the cache prefix every turn ( / interpolation) #280) — `age` and `touches` interpolation in the system prompt. Filed; ready for a focused PR.
  2. Handoff block placement — volatile content in the middle of the system prompt. (Finding 6 in the audit; not yet filed as an issue.)
  3. Tool schema cache — `tool.description()` re-invoked every turn; OK today (descriptions are `&'static str`) but no guard against future MCP-reconnect or conditional descriptions. (Finding 4.)

Test plan

  • `cargo test -p deepseek-tui --bin deepseek-tui --locked` (1741/1743 — 2 ignored unrelated)
  • `cargo fmt --all -- --check`
  • `cargo clippy -p deepseek-tui --all-targets --locked -- -D warnings`

🤖 Generated with Claude Code

DeepSeek's KV prefix cache hits on the longest matching byte prefix of
the request. Two places in the tool-array path were silently introducing
divergence:

1. `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly.
   Rust's default `HashMap` is seeded with `RandomState` per process, so
   every `deepseek` launch produced a different tool order — the cross-
   session resume case (the one with the biggest cache wins) never hit.

2. `active_tool_list_from_catalog()` filtered the catalog `Vec` by the
   active set in catalog order. When ToolSearch activated a previously-
   deferred tool mid-conversation, the new tool appeared at its catalog
   index, shifting every later tool's byte offset and busting the cached
   prefix from there onwards.

Fixes:

- `to_api_tools()` now sorts by tool name before emitting the API tool
  array. Stable across calls AND across launches.
- `build_model_tool_catalog()` sorts each partition (built-ins first,
  contiguous; MCP tools after, also alphabetical). Mirrors Claude Code's
  `assembleToolPool` strategy where they explicitly call out cache
  stability as the reason: "a flat sort would interleave MCP tools into
  built-ins and invalidate all downstream cache keys whenever an MCP
  tool sorts between existing built-ins."
- `active_tool_list_from_catalog()` puts always-loaded tools in catalog
  order at the head and deferred-but-now-active tools at the tail. A
  deferred-tool activation during ToolSearch no longer shifts earlier
  tools' positions.

Adds three regression tests:

- `to_api_tools_emits_alphabetical_order_regardless_of_registration_order`
- `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability`
- `active_tool_list_pushes_deferred_activations_to_the_tail`

Refs #263. Findings produced by reading reference Claude Code source
side-by-side with our request-building flow; full delta analysis in
the PR description.
Copilot AI review requested due to automatic review settings May 2, 2026 01:26

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves DeepSeek KV prefix-cache hit rates by making the model-visible tool catalog deterministic (byte-stable) across calls, turns, and process launches, aligning the tool-array behavior with the cache-stability goals tracked in #263.

Changes:

  • Sort ToolRegistry::to_api_tools() output by tool name to avoid HashMap iteration nondeterminism across launches.
  • Sort native vs MCP tool partitions independently when building the per-turn model tool catalog to keep built-ins as a stable contiguous prefix.
  • Stabilize active tool-list ordering by appending newly-activated deferred tools to the tail, preventing mid-turn tool activations from shifting earlier tool offsets.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
crates/tui/src/tools/registry.rs Sorts API tool emission by name and adds a regression test for registration-order independence.
crates/tui/src/core/engine/tool_catalog.rs Sorts native/MCP partitions and makes active tool listing two-pass to preserve prefix stability when deferred tools become active.
crates/tui/src/core/engine/tests.rs Adds regression tests to pin tool-catalog partition ordering and deferred-activation tail behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Hmbown Hmbown merged commit 6ea1b34 into feat/v0.8.4 May 2, 2026
6 checks passed
@Hmbown Hmbown deleted the feat/cache-stability-deterministic-tool-order branch May 2, 2026 01:30

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces deterministic tool ordering to ensure prefix-cache stability for models like DeepSeek. Key changes include sorting tool partitions alphabetically within the catalog and refactoring the active tool list logic to append deferred tools at the tail, preventing byte-offset shifts. Additionally, the tool registry now sorts tools by name when converting to the API format. Review feedback suggests minor performance optimizations, such as pre-allocating vector capacity and using unstable sorting where element stability is not required.

Comment on lines +204 to +205
let mut head: Vec<Tool> = Vec::new();
let mut tail: Vec<Tool> = Vec::new();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To avoid multiple reallocations as the vectors grow, you can pre-allocate capacity based on the catalog size. Since the total number of active tools won't exceed the catalog size, this is a safe optimization.

Suggested change
let mut head: Vec<Tool> = Vec::new();
let mut tail: Vec<Tool> = Vec::new();
let mut head: Vec<Tool> = Vec::with_capacity(catalog.len());
let mut tail: Vec<Tool> = Vec::new();

Comment on lines +145 to +146
let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect();
tools.sort_by(|a, b| a.name().cmp(b.name()));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since tool names are unique within the registry, you can use sort_unstable_by for a slight performance gain over sort_by, as stability is not required when there are no duplicate keys.

Suggested change
let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect();
tools.sort_by(|a, b| a.name().cmp(b.name()));
let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect();
tools.sort_unstable_by(|a, b| a.name().cmp(b.name()));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants