fix(cache): make tool catalog byte-stable across calls and sessions (#263)#284
Conversation
DeepSeek's KV prefix cache hits on the longest matching byte prefix of the request. Two places in the tool-array path were silently introducing divergence: 1. `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly. Rust's default `HashMap` is seeded with `RandomState` per process, so every `deepseek` launch produced a different tool order — the cross- session resume case (the one with the biggest cache wins) never hit. 2. `active_tool_list_from_catalog()` filtered the catalog `Vec` by the active set in catalog order. When ToolSearch activated a previously- deferred tool mid-conversation, the new tool appeared at its catalog index, shifting every later tool's byte offset and busting the cached prefix from there onwards. Fixes: - `to_api_tools()` now sorts by tool name before emitting the API tool array. Stable across calls AND across launches. - `build_model_tool_catalog()` sorts each partition (built-ins first, contiguous; MCP tools after, also alphabetical). Mirrors Claude Code's `assembleToolPool` strategy where they explicitly call out cache stability as the reason: "a flat sort would interleave MCP tools into built-ins and invalidate all downstream cache keys whenever an MCP tool sorts between existing built-ins." - `active_tool_list_from_catalog()` puts always-loaded tools in catalog order at the head and deferred-but-now-active tools at the tail. A deferred-tool activation during ToolSearch no longer shifts earlier tools' positions. Adds three regression tests: - `to_api_tools_emits_alphabetical_order_regardless_of_registration_order` - `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability` - `active_tool_list_pushes_deferred_activations_to_the_tail` Refs #263. Findings produced by reading reference Claude Code source side-by-side with our request-building flow; full delta analysis in the PR description.
There was a problem hiding this comment.
Pull request overview
This PR improves DeepSeek KV prefix-cache hit rates by making the model-visible tool catalog deterministic (byte-stable) across calls, turns, and process launches, aligning the tool-array behavior with the cache-stability goals tracked in #263.
Changes:
- Sort
ToolRegistry::to_api_tools()output by tool name to avoidHashMapiteration nondeterminism across launches. - Sort native vs MCP tool partitions independently when building the per-turn model tool catalog to keep built-ins as a stable contiguous prefix.
- Stabilize active tool-list ordering by appending newly-activated deferred tools to the tail, preventing mid-turn tool activations from shifting earlier tool offsets.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
crates/tui/src/tools/registry.rs |
Sorts API tool emission by name and adds a regression test for registration-order independence. |
crates/tui/src/core/engine/tool_catalog.rs |
Sorts native/MCP partitions and makes active tool listing two-pass to preserve prefix stability when deferred tools become active. |
crates/tui/src/core/engine/tests.rs |
Adds regression tests to pin tool-catalog partition ordering and deferred-activation tail behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
This pull request introduces deterministic tool ordering to ensure prefix-cache stability for models like DeepSeek. Key changes include sorting tool partitions alphabetically within the catalog and refactoring the active tool list logic to append deferred tools at the tail, preventing byte-offset shifts. Additionally, the tool registry now sorts tools by name when converting to the API format. Review feedback suggests minor performance optimizations, such as pre-allocating vector capacity and using unstable sorting where element stability is not required.
| let mut head: Vec<Tool> = Vec::new(); | ||
| let mut tail: Vec<Tool> = Vec::new(); |
There was a problem hiding this comment.
To avoid multiple reallocations as the vectors grow, you can pre-allocate capacity based on the catalog size. Since the total number of active tools won't exceed the catalog size, this is a safe optimization.
| let mut head: Vec<Tool> = Vec::new(); | |
| let mut tail: Vec<Tool> = Vec::new(); | |
| let mut head: Vec<Tool> = Vec::with_capacity(catalog.len()); | |
| let mut tail: Vec<Tool> = Vec::new(); |
| let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect(); | ||
| tools.sort_by(|a, b| a.name().cmp(b.name())); |
There was a problem hiding this comment.
Since tool names are unique within the registry, you can use sort_unstable_by for a slight performance gain over sort_by, as stability is not required when there are no duplicate keys.
| let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect(); | |
| tools.sort_by(|a, b| a.name().cmp(b.name())); | |
| let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect(); | |
| tools.sort_unstable_by(|a, b| a.name().cmp(b.name())); |
Summary
The biggest single cache-prefix-stability win surfaced by the side-by-side comparison with reference-cc (Claude Code source). Refs #263.
The DeepSeek KV cache hits on the longest matching byte prefix of the request. Two places in the tool-array path were silently busting it:
This mirrors how reference-cc's `assembleToolPool` (`reference-cc/src/tools.ts:345-367`) handles the same problem. Their comment on it: "Sort each partition for prompt-cache stability, keeping built-ins as a contiguous prefix. … a flat sort would interleave MCP tools into built-ins and invalidate all downstream cache keys whenever an MCP tool sorts between existing built-ins."
Regression tests added
What this does NOT fix (separate follow-ups, in priority order)
Test plan
🤖 Generated with Claude Code