Add DeepSeek cache-aware prompt design and /cache inspect diagnostics #1196
wplll wants to merge 5 commits into
Code Review
This PR implements a "Project Context Pack" feature that creates a deterministic workspace summary for the system prompt, alongside new /cache inspect and /cache warmup debug commands for analyzing prompt stability and priming provider caches. It also enhances the TUI footer with detailed cache telemetry. Feedback recommends expanding the ignored directory list (e.g., target, .vscode) and increasing the variety of recognized configuration and source file extensions to improve the context pack's efficiency and coverage.
Tool result budget and deduplication
Adds wire-only budgeting for tool result messages sent to DeepSeek. Large tool outputs are now compacted before they enter the rendered API request while preserving the full output in the local UI/session state.
By default, a single tool result keeps up to 12,000 chars, including:
Repeated identical tool results are deduplicated in the rendered wire history using a compact stable reference instead of resending the full output. This reduces cache miss pressure from large repeated dynamic tool messages without deleting local logs or changing the visible transcript.
/cache inspect now also reports tool result budget metadata:
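The budgeting and deduplication described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `compact_for_wire`, `WireToolResults`, the head/tail split, the omission marker, and the reference format are all hypothetical, and only the 12,000-char default comes from the comment.

```rust
use std::collections::HashMap;

/// Default budget, taken from the 12,000-char figure in the comment above.
const TOOL_RESULT_BUDGET_CHARS: usize = 12_000;

/// FNV-1a 64-bit hash: small, deterministic, and dependency-free.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Keep at most `budget` chars of the output, split between head and tail.
/// Assumption: the omission marker itself is not counted against the budget.
fn compact_for_wire(output: &str, budget: usize) -> String {
    let total = output.chars().count();
    if total <= budget {
        return output.to_string();
    }
    let head_len = budget / 2;
    let tail_len = budget - head_len;
    let head: String = output.chars().take(head_len).collect();
    let tail: String = output.chars().skip(total - tail_len).collect();
    format!("{head}\n[... {} chars omitted ...]\n{tail}", total - budget)
}

/// Hypothetical wire renderer. The caller keeps the full output locally;
/// only the string returned here would go into the API request.
struct WireToolResults {
    /// Content hash -> index of the first tool result with that content.
    seen: HashMap<u64, usize>,
    budget: usize,
}

impl WireToolResults {
    fn new(budget: usize) -> Self {
        Self { seen: HashMap::new(), budget }
    }

    /// First occurrence: compacted output. Repeat: a short stable reference.
    /// A real implementation would also guard against hash collisions.
    fn render(&mut self, index: usize, full_output: &str) -> String {
        let hash = fnv1a64(full_output.as_bytes());
        if let Some(first) = self.seen.get(&hash) {
            return format!("[same as tool result #{first}, hash {hash:016x}]");
        }
        self.seen.insert(hash, index);
        compact_for_wire(full_output, self.budget)
    }
}
```

Because the reference text depends only on the content hash and the index of the first occurrence, it renders identically on every later turn, which is what keeps the wire history prefix stable.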
Turn metadata deduplication
Adds wire-only deduplication for repeated turn metadata blocks. The first rendered block is sent in full. If the metadata changes, the full block is sent again and becomes the new comparison point.
This keeps repeated per-turn metadata from inflating multi-turn request payloads while preserving:
/cache inspect now reports turn metadata diagnostics:
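The comparison-point behavior for turn metadata can be pictured with a small tracker like the one below. It is a sketch under assumptions: the `TurnMetaDedup` type is hypothetical, and it simply omits a repeated block, whereas the PR may replace it with a short marker instead.

```rust
/// Hypothetical wire-side tracker for per-turn metadata blocks.
#[derive(Default)]
struct TurnMetaDedup {
    /// The last block that was sent in full (the comparison point).
    last_sent: Option<String>,
}

impl TurnMetaDedup {
    /// Returns `Some(block)` when the block must be sent in full, and `None`
    /// when it is identical to the comparison point and can be left off the wire.
    fn render<'a>(&mut self, block: &'a str) -> Option<&'a str> {
        if self.last_sent.as_deref() == Some(block) {
            return None;
        }
        // New or changed metadata: send it and make it the comparison point.
        self.last_sent = Some(block.to_string());
        Some(block)
    }
}
```

Only the most recently sent block is remembered, so metadata that changes and later changes back is sent in full again each time it differs from the previous turn.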
Additional automated verification:
Commands run:
cargo fmt --check
cargo check
cargo clippy --workspace --all-targets --all-features
cargo test -p deepseek-tui turn_meta
cargo test -p deepseek-tui cache_inspect
cargo test
This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current
Summary
This PR adds DeepSeek prompt cache awareness to the TUI and introduces a new /cache inspect command for diagnosing cache-related prompt structure.
The main goal is to make DeepSeek context caching easier to reason about by separating stable reusable prompt prefixes from session history and dynamic request content.
Changes
DeepSeek cache-aware prompt structure
Cache usage metrics
prompt_cache_hit_tokens
prompt_cache_miss_tokens
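As an illustration of how these two counters could be turned into a readable metric, here is a minimal sketch. The `CacheUsage` struct, `hit_ratio`, and the `footer_label` format are hypothetical. Only the two field names come from the list above.

```rust
/// Subset of a usage report. The two field names come from the list above.
#[derive(Debug, Clone, Copy)]
struct CacheUsage {
    prompt_cache_hit_tokens: u64,
    prompt_cache_miss_tokens: u64,
}

impl CacheUsage {
    /// Fraction of prompt tokens served from cache, or `None` when the
    /// provider reported no prompt tokens at all.
    fn hit_ratio(&self) -> Option<f64> {
        let total = self.prompt_cache_hit_tokens + self.prompt_cache_miss_tokens;
        if total == 0 {
            None
        } else {
            Some(self.prompt_cache_hit_tokens as f64 / total as f64)
        }
    }
}

/// Hypothetical one-line label; the actual footer format is not specified here.
fn footer_label(usage: &CacheUsage) -> String {
    match usage.hit_ratio() {
        Some(ratio) => format!(
            "cache {:.0}% ({} hit / {} miss)",
            ratio * 100.0,
            usage.prompt_cache_hit_tokens,
            usage.prompt_cache_miss_tokens
        ),
        None => "cache n/a".to_string(),
    }
}
```

Returning `None` for an empty report avoids a division by zero and lets the caller distinguish "no data" from a genuine 0% hit rate.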
/cache inspect
/cache inspect command to inspect the rendered prompt structure without printing the full prompt text.
Base static prefix hash
Full request prefix hash
static
history
dynamic
Safety and privacy
/cache inspect does not print full prompt contents by default.
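To make the static / history / dynamic split and the two hashes concrete, the sketch below shows one way such a report could be computed while printing only hashes and sizes. It rests on assumptions: `RenderedPrompt`, `CacheInspectReport`, and `inspect` are hypothetical names, FNV-1a stands in for whatever hash the PR uses, and the full request hash is assumed to cover all three segments in order.

```rust
use std::fmt;

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Streaming FNV-1a: feed more bytes into an existing hash state.
fn fnv1a64_update(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical three-way split of a rendered request.
struct RenderedPrompt {
    static_prefix: String,
    history: String,
    dynamic: String,
}

/// Hashes and sizes only; the prompt text itself is never stored here.
struct CacheInspectReport {
    base_static_prefix_hash: u64,
    full_request_prefix_hash: u64,
    static_chars: usize,
    history_chars: usize,
    dynamic_chars: usize,
}

fn inspect(prompt: &RenderedPrompt) -> CacheInspectReport {
    // Hash the static prefix alone, then keep feeding the later segments so
    // the full hash equals the hash of static + history + dynamic.
    let base = fnv1a64_update(FNV_OFFSET, prompt.static_prefix.as_bytes());
    let with_history = fnv1a64_update(base, prompt.history.as_bytes());
    let full = fnv1a64_update(with_history, prompt.dynamic.as_bytes());
    CacheInspectReport {
        base_static_prefix_hash: base,
        full_request_prefix_hash: full,
        static_chars: prompt.static_prefix.chars().count(),
        history_chars: prompt.history.chars().count(),
        dynamic_chars: prompt.dynamic.chars().count(),
    }
}

impl fmt::Display for CacheInspectReport {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "Base static prefix hash: {:016x}", self.base_static_prefix_hash)?;
        writeln!(f, "Full request prefix hash: {:016x}", self.full_request_prefix_hash)?;
        write!(
            f,
            "static: {} chars, history: {} chars, dynamic: {} chars",
            self.static_chars, self.history_chars, self.dynamic_chars
        )
    }
}
```

Because the report holds only digests and lengths, formatting it cannot leak prompt contents, which matches the privacy goal stated above.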
Motivation
DeepSeek context caching can significantly reduce input cost when requests share a stable prefix. However, before this change it was difficult to tell whether cache misses were caused by changes in the reusable static prefix or by normal conversation history growth.
This PR adds both the cache-aware prompt design and the inspection tooling needed to debug cache behavior in real sessions.
In particular, /cache inspect makes it possible to verify that the static base prefix remains stable across turns while allowing the full rendered request to change as history, tool results, and user inputs evolve.
Example behavior
Across multiple turns in the same session, /cache inspect can now show:
Base static prefix hash
Full request prefix hash
Static base prefix stability: OK
This makes it easier to confirm that the reusable DeepSeek cache prefix is stable even when the full request changes.
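The stability line can be thought of as a comparison between the current base hash and the one recorded on the previous turn. The sketch below is illustrative only: `StaticPrefixTracker`, the `Stability` enum, and every status string other than "Static base prefix stability: OK" are hypothetical.

```rust
/// Outcome of comparing this turn's base hash with the previous turn's.
#[derive(Debug, PartialEq, Eq)]
enum Stability {
    FirstTurn,
    Stable,
    Changed { previous: u64, current: u64 },
}

/// Hypothetical per-session tracker for the base static prefix hash.
#[derive(Default)]
struct StaticPrefixTracker {
    previous: Option<u64>,
}

impl StaticPrefixTracker {
    /// Record this turn's hash and report how it compares to the last one.
    fn observe(&mut self, base_static_prefix_hash: u64) -> Stability {
        let status = match self.previous {
            None => Stability::FirstTurn,
            Some(prev) if prev == base_static_prefix_hash => Stability::Stable,
            Some(prev) => Stability::Changed {
                previous: prev,
                current: base_static_prefix_hash,
            },
        };
        self.previous = Some(base_static_prefix_hash);
        status
    }
}

/// Only the "OK" wording appears in the PR text; the other lines are assumed.
fn status_line(status: &Stability) -> String {
    match status {
        Stability::FirstTurn => "Static base prefix stability: n/a (first turn)".to_string(),
        Stability::Stable => "Static base prefix stability: OK".to_string(),
        Stability::Changed { previous, current } => format!(
            "Static base prefix stability: CHANGED ({previous:016x} -> {current:016x})"
        ),
    }
}
```

A changed status points at the static prefix itself (for example, a reordered system prompt) rather than at normal history growth, which only affects the full request hash.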
Testing
Manual verification performed:
/cache inspect after each request.
Base static prefix hash remains stable across turns.
Full request prefix hash changes as history grows.
/cache inspect.