Add DeepSeek cache-aware prompt design and /cache inspect diagnostics #1196
wplll wants to merge 5 commits into Hmbown:main
Code Review
This PR implements a "Project Context Pack" feature that creates a deterministic workspace summary for the system prompt, alongside new /cache inspect and /cache warmup debug commands for analyzing prompt stability and priming provider caches. It also enhances the TUI footer with detailed cache telemetry. Feedback recommends expanding the ignored directory list (e.g., target, .vscode) and increasing the variety of recognized configuration and source file extensions to improve the context pack's efficiency and coverage.
Tool result budget and deduplication
Adds wire-only budgeting for tool result messages sent to DeepSeek. Large tool outputs are now compacted before they enter the rendered API request, while the full output is preserved in the local UI/session state. By default, a single tool result keeps up to 12,000 chars, including:
Repeated identical tool results are deduplicated in the rendered wire history using a compact stable reference instead of resending the full output. This reduces cache miss pressure from large repeated dynamic tool messages without deleting local logs or changing the visible transcript.
/cache inspect now also reports tool result budget metadata:
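As a rough illustration of the budgeting and deduplication idea described above, the following Rust sketch compacts oversized outputs and replaces repeats with a short reference. It is not the PR's code: `compact`, `render_wire`, the head/tail split, and the marker and reference strings are all hypothetical. Only the 12,000-char default comes from the description.

```rust
use std::collections::HashMap;

/// Default per-result budget mentioned in the PR description.
pub const MAX_TOOL_RESULT_CHARS: usize = 12_000;

/// Compact an oversized tool output by keeping its head and tail.
/// The head/tail split and the marker text are assumptions; the marker
/// itself is added on top of the budgeted characters.
pub fn compact(output: &str, budget: usize) -> String {
    let total = output.chars().count();
    if total <= budget {
        return output.to_string();
    }
    let head_len = budget / 2;
    let tail_len = budget - head_len;
    let head: String = output.chars().take(head_len).collect();
    let tail: String = output.chars().skip(total - tail_len).collect();
    format!("{head}\n[... {} chars omitted ...]\n{tail}", total - budget)
}

/// Build the wire-only view of a list of tool outputs.
/// The first occurrence of an output is sent (compacted if needed);
/// later identical outputs become a short reference to the first one.
/// The input slice is never modified, so local session state stays intact.
pub fn render_wire(outputs: &[String], budget: usize) -> Vec<String> {
    let mut seen: HashMap<&str, usize> = HashMap::new();
    outputs
        .iter()
        .enumerate()
        .map(|(index, out)| {
            let earlier = seen.get(out.as_str()).copied();
            match earlier {
                Some(first) => format!("[same as tool result #{first}]"),
                None => {
                    seen.insert(out.as_str(), index);
                    compact(out, budget)
                }
            }
        })
        .collect()
}
```

Keying the map on the full output text rather than on a hash avoids treating two different outputs as duplicates; a real implementation might trade that for a digest to save memory.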
Turn metadata deduplication
Adds wire-only deduplication for repeated <turn_meta> blocks. The first rendered <turn_meta> block is sent in full. If the metadata changes, the full block is sent again and becomes the new comparison point. This keeps repeated per-turn metadata from inflating multi-turn request payloads while preserving:
/cache inspect now reports turn metadata diagnostics:
Additional automated verification:
Commands run:
cargo fmt --check
cargo check
cargo clippy --workspace --all-targets --all-features
cargo test -p deepseek-tui turn_meta
cargo test -p deepseek-tui cache_inspect
cargo test
…tion (Hmbown#1196)
Merge of PR Hmbown#1196 by wplll. Adds:
Cache-aware prompt layering:
- PromptBuilder struct separates prompt construction from inspection
- System prompt split into named layers with stability classification
- Layers classified as static/history/dynamic for cache debugging
/cache inspect command:
- SHA-256 hashes of each rendered prompt layer
- Base static prefix hash vs full request prefix hash
- Static prefix stability status across turns
- First-divergence tracking from previous request
Wire payload optimization:
- Tool result budget: large outputs compacted before API request
- Tool result dedup: repeated outputs replaced by compact refs
- Turn metadata dedup: repeated <turn_meta> blocks deduplicated
- Wire-only: local session messages remain unchanged
Project context pack:
- Deterministic workspace summary injected into stable prefix
- Configurable via [context] project_pack = false
Cache warmup and improved footer cache display.
Thanks to wplll for the contribution.
…fault
CHANGELOG additions:
- Top-line credit summary: wplll, Liu-Vince, Giggitycountless, SamhandsomeLee, barjatiyasaurabh, tyculw, hongyuatcufe, ljlbit.
- New "Added" section properly documenting Hmbown#1196 (cache-aware diagnostics, /cache inspect, /cache warmup, payload optimization, Project Context Pack). Calls out that the Pack is default-on, adds ~1–10 KB to every prompt, and how to opt out via [context] project_pack = false.
- Per-item issue reporter credits across the Fixed section.
- Removed Hmbown#1129 from the i18n entry; that's a separate bug we did not actually fix (wrong env var name in HTTP system prompt).
README updates: rewrote the "What's New" section in both README.md and README.zh-CN.md to v0.8.24 with all the same credits and the project_pack opt-out note.
Summary
This PR adds DeepSeek prompt cache awareness to the TUI and introduces a new /cache inspect command for diagnosing cache-related prompt structure.
The main goal is to make DeepSeek context caching easier to reason about by separating stable reusable prompt prefixes from session history and dynamic request content.
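One way to picture that separation is a list of named prompt layers, each tagged with how often it is expected to change, rendered with the most stable layers first. The sketch below is illustrative only: `Stability`, `PromptLayer`, `order_for_cache`, and `static_prefix` are hypothetical names and do not reflect the PR's actual PromptBuilder API.

```rust
/// How often a prompt layer is expected to change.
/// Variant order matters: derived `Ord` sorts Static < History < Dynamic.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stability {
    /// Same on every request, e.g. system rules.
    Static,
    /// Conversation so far; grows as turns are added.
    History,
    /// Differs per request, e.g. the latest user input.
    Dynamic,
}

/// A named piece of the rendered prompt (hypothetical shape).
#[derive(Debug, Clone)]
pub struct PromptLayer {
    pub name: &'static str,
    pub stability: Stability,
    pub text: String,
}

/// Put the most stable layers first so consecutive requests share the
/// longest possible prefix. `sort_by_key` is a stable sort, so layers of
/// the same class keep their relative order.
pub fn order_for_cache(mut layers: Vec<PromptLayer>) -> Vec<PromptLayer> {
    layers.sort_by_key(|layer| layer.stability);
    layers
}

/// Join only the static layers: the part a prefix cache can reuse as-is.
pub fn static_prefix(layers: &[PromptLayer]) -> String {
    layers
        .iter()
        .filter(|layer| layer.stability == Stability::Static)
        .map(|layer| layer.text.as_str())
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because prefix caching only helps while the leading part of the request is unchanged, any dynamic content placed ahead of static content would shorten the reusable prefix, which is why the ordering step comes before rendering in this sketch.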
Changes
DeepSeek cache-aware prompt structure
Cache usage metrics
- prompt_cache_hit_tokens
- prompt_cache_miss_tokens
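To show how these two counters could be turned into a single figure, here is a small sketch that computes a cache hit ratio and a footer-style label. The `CacheUsage` struct, `hit_ratio`, `footer_label`, and the label format are assumptions made for illustration; only the two field names come from the PR.

```rust
/// Minimal usage record holding the two cache counters named above.
#[derive(Debug, Clone, Copy, Default)]
pub struct CacheUsage {
    pub prompt_cache_hit_tokens: u64,
    pub prompt_cache_miss_tokens: u64,
}

impl CacheUsage {
    /// Fraction of prompt tokens served from cache.
    /// Returns None when both counters are zero, to avoid dividing by zero.
    pub fn hit_ratio(&self) -> Option<f64> {
        let total = self.prompt_cache_hit_tokens + self.prompt_cache_miss_tokens;
        if total == 0 {
            None
        } else {
            Some(self.prompt_cache_hit_tokens as f64 / total as f64)
        }
    }
}

/// Render a short status string (hypothetical format, not the TUI's).
pub fn footer_label(usage: &CacheUsage) -> String {
    match usage.hit_ratio() {
        Some(ratio) => format!(
            "cache {:.0}% ({} hit / {} miss)",
            ratio * 100.0,
            usage.prompt_cache_hit_tokens,
            usage.prompt_cache_miss_tokens
        ),
        None => "cache n/a".to_string(),
    }
}
```

For example, 750 hit tokens and 250 miss tokens give a ratio of 0.75, which this sketch would label as "cache 75% (750 hit / 250 miss)".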
/cache inspect
- /cache inspect command to inspect the rendered prompt structure without printing the full prompt text.
- Base static prefix hash
- Full request prefix hash
- Layer classes: static, history, dynamic
Safety and privacy
/cache inspect does not print full prompt contents by default.
Motivation
DeepSeek context caching can significantly reduce input cost when requests share a stable prefix. However, before this change it was difficult to tell whether cache misses were caused by changes in the reusable static prefix or by normal conversation history growth.
This PR adds both the cache-aware prompt design and the inspection tooling needed to debug cache behavior in real sessions.
In particular, /cache inspect makes it possible to verify that the static base prefix remains stable across turns while allowing the full rendered request to change as history, tool results, and user inputs evolve.
Example behavior
Across multiple turns in the same session, /cache inspect can now show:
- Base static prefix hash
- Full request prefix hash
- Static base prefix stability: OK
This makes it easier to confirm that the reusable DeepSeek cache prefix is stable even when the full request changes.
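The stability check described here can be sketched as comparing fingerprints between consecutive requests. This is an assumption-laden illustration, not the PR's implementation: the PR reports SHA-256 hashes, whereas this sketch uses the standard library's `DefaultHasher` purely to stay dependency-free (its output is not guaranteed to be stable across Rust releases). `Snapshot`, `snapshot`, `static_prefix_status`, `first_divergence`, and the "CHANGED" and "first request" labels are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in fingerprint; a real tool would use SHA-256 as the PR does.
pub fn fingerprint(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Fingerprints recorded for one request (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Snapshot {
    pub base_static_prefix: u64,
    pub full_request_prefix: u64,
}

/// Fingerprint the static prefix alone and the full rendered request.
pub fn snapshot(static_prefix: &str, rest_of_request: &str) -> Snapshot {
    let full = format!("{static_prefix}{rest_of_request}");
    Snapshot {
        base_static_prefix: fingerprint(static_prefix),
        full_request_prefix: fingerprint(&full),
    }
}

/// Report whether the static prefix matches the previous request's.
pub fn static_prefix_status(previous: Option<&Snapshot>, current: &Snapshot) -> &'static str {
    match previous {
        None => "first request",
        Some(p) if p.base_static_prefix == current.base_static_prefix => "OK",
        Some(_) => "CHANGED",
    }
}

/// Byte offset where two rendered requests first differ, or None if equal.
pub fn first_divergence(previous: &str, current: &str) -> Option<usize> {
    let common = previous
        .bytes()
        .zip(current.bytes())
        .take_while(|(a, b)| a == b)
        .count();
    if common == previous.len() && common == current.len() {
        None
    } else {
        Some(common)
    }
}
```

In this sketch a growing history changes `full_request_prefix` while `base_static_prefix` stays equal, so the status remains "OK"; editing the system text flips it to "CHANGED".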
Testing
Manual verification performed:
- /cache inspect after each request.
- Base static prefix hash remains stable across turns.
- Full request prefix hash changes as history grows.
- /cache inspect.