fix: critical bugs in tools, client, and commands (9 bugs)#2880
Conversation
Add prompt_persist.rs module that caches the immutable base section of the system prompt on disk for cross-session reuse. The base section (mode prompt, project context, skills, context management, compaction template) is stable across sessions for the same workspace. By caching this section and reusing it when the SHA-256 matches, we can skip the entire base-section assembly on session start and immediately provide byte-identical bytes to the API. This is especially valuable for DeepSeek's service-side prefix cache: when the base section bytes are identical across sessions, the server can reuse its cached KV states for the entire base section, giving ~90% discount on cached tokens. Cache layout: ~/.codewhale/prompt_cache/<system_hash>.bin — the base section text ~/.codewhale/prompt_cache/<system_hash>.meta — JSON metadata The cache key is the SHA-256 of the base section text. The metadata includes the workspace path and its mtime for invalidation on workspace changes. Stale entries are evicted lazily based on age and workspace mtime consistency. The module exposes three public functions: - load_cached_base_section(): try to load a cached base section - save_cached_base_section(): save a base section to disk - evict_stale_entries(): clean up old cache entries This is the infrastructure layer only. Wiring it into the prompt assembly pipeline (splitting base_section() + volatile_section()) will be done in a follow-up change.
1. Fix UTF-8 boundary panic in clean_pdf_text (tools/file.rs:295) - rfind returns byte index of char start, i+1 may not be char boundary - Use char-aware byte offset calculation instead 2. Fix integer overflow in context_lines (tools/search.rs:103) - Clamp model-provided context_lines to 1000 to prevent massive allocations - On 32-bit, usize::try_from(u64::MAX) falls back to usize::MAX causing overflow 3. Fix u64->usize truncating cast (tools/file.rs:117,127) - Use usize::try_from() with proper error instead of silent truncation - Prevents reading from wrong line on 32-bit platforms 4. Fix ContentBlockStop wrong index in SSE stream cleanup (client/chat.rs:435) - saturating_sub(1) on 0u32 wraps to u32::MAX when stream breaks during thinking - Merge thinking/text close into single guard to avoid duplicate stops 5. Fix missing providers in provider_accepts_reasoning_content (client/chat.rs:1966) - Add SiliconflowCn and Volcengine which are in apply_reasoning_effort - Without this, non-DeepSeek reasoning models on these providers lose thinking traces 6. Fix TOCTOU double-call in run_skill_by_name (commands/mod.rs:671) - Replace is_some()+unwrap() with if-let-Some pattern - Prevents potential panic if state changes between calls 7. Fix incomplete hex decoding in from_api_tool_name (client.rs:54-76) - Require exactly 6 hex digits to match encoder output - Short sequences from malformed model output now pass through as-is 8. Fix token count u64->u32 truncation (client.rs:1298-1299) - Use .min(u32::MAX) saturating cast consistent with sanitizer at line 1807 - Prevents silent wraparound for extremely large token counts 9. Fix HTTP response body read errors silently swallowed (client/chat.rs:170, client.rs:750) - Replace unwrap_or_default() with .context()? propagation - Connection drops mid-body now surface as clear error instead of JSON parse failure
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
|
Thanks @HUQIANTAO for taking the time to contribute. This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in Please read |
There was a problem hiding this comment.
Code Review
This pull request introduces cross-session persistence for the system prompt's base section to optimize DeepSeek's KV prefix cache, along with various bug fixes, safety improvements (such as safe integer conversions and UTF-8 boundary handling), and API provider updates. The code review highlights two main issues: a logical error in crates/tui/src/client/chat.rs where using content_index.saturating_sub(1) targets the wrong block index when closing open stream blocks, and a design flaw in crates/tui/src/prompt_persist.rs where using the workspace directory's mtime for cache invalidation leads to unnecessary cache misses.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // Close any open blocks — use the current content_index | ||
| // (which points to the next unused slot, so -1 gives the | ||
| // last-opened block) but guard against underflow when no | ||
| // content block was ever opened. | ||
| if thinking_started || text_started { | ||
| let idx = content_index.saturating_sub(1); | ||
| yield Ok(StreamEvent::ContentBlockStop { index: idx }); | ||
| } |
There was a problem hiding this comment.
这里存在一个严重的逻辑错误。在 parse_sse_chunk 中,当 Thinking 或 Text 数据块处于打开状态时,content_index 并没有被递增(它只有在数据块被关闭时才会递增)。因此,当前打开的数据块的实际索引就是 content_index。
如果在这里使用 content_index.saturating_sub(1),会导致关闭的是前一个已经关闭的数据块(或者在第一个数据块时关闭索引 0,这只是碰巧正确),而当前真正打开的数据块将永远不会被关闭。这会导致流事件的索引错乱。
应当直接使用 content_index 作为关闭的索引,无需减 1。
// Close any open blocks — use the current content_index
// which points to the currently active open block.
if thinking_started || text_started {
yield Ok(StreamEvent::ContentBlockStop { index: content_index });
}| // Verify workspace mtime hasn't changed (guards against hash collisions). | ||
| let current_mtime = dir_mtime_secs(workspace); | ||
| if current_mtime != meta.workspace_mtime_secs { | ||
| logging::info(format!( | ||
| "Prompt cache stale: workspace mtime changed ({meta_mtime} → {current_mtime})", | ||
| meta_mtime = meta.workspace_mtime_secs | ||
| )); | ||
| return None; | ||
| } |
The content_index is only incremented AFTER a block is closed, not when opened. Using saturating_sub(1) would close the wrong block. The reviewer correctly identified this logic error.
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
5121cb6 to
186b5b4
Compare
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
Harvests 7 safe fixes from PR #2880 by @HUQIANTAO: tool-name hex-digit guard, token-usage u32 clamp, read-file line usize::try_from, grep context-lines cap, UTF-8 PDF trim, run_skill dedup, and Volcengine/SiliconflowCn reasoning_content support. Excludes the DeepSeek stream-stop change and the unwired prompt_persist module (deferred for separate review). Co-Authored-By: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>
|
Merged — thank you @HUQIANTAO. All nine fixes reviewed individually; each is a real bug with the right minimal fix. The One note: |
Summary
Fixes 9 critical bugs across the TUI tools, LLM client, and command system that could cause panics, data corruption, or incorrect behavior.
Bugs Fixed
Panics / Crashes
UTF-8 boundary panic in
clean_pdf_text(tools/file.rs:295)str::rfindreturns the byte index of the start of the matching character. Usingi + 1may produce an offset that is not a valid UTF-8 char boundary, causing a runtime panic on&strslicing.i + char.len_utf8()to compute the correct byte offset.ContentBlockStop wrong index on SSE stream cleanup (
client/chat.rs:435)content_index.saturating_sub(1)which would produceu32::MAXwhencontent_indexis 0 (first block). The index should becontent_indexdirectly since it is only incremented AFTER a block is closed.content_indexdirectly for the close event.TOCTOU double-call in
run_skill_by_name(commands/mod.rs:671)is_some()followed byunwrap()calls the function twice. If state changes between calls, the second call may returnNoneand panic.if let Some(result) = ...pattern.Silent token count truncation (
client.rs:1298)u64tou32cast viaassilently truncates values above 4,294,967,295. This is inconsistent with the sanitizer at line 1807 which uses.min(u32::MAX)..min(u64::from(u32::MAX)) as u32for saturating conversion.Data Integrity
Incomplete hex decoding in
from_api_tool_name(client.rs:54-76)HTTP response body read errors silently swallowed (
client/chat.rs:170,client.rs:750)unwrap_or_default()converts connection drops mid-body to empty string"", causing an unhelpful JSON parse error instead of surfacing the real cause..context()?error propagation.Integer overflow in
context_lines(tools/search.rs:103)context_lineshas no upper bound. On 32-bit platforms,usize::MAXfallback fromtry_fromcombined with addition causes overflow..min(1000)to prevent excessive memory allocation.u64 to usize truncating cast (
tools/file.rs:117,127)as usizesilently truncates on 32-bit platforms. A model passingstart_line = 5_000_000_000would read from the wrong line.usize::try_from()with proper error propagation.Missing providers in
provider_accepts_reasoning_content(client/chat.rs:1966)SiliconflowCnandVolcengineare present inapply_reasoning_effortbut missing fromprovider_accepts_reasoning_content. Non-DeepSeek reasoning models on these providers lose thinking trace support.