Skip to content

fix: UTF-8 safe truncation, deduplicate url_encode, Display impls, pricing fix, model simplification#3

Merged
Hmbown merged 1 commit into
mainfrom
devin/1771993734-upgrades-and-fixes
Feb 25, 2026
Merged

fix: UTF-8 safe truncation, deduplicate url_encode, Display impls, pricing fix, model simplification#3
Hmbown merged 1 commit into
mainfrom
devin/1771993734-upgrades-and-fixes

Conversation

@devin-ai-integration

@devin-ai-integration devin-ai-integration Bot commented Feb 25, 2026

Copy link
Copy Markdown
Contributor

Summary

A batch of small correctness fixes and ergonomic improvements across the codebase:

File Change
src/utils.rs Fix UTF-8 panic in truncate_with_ellipsis — old code sliced raw bytes (s[..n]), which panics on multi-byte chars. Now uses char_indices() to find a safe boundary.
src/utils.rs Deduplicate url_encode — moved identical copies from tools/weather.rs and tools/finance.rs into utils.rs as a shared helper.
src/pricing.rs Fix misleading format_cost — threshold < 0.0001 was displaying "<$0.01" (off by 100×); now displays "<$0.0001".
src/error_taxonomy.rs Add Display + Error impls for ErrorCategory, ErrorSeverity, ErrorEnvelope — makes the error taxonomy usable as proper Rust error types.
src/features.rs Add Display for Stage — enables user-facing formatting without debug output.
src/models.rs Simplify context_window_for_model — collapsed three redundant DeepSeek branches (all returning 128k) into one.
src/models.rs Add PartialEq to Tool — enables equality comparison in tests.
CHANGELOG.md Fix missing [0.3.23] comparison link and update [Unreleased] base to v0.3.23.

Most important things for review

:

  1. truncate_with_ellipsis UTF-8 fix — No test added for multi-byte characters. The fix is correct (uses char_indices), but worth manually verifying the logic: we find the last char boundary <= budget bytes, then slice there. Edge case: if budget is 0, last() returns Noneunwrap_or(0) → empty string + ellipsis.

  2. format_cost string change — Changed from "<$0.01" to "<$0.0001". This is a bug fix (the threshold was always < 0.0001), but could affect any code that pattern-matches on the old string or has UI snapshot tests.

  3. context_window_for_model simplification — Collapsed three branches into one. All DeepSeek models currently use 128k, but if future models need different windows, this will need to be re-expanded. The existing tests still pass.

Testing

  • cargo test --all-features — all tests pass
  • cargo fmt --all -- --check — formatted
  • cargo clippy --all-targets --all-features — 10 pre-existing warnings (unrelated to this PR)

Checklist

  • Updated docs/comments as needed (added doc comments for UTF-8 safety and url_encode)
  • ⚠️ No test added for UTF-8 truncation — the fix is correct but untested
  • Verified TUI behavior manually if UI changes — format_cost string changed; should verify cost display in TUI

Link to Devin run: https://app.devin.ai/sessions/782c14f63c5649128219616998785f8e
Requested by: @Hmbown


Open with Devin

…, fix pricing display, simplify model matching, fix changelog links, add PartialEq for Tool

Co-Authored-By: unknown <>
@devin-ai-integration

Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

Open in Devin Review

@Hmbown Hmbown merged commit afc5ab6 into main Feb 25, 2026
9 checks passed
@Hmbown Hmbown deleted the devin/1771993734-upgrades-and-fixes branch February 25, 2026 04:36
Hmbown added a commit that referenced this pull request Apr 25, 2026
…, fix pricing display, simplify model matching, fix changelog links, add PartialEq for Tool (#3)
Hmbown added a commit that referenced this pull request Apr 28, 2026
Sub-area #3 of the v0.6.6 UI redesign (issue #121).

Introduces `crates/tui/src/tui/widgets/tool_card.rs` — a small,
self-contained vocabulary module that owns:

- `ToolFamily` enum (Read / Patch / Run / Find / Delegate / Fanout /
  Think / Generic) and the verb-glyph + label per family
  (▷ read, ◆ patch, ▶ run, ⌕ find, ◐ delegate, ⋮⋮ fanout, … think)
- `tool_family_for_title` — maps the legacy header titles
  (`"Shell"`, `"Patch"`, `"Workspace"`, `"Search"`, `"Diff"`, `"Image"`)
  to a family, so existing call sites pick up the new glyph without
  re-architecture
- `tool_family_for_name` — maps actual tool names
  (`agent_spawn`, `apply_patch`, etc.) for `GenericToolCell`, which
  shares the catch-all `"Tool"` title across every model-facing tool
- `CardRail` + `rail_glyph` — the `╭ │ ╰` rail vocabulary, declared
  here so any future per-card refactor has the matching glyphs

Wires the verb glyph + label into `render_tool_header` and adds a
`render_tool_header_with_family` overload so `GenericToolCell` can
route by tool name rather than the generic title. The header now reads
`<spinner> <verb-glyph> <verb> <state>` instead of
`<spinner> <Title-Case-Word> <state>`.

Existing parity tests for ExecCell / PlanUpdate are updated to assert
against the new header structure (verb + glyph) — the colour wiring is
unchanged. New tests pin the verb-glyph format end-to-end:
`agent_spawn` → `◐ delegate`, exec → `▶ run`.

Spinner cadence (TOOL_STATUS_SYMBOL_MS = 720 ms) is unchanged — the
spec already matched.

Deferred to a follow-up: full per-card rail (`╭ │ ╰`) refactor that
threads `CardRail` through every cell render path. The vocabulary is
in place; the layout pass is the next bite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MMMarcinho pushed a commit to MMMarcinho/DeepSeek-TUI that referenced this pull request May 6, 2026
Sub-area Hmbown#3 of the v0.6.6 UI redesign (issue Hmbown#121).

Introduces `crates/tui/src/tui/widgets/tool_card.rs` — a small,
self-contained vocabulary module that owns:

- `ToolFamily` enum (Read / Patch / Run / Find / Delegate / Fanout /
  Think / Generic) and the verb-glyph + label per family
  (▷ read, ◆ patch, ▶ run, ⌕ find, ◐ delegate, ⋮⋮ fanout, … think)
- `tool_family_for_title` — maps the legacy header titles
  (`"Shell"`, `"Patch"`, `"Workspace"`, `"Search"`, `"Diff"`, `"Image"`)
  to a family, so existing call sites pick up the new glyph without
  re-architecture
- `tool_family_for_name` — maps actual tool names
  (`agent_spawn`, `apply_patch`, etc.) for `GenericToolCell`, which
  shares the catch-all `"Tool"` title across every model-facing tool
- `CardRail` + `rail_glyph` — the `╭ │ ╰` rail vocabulary, declared
  here so any future per-card refactor has the matching glyphs

Wires the verb glyph + label into `render_tool_header` and adds a
`render_tool_header_with_family` overload so `GenericToolCell` can
route by tool name rather than the generic title. The header now reads
`<spinner> <verb-glyph> <verb> <state>` instead of
`<spinner> <Title-Case-Word> <state>`.

Existing parity tests for ExecCell / PlanUpdate are updated to assert
against the new header structure (verb + glyph) — the colour wiring is
unchanged. New tests pin the verb-glyph format end-to-end:
`agent_spawn` → `◐ delegate`, exec → `▶ run`.

Spinner cadence (TOOL_STATUS_SYMBOL_MS = 720 ms) is unchanged — the
spec already matched.

Deferred to a follow-up: full per-card rail (`╭ │ ╰`) refactor that
threads `CardRail` through every cell render path. The vocabulary is
in place; the layout pass is the next bite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
YaYII added a commit to YaYII/DeepSeek-TUI that referenced this pull request May 9, 2026
🔴 Fix Hmbown#1 (Embedder blocks async runtime):
- Pre-warm embedder in init_vector_db() via warmup_embedder()
- Model downloaded during startup, not first retrieval
- Eliminates 5-10s UI freeze on first embed call

🔴 Fix Hmbown#2 (Compaction summary pollution):
- store_compaction_summary_to_vector_db() now extracts only
  the summary section from the SystemPrompt, stripping
  boilerplate headers and workflow context
- Uses regex extraction between ## Summary and ## Workflow markers

🔴 Fix Hmbown#3 (No vector index → full table scan):
- New ensure_vector_index() in LanceDbBackend
- Checks list_indices() for existing IVF-PQ index on embedding column
- Creates missing index via table.create_index(["embedding"], Index::Auto)
- Called from ensure_tables() for all three tables

🔴 Fix Hmbown#4 (Compaction threshold regression for non-vector users):
- Added MINIMUM_AUTO_COMPACTION_TOKENS_WITHOUT_VECTOR = 500K
- should_compact() now accepts vector_db_enabled: bool
- Non-vector users keep 500K floor; vector users use 50K floor
- auto_floor_tokens=0 (explicit disable by tests) bypasses both floors
- token_threshold default restored to 800K (fallback only;
  real values come from compaction_threshold_for_model_and_effort)

🟡 Fix Hmbown#7 (No score threshold → noise injection):
- search_memories() and search_summaries() now filter
  results where score < 0.4 before returning
- Prevents low-quality retrieval from polluting system prompt

Tests: 62/62 compaction ✓, 17/17 vector_db ✓
YaYII added a commit to YaYII/DeepSeek-TUI that referenced this pull request May 9, 2026
🔴 Hmbown#1 Embedder blocks async runtime:
  - Pre-warm embedder in init_vector_db() via warmup_embedder()
  - Model downloaded at startup, eliminating 5-10s UI freeze

🔴 Hmbown#2 Compaction summary boilerplate pollution:
  - store_compaction_summary_to_vector_db() now extracts only
    the actual summary section using regex, stripping headers
    and workflow context from embeddings

🔴 Hmbown#3 No vector index → full table scan:
  - New ensure_vector_index() checks list_indices() for IVF-PQ
  - Creates missing index via create_index(["embedding"], Auto)
  - Called from ensure_tables() for all 3 tables

🔴 Hmbown#4 Compaction threshold regression:
  - Added MINIMUM_AUTO_COMPACTION_TOKENS_WITHOUT_VECTOR = 500K
  - should_compact() accepts vector_db_enabled: bool
  - Non-vector users keep 500K floor; vector users get 50K
  - auto_floor_tokens=0 bypasses both (test compatibility)
  - token_threshold default restored to 800K

🟡 Hmbown#7 Score threshold filtering added:
  - search_memories() and search_summaries() filter score < 0.4

Tests: 62/62 compaction ✓, 17/17 vector_db ✓, 99/100 engine ✓
@Hmbown Hmbown mentioned this pull request May 10, 2026
8 tasks
encyc added a commit to encyc/DeepSeek-TUI that referenced this pull request May 31, 2026
…trol, verify fast-path, naming

- combined_hash now hashes full tool JSON (name + description + schema),
  not just tool names, so schema/description changes are detected (Hmbown#1).
- build_messages preserves cache_control from SystemPrompt::Blocks so
  DeepSeek context-cache breakpoints are not lost (Hmbown#2).
- FrozenPrefix::verify() compares raw text before hashing to avoid
  redundant SHA-256 computation (Hmbown#3).
- build_messages uses self.message_count() for vector capacity (Hmbown#4).
- Strict-mode error message no longer references nonexistent
  /prefix-unfreeze command — suggests restart or config toggle (Hmbown#5).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant