feat: image vision analysis, fix first-message 400, fix /model default reset by imkingjh999 · Pull Request #1642 · Hmbown/CodeWhale

imkingjh999 · 2026-05-14T14:48:16Z

Summary

Add vision analysis with object bounding boxes (0–1000 normalised coordinates)
Restructure: move vision_bridge.rs → vision/bridge.rs module with structured types (BBox, VisualPrimitive)
Slim down tools.rs to delegate to bridge module
Add primitives config flag for VisionModelConfig
Add image_analyze instruction to system prompt
Fix 400 error on first message after entering TUI (sync provider to config)
Fix /model re-selecting same provider resetting model to default

`primitives = true` vs `false`

	`primitives = true`	`primitives = false`
Bounding boxes	Each object gets a normalised bbox `[x1,y1,x2,y2]` (0–1000)	No coordinates returned
Distance comparison	Can accurately compare distances between multiple points in the image	Cannot — text-only description lacks spatial precision
Recommended model	`qwen3.5-omni-plus-2026-03-15` (free, 1M calls)	Any vision model

With primitives = true + qwen3.5-omni-plus-2026-03-15 (free tier), multi-point distance accuracy exceeds ChatGPT's vision capability.

Test plan

Paste an image and verify vision analysis runs automatically
Verify bounding boxes are returned in the vision context
Start TUI, send first message — should not 400
Switch to deepseek provider, pick a non-default model, re-select deepseek — model should persist
Run cargo test -p deepseek-tui -- vision — 19 tests pass

gemini-code-assist

Code Review

This pull request introduces a standalone vision analysis bridge, adding the vision_bridge module to handle structured image descriptions and visual primitives with normalized bounding boxes. It includes logic for prompt construction, response parsing, and an HTTP client for vision API requests. Reviewers identified several improvement opportunities: removing the unused PerModelContextConfig struct, replacing eprintln! calls with a proper logging framework to prevent TUI display corruption, reusing the reqwest::Client instance for better connection pooling, and extending the request timeout to cover response body parsing.

…0, fix /model reset - Add vision analysis with 0-1000 normalised bounding boxes (bridge.rs) - Slim down tools.rs to delegate to bridge module - Add primitives config flag to VisionModelConfig - Add image_analyze instruction to system prompt - Fix 400 error on first message (sync provider to config) - Fix /model re-selecting same provider resetting model to default

Hmbown · 2026-05-14T22:04:08Z

Provider/model switching bug-fix portions were harvested into #1649 and merged to main with contributor credit. That covered the /model default reset/provider-selected model behavior for #1632 without taking the image vision feature into v0.8.38.

Leaving this PR open for maintainer judgment on the remaining vision feature and any other non-harvested scope.

Hmbown · 2026-05-17T09:02:06Z

Thanks for the contribution. We harvested the narrow provider/model default-reset fix into #1649, but this PR mixes model selection fixes with a larger vision feature surface. For v0.8.40 we are keeping a high bar after the /model regression: small repro-backed fixes only, no mixed feature bundles. Please split any remaining work into one focused PR with tests and a single user-facing behavior.

gemini-code-assist Bot reviewed May 14, 2026

View reviewed changes

Comment thread crates/tui/src/config.rs Outdated

Comment thread crates/tui/src/vision_bridge.rs Outdated

Comment thread crates/tui/src/vision_bridge.rs Outdated

Comment thread crates/tui/src/vision_bridge.rs Outdated

imkingjh999 changed the title ~~feat: image parsing & vision analysis, fix first-message 400, fix /model default~~ feat: image parsing and vision analysis with bounding boxes May 14, 2026

imkingjh999 mentioned this pull request May 14, 2026

fix: first-message 400 error and /model default reset #1644

Closed

2 tasks

imkingjh999 changed the title ~~feat: image parsing and vision analysis with bounding boxes~~ feat: image vision analysis, fix first-message 400, fix /model default reset May 14, 2026

imkingjh999 force-pushed the feature/image-parsing branch from ff44868 to dbb75e9 Compare May 14, 2026 16:16

chore: remove unused PerModelContextConfig struct

357298b

imkingjh999 marked this pull request as ready for review May 14, 2026 16:44

Hmbown mentioned this pull request May 14, 2026

fix: preserve provider-selected models #1649

Merged

Hmbown mentioned this pull request May 15, 2026

chore(release): prepare v0.8.38 #1698

Merged

Hmbown closed this May 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: image vision analysis, fix first-message 400, fix /model default reset#1642

feat: image vision analysis, fix first-message 400, fix /model default reset#1642
imkingjh999 wants to merge 2 commits into
Hmbown:mainfrom
imkingjh999:feature/image-parsing

imkingjh999 commented May 14, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Hmbown commented May 14, 2026

Uh oh!

Hmbown commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

imkingjh999 commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

primitives = true vs false

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Hmbown commented May 14, 2026

Uh oh!

Hmbown commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

imkingjh999 commented May 14, 2026 •

edited

Loading

`primitives = true` vs `false`