feat: image vision analysis, fix first-message 400, fix /model default reset #1642
imkingjh999 wants to merge 2 commits into
Code Review
This pull request introduces a standalone vision analysis bridge, adding the vision_bridge module to handle structured image descriptions and visual primitives with normalized bounding boxes. It includes logic for prompt construction, response parsing, and an HTTP client for vision API requests. Reviewers identified several improvement opportunities: removing the unused PerModelContextConfig struct, replacing eprintln! calls with a proper logging framework to prevent TUI display corruption, reusing the reqwest::Client instance for better connection pooling, and extending the request timeout to cover response body parsing.
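The client-reuse suggestion can be illustrated with a minimal, standard-library-only sketch. `HttpClient` below is a hypothetical stand-in for a real HTTP client such as `reqwest::Client`, and both `shared_client` and the 60-second timeout are assumptions made for illustration, not code from this PR. The point is only that the client is built once and every later call gets the same instance, so pooled connections can be reused.

```rust
use std::sync::OnceLock;
use std::time::Duration;

/// Hypothetical stand-in for an HTTP client such as `reqwest::Client`,
/// used so this sketch compiles with the standard library alone.
#[derive(Debug)]
pub struct HttpClient {
    pub timeout: Duration,
}

impl HttpClient {
    fn new(timeout: Duration) -> Self {
        HttpClient { timeout }
    }
}

/// Returns the process-wide client, constructing it on first use only.
/// The 60-second timeout is an assumed value, not taken from the PR.
pub fn shared_client() -> &'static HttpClient {
    static CLIENT: OnceLock<HttpClient> = OnceLock::new();
    CLIENT.get_or_init(|| HttpClient::new(Duration::from_secs(60)))
}
```

Callers would use `shared_client()` wherever a fresh client is currently constructed per request. With a real client, the timeout would also need to be applied so that it covers reading the response body, as the review notes.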
…0, fix /model reset

- Add vision analysis with 0-1000 normalised bounding boxes (bridge.rs)
- Slim down tools.rs to delegate to bridge module
- Add primitives config flag to VisionModelConfig
- Add image_analyze instruction to system prompt
- Fix 400 error on first message (sync provider to config)
- Fix /model re-selecting same provider resetting model to default
ff44868 to dbb75e9
Provider/model switching bug-fix portions were harvested into #1649 and merged to main with contributor credit. That covered the /model default reset/provider-selected model behavior for #1632 without taking the image vision feature into v0.8.38. Leaving this PR open for maintainer judgment on the remaining vision feature and any other non-harvested scope.
Thanks for the contribution. We harvested the narrow provider/model default-reset fix into #1649, but this PR mixes model selection fixes with a larger vision feature surface. For v0.8.40 we are keeping a high bar after the /model regression: small repro-backed fixes only, no mixed feature bundles. Please split any remaining work into one focused PR with tests and a single user-facing behavior.
Summary
- vision_bridge.rs → vision/bridge.rs module with structured types (BBox, VisualPrimitive)
- Slim down tools.rs to delegate to bridge module
- Add primitives config flag for VisionModelConfig
- Add image_analyze instruction to system prompt
- Fix /model re-selecting same provider resetting model to default

primitives = true vs false

[Comparison table with columns "primitives = true" and "primitives = false"; only two cell fragments survive: "[x1,y1,x2,y2] (0–1000)" and "qwen3.5-omni-plus-2026-03-15 (free, 1M calls)".]

With primitives = true and qwen3.5-omni-plus-2026-03-15 (free tier), multi-point distance accuracy exceeds ChatGPT's vision capability.

Test plan

- cargo test -p deepseek-tui -- vision: 19 tests pass
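To make the [x1,y1,x2,y2] (0–1000) convention from the summary concrete, here is a minimal sketch of how such a box might be parsed and mapped back to pixel coordinates. The field names, `parse`, and `to_pixels` are hypothetical and are not taken from the PR's actual `BBox` type. The validation rules (reject values above 1000, reject inverted corners) are likewise assumptions.

```rust
/// Bounding box in normalised 0..=1000 coordinates, written [x1, y1, x2, y2].
/// Hypothetical layout; the PR's real `BBox` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BBox {
    pub x1: u32,
    pub y1: u32,
    pub x2: u32,
    pub y2: u32,
}

impl BBox {
    /// Parses text such as "[120, 40, 880, 960]". Returns `None` for
    /// malformed input, values above 1000, or inverted corners.
    pub fn parse(s: &str) -> Option<BBox> {
        let inner = s.trim().strip_prefix('[')?.strip_suffix(']')?;
        let mut vals = [0u32; 4];
        let mut parts = inner.split(',');
        for slot in vals.iter_mut() {
            let v: u32 = parts.next()?.trim().parse().ok()?;
            if v > 1000 {
                return None;
            }
            *slot = v;
        }
        // Reject a fifth component.
        if parts.next().is_some() {
            return None;
        }
        let [x1, y1, x2, y2] = vals;
        if x1 > x2 || y1 > y2 {
            return None;
        }
        Some(BBox { x1, y1, x2, y2 })
    }

    /// Scales the normalised box to pixel coordinates for an image of the
    /// given size, rounding down.
    pub fn to_pixels(&self, width: u32, height: u32) -> (u32, u32, u32, u32) {
        let scale = |v: u32, dim: u32| ((v as u64 * dim as u64) / 1000) as u32;
        (
            scale(self.x1, width),
            scale(self.y1, height),
            scale(self.x2, width),
            scale(self.y2, height),
        )
    }
}
```

Normalising to a fixed 0–1000 range keeps the model's output independent of image resolution; the caller scales back using the actual width and height.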