feat(vision): downscale attached images before sending; add detail knob by esengine · Pull Request #4210 · esengine/DeepSeek-Reasonix

esengine · 2026-06-12T15:29:37Z

Follow-up to #4204. That PR could send images but did nothing about size — an attached photo went out at full resolution (up to the 10 MB cap), wasting both request bytes and image tokens, since vision models downscale server-side anyway.

What changed

- Downscale before send (internal/control). A vision-only path, visionImageDataURL, downscales an oversized image to 1568px on its longest side (the size at which OpenAI/Anthropic cap server-side) and re-encodes it: PNG/GIF stay lossless (screenshots, text, transparency), JPEG/WebP → JPEG q85. Guarded against decompression bombs (DecodeConfig dimension check before decode). Best-effort: an undecodable format (bmp/tiff/svg) passes through untouched. The desktop preview path (ImageDataURL) is left at full resolution; only the model-send path shrinks.
- vision_detail knob. A per-model low|high config flag sets the OpenAI image_url.detail hint (empty = auto, field omitted). low pins an image to a fixed ~85 tokens for cheap coarse reads. Anthropic has no equivalent knob, so the flag is ignored there.
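
The sizing and bomb-guard steps described above can be sketched with the standard library alone. This is a minimal sketch under assumptions, not the PR's code: `fitWithin`, `checkDimensions`, and the `maxPixels` ceiling are hypothetical names and values. Only the 1568px target and the DecodeConfig check before decoding come from the description; the actual resampling and re-encoding are left out.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"image"
	_ "image/gif"  // register GIF header decoding
	_ "image/jpeg" // register JPEG header decoding
	"image/png"
)

const (
	maxEdge   = 1568     // longest-side target from the PR description
	maxPixels = 64 << 20 // assumed decompression-bomb ceiling (hypothetical value)
)

// fitWithin scales (w, h) so the longest side is at most limit, preserving
// aspect ratio. An in-budget image is returned unchanged.
func fitWithin(w, h, limit int) (int, int) {
	longest := w
	if h > longest {
		longest = h
	}
	if longest <= limit {
		return w, h
	}
	// Round to nearest and never collapse a side to zero.
	nw := (w*limit + longest/2) / longest
	nh := (h*limit + longest/2) / longest
	if nw < 1 {
		nw = 1
	}
	if nh < 1 {
		nh = 1
	}
	return nw, nh
}

// checkDimensions reads only the image header, so a tiny file that claims an
// enormous canvas is rejected before any pixel buffer is allocated.
func checkDimensions(data []byte) (image.Config, string, error) {
	cfg, format, err := image.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return image.Config{}, "", err // undecodable: caller passes the bytes through
	}
	if cfg.Width <= 0 || cfg.Height <= 0 ||
		int64(cfg.Width)*int64(cfg.Height) > maxPixels {
		return image.Config{}, "", errors.New("image dimensions out of bounds")
	}
	return cfg, format, nil
}

func main() {
	var buf bytes.Buffer
	if err := png.Encode(&buf, image.NewRGBA(image.Rect(0, 0, 2000, 1000))); err != nil {
		panic(err)
	}
	cfg, format, err := checkDimensions(buf.Bytes())
	if err != nil {
		fmt.Println("pass through untouched:", err)
		return
	}
	w, h := fitWithin(cfg.Width, cfg.Height, maxEdge)
	fmt.Printf("%s %dx%d -> %dx%d\n", format, cfg.Width, cfg.Height, w, h)
}
```

Checking the header first keeps the guard cheap: the full decode only runs once the claimed dimensions are known to be sane.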

On compression headers (the original ask)

Request-body gzip was considered and deliberately skipped. It only shrinks the wire (~25%, undoing the base64 inflation), it depends on provider support (OpenAI doesn't guarantee that gzipped request bodies are accepted, so sending one risks a 400), and it does nothing for tokens or context. Downscaling cuts both bytes and tokens, so it's the right lever.
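
The ~25% figure follows from base64 arithmetic: every 3 raw bytes become 4 encoded bytes. A small sketch, assuming the image bytes are already compressed (JPEG/PNG) so that gzip can at best undo the base64 expansion; `wireSaving` is a hypothetical helper, not part of the PR.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// wireSaving returns the base64-encoded size of rawBytes and the fraction of
// the encoded size that is pure base64 overhead, i.e. the upper bound on what
// gzip could recover when the payload itself is incompressible.
func wireSaving(rawBytes int) (encoded int, saving float64) {
	encoded = base64.StdEncoding.EncodedLen(rawBytes)
	saving = 1 - float64(rawBytes)/float64(encoded)
	return encoded, saving
}

func main() {
	enc, s := wireSaving(3_000_000)
	fmt.Printf("%d encoded bytes, at most %.0f%% recoverable by gzip\n", enc, s*100)
}
```

The image still decodes to the same pixels on the provider side, which is why this saving never reaches the token count.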

Also

Corrects the comment introduced in #4204 that said embedding images "breaks prefix-cache stability." It doesn't: images are vision-gated, append-only, and byte-stable across turns, so the prefix cache is unaffected. The real concern was always token cost.

Tests

- internal/control: oversized PNG → 1568px (pixel count reduced), in-budget image passes through verbatim, JPEG stays JPEG, undecodable mime passes through.
- openai: detail emitted from config, omitted by default.
- go vet + touched packages green locally; both go.mod/go.sum (main + desktop module) tidied for golang.org/x/image.

Follow-up to #4204. Sent images had no size control: an attached photo went out at full resolution (up to the 10 MB cap), wasting request bytes and image tokens since vision models downscale server-side anyway.

- internal/control: a vision-only send path (visionImageDataURL) downscales an oversized image to 1568px on its longest side and re-encodes it (PNG/GIF stay lossless for screenshots, text, and transparency; JPEG/WebP go to JPEG q85), guarded against decompression bombs. Best-effort: an undecodable format passes through untouched. The desktop preview path (ImageDataURL) is unchanged, full res.
- A per-model `vision_detail` (low|high) config flag sets the openai image_url detail hint; empty = auto/omit. "low" pins an image to ~85 tokens.
- Deliberately no request-body gzip: it only helps the wire (~25%, and provider-support-dependent) and nothing for tokens, so downscaling is the lever.

Also corrects the #4204 comment that claimed images "break prefix-cache stability". They don't (vision-gated, append-only, byte-stable); the real concern was always cost.

esengine requested a review from SivanCola as a code owner June 12, 2026 15:29

esengine merged commit 62645d1 into main-v2 Jun 12, 2026
14 checks passed

esengine deleted the feat/vision-image-downscale branch June 12, 2026 15:34

This was referenced Jun 13, 2026

[Feature]: Feature request: support pasting clipboard images directly into the chat input box #4178

Closed

[Feature]: Support uploading videos/images for multimodal analysis #4158

Open
