fix(gateway): keep auto vision preprocess concise #10852

Open
gnanirahulnutakki wants to merge 1 commit into NousResearch:main from gnanirahulnutakki:codengr/vision-preprocess-defaults

@gnanirahulnutakki

Contributor

Summary

  • replace the auto image-preprocess prompt in GatewayRunner._enrich_message_with_vision() with a concise summary-oriented prompt
  • add an internal max_tokens override to vision_analyze_tool()
  • make the gateway auto-preprocess path use max_tokens=500 without changing explicit vision_analyze call sites
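
The split between the explicit and automatic paths can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the function bodies are stand-ins, `DEFAULT_MAX_TOKENS` and its value, the prompt wording, and the helper name `auto_preprocess` are all assumptions. Only the `max_tokens` override and the value 500 come from the summary above.

```python
from typing import Optional

# Assumed placeholder: the tool's real default budget is not stated in this PR.
DEFAULT_MAX_TOKENS = 2000

# Hypothetical wording; the actual concise prompt lives in the PR diff.
CONCISE_PREPROCESS_PROMPT = "Briefly summarize what this image shows."


def vision_analyze_tool(image_url: str, prompt: str,
                        max_tokens: Optional[int] = None) -> dict:
    """Stand-in for the real tool; returns the request it would send."""
    # None means "caller did not override", so explicit call sites keep the default.
    budget = DEFAULT_MAX_TOKENS if max_tokens is None else max_tokens
    return {"image_url": image_url, "prompt": prompt, "max_tokens": budget}


def auto_preprocess(image_url: str) -> dict:
    """Mimics the gateway's eager path: concise prompt plus a 500-token cap."""
    return vision_analyze_tool(image_url, CONCISE_PREPROCESS_PROMPT, max_tokens=500)


# Explicit call: no override, so the tool default applies.
explicit = vision_analyze_tool("photo.png", "Describe this image in detail.")
# Automatic gateway path: capped budget.
eager = auto_preprocess("photo.png")
print(explicit["max_tokens"], eager["max_tokens"])
```

Using an optional parameter that defaults to `None` is one way to leave existing callers untouched while letting a single call site opt into a tighter budget.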

Fixes #10809.

Why this shape

The issue is specifically about the automatic image pre-process path inflating latency and prompt size. Lowering the global vision_analyze_tool() default would also make explicit user-invoked vision analysis shorter, which is a separate product decision.

This PR keeps the explicit tool behavior intact and tightens only the automatic gateway path.

Validation

  • pytest -q -o addopts= tests/tools/test_vision_tools.py -k "TestHandleVisionAnalyze or TestVisionAnalyzeTokenBudget"
    • 6 passed, 63 deselected
  • pytest -q -o addopts= tests/gateway/test_vision_preprocess.py
    • 1 passed

Notes

  • no docs updated because this only changes internal defaults for the eager gateway image-analysis path
  • timeout behavior was left unchanged in this PR; that should stay a separate discussion

@alt-glitch added the type/perf (Performance improvement or optimization), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), and tool/vision (Vision analysis and image generation) labels on Apr 25, 2026
Development

Successfully merging this pull request may close these issues.

Default vision pre-process prompt generates overly long descriptions (~2000 chars), significantly slowing down image-bearing requests on local models

2 participants