fix(vlm): pass thinking flag to dashscope openai backend #939

Merged: qin-ctx merged 1 commit into volcengine:main from sacloudy:fix/dashscope-thinking on Mar 25, 2026

Conversation

@sacloudy (Contributor) commented Mar 24, 2026

Description

This PR fixes the OpenAI-compatible VLM backend so the thinking flag is propagated to DashScope-compatible endpoints via extra_body.enable_thinking.

The change is scoped to DashScope detection only, so official OpenAI and Azure requests keep their existing behavior.

Related Issue

Fixes #923

Type of Change

  • [x] Bug fix (non-breaking change that fixes an issue)
  • [ ] New feature (non-breaking change that adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Documentation update
  • [ ] Refactoring (no functional changes)
  • [ ] Performance improvement
  • [ ] Test update

Changes Made

  • Detect DashScope-compatible OpenAI backends from the configured model prefix and api_base host.
  • Pass extra_body={"enable_thinking": <bool>} through all four OpenAI VLM completion paths.
  • Add regression tests covering DashScope text/async vision calls and non-DashScope guard rails for official OpenAI and Azure.

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Validated with:

  • python3 -m py_compile openviking/models/vlm/backends/openai_vlm.py tests/unit/test_extra_headers_vlm.py
  • Isolated behavior verification for DashScope sync text, DashScope async vision, official OpenAI, and Azure.

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


wangruotian seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@github-actions

Failed to generate code suggestions for PR

@qin-ctx qin-ctx merged commit b4a49de into volcengine:main Mar 25, 2026
1 of 2 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 25, 2026
deepakdevp added a commit to deepakdevp/OpenViking that referenced this pull request Mar 25, 2026
Follow-up to volcengine#939 (which fixed OpenAI backends). The LiteLLM backend
now sends enable_thinking via extra_body only when the detected
provider is DashScope. Non-DashScope providers (OpenAI, Azure, Ollama)
never receive this vendor-specific field.

Fixes volcengine#923.
qin-ctx pushed a commit that referenced this pull request Mar 26, 2026

Follow-up to #939 (which fixed OpenAI backends). The LiteLLM backend
now sends enable_thinking via extra_body only when the detected
provider is DashScope. Non-DashScope providers (OpenAI, Azure, Ollama)
never receive this vendor-specific field.

Fixes #923.
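The provider gating this follow-up commit describes might look like the sketch below; `thinking_kwargs` and the `provider` string are hypothetical names for illustration, not the LiteLLM backend's actual code:

```python
def thinking_kwargs(provider: str, enable_thinking: bool) -> dict:
    """Return completion kwargs carrying the vendor-specific flag.

    Only DashScope understands extra_body.enable_thinking; every other
    provider (OpenAI, Azure, Ollama, ...) gets an empty dict so the
    field never leaks into their requests.
    """
    if provider == "dashscope":
        return {"extra_body": {"enable_thinking": enable_thinking}}
    return {}

# The result would then be splatted into the LiteLLM call, e.g.:
# litellm.completion(model=model, messages=messages,
#                    **thinking_kwargs(provider, True))
```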


Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: VLM backend thinking parameter defined but never passed to API (causes auto-capture timeout with thinking-enabled models)

3 participants