fix(agent): scope think=False to Ollama endpoints only (#11237) #12914

Closed
Tranquil-Flow wants to merge 1 commit into NousResearch:main from Tranquil-Flow:fix/think-param-custom-provider

Conversation

Tranquil-Flow (Contributor) commented Apr 20, 2026

What does this PR do?

The think parameter is Ollama-native and was being sent to ALL custom providers when reasoning was disabled, causing HTTP 422 rejections from Mistral, Fireworks, Together.ai, vLLM, and LM Studio — all of which validate request bodies and reject unrecognized parameters.

Root cause: _build_api_kwargs() used provider == "custom" as the guard condition, matching every custom endpoint regardless of whether it understands the think parameter.

Fix:

  • Add _is_ollama() detection method using two complementary signals:
    1. _ollama_num_ctx — set during init via config or /api/show probe (proves endpoint is Ollama)
    2. URL heuristics — "ollama" or ":11434" in the base URL (fallback for when probe failed)
  • Replace provider == "custom" guard with _is_ollama() in the think=False injection path
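
A minimal sketch of the detection and guard described above, not the code from this PR. The free functions `is_ollama` and `build_extra_body`, their parameters, and the placement of `think` in an extras dict are assumptions made for illustration; only the two signals (a known `_ollama_num_ctx` value, and "ollama" or ":11434" in the base URL) come from the description.

```python
from typing import Optional


def is_ollama(base_url: str, ollama_num_ctx: Optional[int] = None) -> bool:
    """Return True when the endpoint looks like Ollama (hypothetical helper)."""
    # Signal 1: a context size obtained from config or an /api/show probe
    # is taken as proof that the endpoint is Ollama.
    if ollama_num_ctx is not None:
        return True
    # Signal 2: URL heuristics, used as a fallback when the probe failed.
    url = (base_url or "").lower()
    return "ollama" in url or ":11434" in url


def build_extra_body(
    base_url: str,
    reasoning_enabled: bool,
    ollama_num_ctx: Optional[int] = None,
) -> dict:
    """Build provider-specific extras (hypothetical; key placement is assumed)."""
    extra: dict = {}
    # Old guard (per the root cause): provider == "custom".
    # New guard: only endpoints detected as Ollama receive think=False.
    if not reasoning_enabled and is_ollama(base_url, ollama_num_ctx):
        extra["think"] = False
    return extra
```

With this guard, a Mistral or LM Studio base URL yields an empty extras dict, while an Ollama URL such as `http://localhost:11434/v1` still gets `think=False` when reasoning is disabled. The URL check is a plain substring match, so a non-Ollama server that happens to listen on port 11434 would be misdetected; that trade-off is inherent to the heuristic as described.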

Related Issue

Closes #11237

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Add _is_ollama() detection method (num_ctx probe + URL heuristics)
  • Replace provider == "custom" guard with _is_ollama() in the think=False injection path

How to Test

  1. Configure a non-Ollama custom provider (e.g. Mistral, Fireworks) with reasoning_effort: none
  2. Send a message — should no longer get HTTP 422
  3. Configure an Ollama endpoint with reasoning_effort: none
  4. Send a message to a thinking model (e.g. Qwen3) — should still suppress <think/> blocks

Automated: 23 new tests in tests/run_agent/test_think_param_ollama_scope.py:

  • 7 _is_ollama() unit tests (num_ctx probe, URL patterns, cloud/LM Studio/vLLM rejections)
  • 5 parametrized cloud custom provider tests (Mistral, Fireworks, Together, vLLM, generic)
  • 1 LM Studio test (local but not Ollama)
  • 7 Ollama positive tests (URL detection, probe detection, effort/enabled combos)
  • 3 non-custom provider tests (OpenRouter, Anthropic unaffected)

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 15 (Darwin 24.6.0)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

pytest tests/run_agent/test_think_param_ollama_scope.py -q
# 23 passed

…11237)

The think parameter is Ollama-native and was being sent to ALL custom
providers (Mistral, Fireworks, Together.ai, vLLM, LM Studio) when
reasoning was disabled, causing HTTP 422 rejections from providers
that validate request bodies.

Root cause: _build_api_kwargs() used provider == 'custom' as the guard
condition, matching every custom endpoint regardless of whether it
understands the think parameter.

Fix:
- Add _is_ollama() detection method using two signals:
  1. _ollama_num_ctx — set via config or /api/show probe (proven Ollama)
  2. URL heuristics — 'ollama' or ':11434' in base URL (fallback)
- Replace provider == 'custom' guard with _is_ollama() in the
  think=False section of _build_api_kwargs()

Tests: 23 new tests covering:
- 7 _is_ollama() unit tests (num_ctx probe, URL patterns, rejections)
- 5 cloud custom provider no-think tests (Mistral, Fireworks, Together, vLLM)
- 1 LM Studio no-think test (local but not Ollama)
- 7 Ollama think=False positive tests (URL, num_ctx, effort, enabled)
- 3 unaffected provider tests (OpenRouter, Anthropic)

Without the fix, the 6 cloud/LM Studio tests fail; with it, all 23
new tests pass. 107 total tests pass (23 new + 84 existing).

Tranquil-Flow (Contributor, Author) commented

Closing — the think=False parameter is no longer sent to any provider on main.

The literal "think" key only appears in two places in current main:

  • agent/think_scrubber.py:80 — a defensive removal list
  • hermes_cli/model_normalize.py:127 — a known-field whitelist

Neither path constructs a request payload that sets think=False. The reasoning-param dispatch was refactored into agent/transports/chat_completions.py via provider_profile.build_api_kwargs_extras, and the Ollama-only think field never makes it into the chat-completions builder for other providers. The HTTP 422s from Mistral/Fireworks/Together/vLLM/LM Studio that this PR diagnosed can't reproduce.

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P2: Medium (degraded but workaround exists)
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug]: think=False incorrectly sent to all provider=custom endpoints, not just Ollama