[Feature]: Pass `think: false` to Ollama for non-reasoning models

### Problem or Use Case

**Summary**

When using Hermes Agent with a local Ollama backend and a thinking-capable model (e.g. `qwen3:8b`), Hermes never passes `think: false` in the chat completions request. This causes the model to run its full reasoning chain on every request, which on CPU inference can take several minutes before producing the first output token — making the agent loop effectively unusable.

**Environment**
- Hermes Agent (latest)
- Ollama 0.20.0
- Model: `qwen3:8b` Q4_K_M via custom OpenAI-compatible endpoint (`http://host.docker.internal:11434/v1`)
- CPU inference (no GPU)

**Steps to reproduce**
1. Configure Hermes with a local Ollama endpoint
2. Use any Qwen3 model (or other thinking-capable model)
3. Send any message — observe multi-minute delay before first token
4. Check Ollama logs — thinking tokens are being generated silently before any response content

**Root cause**

Ollama 0.6+ supports a `think` parameter in the `/api/chat` and `/v1/chat/completions` endpoints. When `think: false` is passed, the model skips the reasoning phase entirely and responds immediately. Hermes never passes this parameter, so thinking-capable models always run in thinking mode regardless of the user's `reasoning_effort` config.

The affected code is `_build_api_kwargs()` in `run_agent.py` around line 5394, where the chat completions payload is assembled.

**Workaround**

Manually patching `run_agent.py` to add `"think": False` to the `api_kwargs` dict fixes the issue and brings response time from several minutes down to ~1 second on the same hardware.

### Proposed Solution

Add an opt-in config option (e.g. `provider.think: false`) or auto-detect when the endpoint is an Ollama instance and pass `think: false` when `reasoning_effort` is not explicitly enabled. At minimum, exposing this as an environment variable (`HERMES_OLLAMA_THINK=false`) would be a low-risk fix.

Happy to submit a PR if the maintainers can advise on the preferred approach.

### Alternatives Considered

_No response_

### Feature Type

Configuration option

### Scope

None

### Contribution

- [ ] I'd like to implement this myself and submit a PR

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature]: Pass `think: false` to Ollama for non-reasoning models #6152

Problem or Use Case

Proposed Solution

Alternatives Considered

Feature Type

Scope

Contribution

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Feature]: Pass think: false to Ollama for non-reasoning models #6152

Description

Problem or Use Case

Proposed Solution

Alternatives Considered

Feature Type

Scope

Contribution

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

[Feature]: Pass `think: false` to Ollama for non-reasoning models #6152