Skip to content

feat: Ollama improvements — Cloud provider, GLM continuation, think=false, surrogate sanitization, /v1 hint#10782

Merged
teknium1 merged 5 commits into
mainfrom
hermes/hermes-c66666e2
Apr 16, 2026
Merged

feat: Ollama improvements — Cloud provider, GLM continuation, think=false, surrogate sanitization, /v1 hint#10782
teknium1 merged 5 commits into
mainfrom
hermes/hermes-c66666e2

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Consolidates 5 community PRs into a single Ollama improvement package. Each contributor's commit is preserved with original authorship via cherry-pick (rebase merge to keep attribution).

Changes

1. Ollama Cloud as built-in provider (cherry-picked from PR #6038 by @kshitijk4poor)

  • Full first-class provider: --provider ollama-cloud, PROVIDER_REGISTRY, CANONICAL_PROVIDERS
  • Dynamic model discovery from ollama.com/v1/models + models.dev, disk-cached (1hr TTL)
  • OLLAMA_API_KEY env var, URL auto-detection, model:tag passthrough normalization
  • 37 provider-specific tests
  • Closes Add Ollama Cloud as built-in provider #3926

2. Continue Ollama GLM replies after stop misreports (cherry-picked from PR #10740 by @LeonSGP43)

3. Pass think=false for Ollama when reasoning disabled (cherry-picked from PR #3197 by @Mibayy)

4. Sanitize surrogate characters from Ollama model output (cherry-picked from PR #5074 by @ygd58)

5. Hint about /v1 suffix for local model endpoints (our addition)

Contributor Attribution

All cherry-picked commits preserve original authorship in git log. Use rebase merge to keep individual commit attribution.

PRs to close after merge

Test Results

  • 37/37 Ollama Cloud provider tests pass
  • 296/298 run_agent tests pass (2 pre-existing failures unrelated to this PR)
  • Full hermes_cli suite: 2068 passed (30 failures + 44 errors are pre-existing on main)

kshitijk4poor and others added 5 commits April 15, 2026 22:32
Add ollama-cloud as a first-class provider with full parity to existing
API-key providers (gemini, zai, minimax, etc.):

- PROVIDER_REGISTRY entry with OLLAMA_API_KEY env var
- Provider aliases: ollama -> custom (local), ollama_cloud -> ollama-cloud
- models.dev integration for accurate context lengths
- URL-to-provider mapping (ollama.com -> ollama-cloud)
- Passthrough model normalization (preserves Ollama model:tag format)
- Default auxiliary model (nemotron-3-nano:30b)
- HermesOverlay in providers.py
- CLI --provider choices, CANONICAL_PROVIDERS entry
- Dynamic model discovery with disk caching (1hr TTL)
- 37 provider-specific tests

Cherry-picked from PR #6038 by kshitijk4poor. Closes #3926
…ort is none

When a custom/Ollama provider is used and reasoning_effort is set to 'none'
(or enabled: false), inject 'think': false into the request extra_body.

Ollama does not recognise the OpenRouter-style 'reasoning' extra_body field,
so thinking-capable models (Qwen3, etc.) generate <think> blocks regardless
of the reasoning_effort setting. This produces empty-response errors that
corrupt session state.

The fix adds a provider-specific block in _build_api_kwargs() that sets
think=false in extra_body whenever self.provider == 'custom' and reasoning
is explicitly disabled.

Closes #3191
When a user enters a local model server URL (Ollama, vLLM, llama.cpp)
without a /v1 suffix during 'hermes model' custom endpoint setup,
prompt them to add it. Most OpenAI-compatible local servers require
/v1 in the base URL for chat completions to work.
@arsaboo

arsaboo commented Apr 16, 2026

Copy link
Copy Markdown

@teknium1 how do you envision handling local ollama? I am trying to work on adding local ollama as a built-in provider. Should we have OLLAMA_LOCAL_BASE_URL ?

teknium1 added a commit that referenced this pull request Apr 18, 2026
Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (#11363) — optional-skills-catalog entry
- /gquota (#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
teknium1 added a commit that referenced this pull request Apr 18, 2026
Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (#11363) — optional-skills-catalog entry
- /gquota (#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
@Cat0x00

Cat0x00 commented Apr 21, 2026

Copy link
Copy Markdown

@teknium1 please see #3197 (comment) there is a bug in one of these merges (pull reqeusts).

ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
)

Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (NousResearch#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (NousResearch#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (NousResearch#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (NousResearch#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (NousResearch#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (NousResearch#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (NousResearch#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (NousResearch#11363) — optional-skills-catalog entry
- /gquota (NousResearch#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
)

Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (NousResearch#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (NousResearch#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (NousResearch#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (NousResearch#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (NousResearch#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (NousResearch#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (NousResearch#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (NousResearch#11363) — optional-skills-catalog entry
- /gquota (NousResearch#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
)

Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (NousResearch#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (NousResearch#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (NousResearch#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (NousResearch#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (NousResearch#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (NousResearch#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (NousResearch#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (NousResearch#11363) — optional-skills-catalog entry
- /gquota (NousResearch#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
)

Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (NousResearch#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (NousResearch#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (NousResearch#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (NousResearch#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (NousResearch#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (NousResearch#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (NousResearch#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (NousResearch#11363) — optional-skills-catalog entry
- /gquota (NousResearch#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
)

Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (NousResearch#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (NousResearch#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (NousResearch#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (NousResearch#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (NousResearch#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (NousResearch#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (NousResearch#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (NousResearch#11363) — optional-skills-catalog entry
- /gquota (NousResearch#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment