feat(providers): native NVIDIA NIM provider (salvage of #11703) #11774
Merged
Adds NVIDIA NIM as a first-class provider: ProviderConfig in auth.py, HermesOverlay in providers.py, curated models (Nemotron plus other open source models hosted on build.nvidia.com), URL mapping in model_metadata.py, aliases (nim, nvidia-nim, build-nvidia, nemotron), and env var tests. Docs updated: providers page, quickstart table, fallback providers table, and README provider list.
Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).
Gaps closed:
- hermes_cli/main.py:
  - Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
    selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
    provider flow (previously fell through silently).
  - Add 'nvidia' to `hermes chat --provider` argparse choices so the
    documented test command (`hermes chat --provider nvidia --model ...`)
    parses successfully.
- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
  OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
  auto-added to the subprocess env blocklist.
- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
  `hermes doctor` probes https://integrate.api.nvidia.com/v1/models.
- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
  `hermes dump` credential masking.
- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
  with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.
- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
  so all Nemotron variants get 128K context via substring match (rather
  than falling back to MINIMUM_CONTEXT_LENGTH).
- hermes_cli/models.py: Fix hallucinated model ID
  'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
  (verified against live integrate.api.nvidia.com/v1/models catalog).
  Expand curated list from 5 to 9 agentic models mapping to OpenRouter
  defaults per provider-guide convention: add qwen3.5-397b-a17b,
  deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.
- cli-config.yaml.example: Document 'nvidia' provider option.
- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
  for CI attribution.
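The substring-match behaviour described for agent/model_metadata.py can be sketched roughly as follows. This is a minimal illustration, not the code in the repository: the table name `CONTEXT_LENGTHS`, the function `lookup_context_length`, and the placeholder value of `MINIMUM_CONTEXT_LENGTH` are assumptions; only the 'nemotron' → 131072 entry comes from the commit message.

```python
# Hypothetical table; only the 'nemotron' -> 131072 entry is taken from the PR.
CONTEXT_LENGTHS = {
    "nemotron": 131072,
}

# Assumed placeholder; the real value of MINIMUM_CONTEXT_LENGTH is not given here.
MINIMUM_CONTEXT_LENGTH = 4096


def lookup_context_length(model_id: str) -> int:
    """Return the context length for the longest table key contained in model_id."""
    lowered = model_id.lower()
    matches = [key for key in CONTEXT_LENGTHS if key in lowered]
    if not matches:
        # No key matched: fall back to the conservative minimum.
        return MINIMUM_CONTEXT_LENGTH
    # Prefer the most specific (longest) key when several keys match.
    return CONTEXT_LENGTHS[max(matches, key=len)]
```

With a single 'nemotron' key, IDs such as 'nvidia/nemotron-3-nano-30b-a3b' and 'nvidia/llama-3.3-nemotron-super-49b-v1.5' both resolve to 131072 without needing one entry per variant.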
E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.
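For illustration, a probe of the models endpoint along the lines of what `hermes doctor` is described as doing might look like the sketch below. The function names (`build_models_probe`, `describe_status`, `probe`) are hypothetical and not taken from hermes_cli/doctor.py, and the use of a Bearer token in the Authorization header is an assumption based on the provider being OpenAI-compatible.

```python
import urllib.error
import urllib.request

# Base URL taken from the PR text; overriding it mirrors NVIDIA_BASE_URL.
DEFAULT_BASE_URL = "https://integrate.api.nvidia.com/v1"


def build_models_probe(api_key: str, base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a GET request for <base_url>/models (Bearer auth is assumed)."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def describe_status(status: int) -> str:
    """Map an HTTP status code to a short diagnostic string."""
    if status == 200:
        return "ok"
    if status in (401, 403):
        # The request reached the endpoint but the key was not accepted.
        return "endpoint reached, key rejected"
    return f"unexpected status {status}"


def probe(api_key: str, base_url: str = DEFAULT_BASE_URL, timeout: float = 10.0) -> str:
    """Send the probe and summarise the outcome; performs a real network call."""
    request = build_models_probe(api_key, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as exc:  # must precede URLError (its parent class)
        return describe_status(exc.code)
    except urllib.error.URLError as exc:
        return f"unreachable: {exc.reason}"
```

Under these assumptions, a 401 from a bogus key is a useful signal: it shows the request was routed to NVIDIA's endpoint rather than failing earlier in argument parsing.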
teknium1 added a commit that referenced this pull request on Apr 18, 2026:

Fills documentation gaps that accumulated as features merged ahead of
their docs updates. All additions are verified against code and the
originating PRs.

Providers:
- Ollama Cloud (#10782): new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (#10783): provider note, TTS table + config
- Google Gemini CLI OAuth (#11270): quickstart/fallback/cli-commands entries
- NVIDIA NIM (#11774): NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (#11608): env-vars, discord.md access control section
- DingTalk QR device-flow (#11574): wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (#11898): full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (#11363): optional-skills-catalog entry
- /gquota (#11270): slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
Summary
Ships NVIDIA NIM (integrate.api.nvidia.com) as a first-class OpenAI-compatible provider, at parity with xai / huggingface / deepseek / zai.
Salvage of #11703 by @anniesurla. Her commit is preserved with original authorship (see commit log). A follow-up commit closes parity gaps the original PR didn't cover.
Resolves #9106.
Changes
Contributor commit (0f1545e) — core provider registration:
Follow-up commit (39eb8b2) — parity gaps:
Validation
Targeted tests: `tests/hermes_cli/test_api_key_providers.py` + `tests/tools/test_local_env_blocklist.py` → 147 passed in 7.84s.
E2E: verified with `python -m hermes_cli.main chat --provider nvidia --model nvidia/nemotron-3-super-120b-a12b -q test` and `... doctor` — both take the correct code paths.
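The argparse side of the E2E check can be illustrated with a small sketch. It is not the parser in hermes_cli/main.py: the `PROVIDER_CHOICES` list, the `build_parser` helper, and the `--query` long option are hypothetical, and only the `--provider`, `--model`, and `-q` flags appear in the command above.

```python
import argparse

# Hypothetical subset of providers; the real list in hermes_cli/main.py is longer.
PROVIDER_CHOICES = ["xai", "huggingface", "deepseek", "zai", "nvidia"]


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny 'hermes chat' parser that restricts --provider via choices."""
    parser = argparse.ArgumentParser(prog="hermes")
    subparsers = parser.add_subparsers(dest="command", required=True)
    chat = subparsers.add_parser("chat")
    # A value outside `choices` makes argparse exit with status 2 before any
    # provider code runs, which is the failure mode the follow-up commit removed.
    chat.add_argument("--provider", choices=PROVIDER_CHOICES)
    chat.add_argument("--model")
    chat.add_argument("-q", "--query")  # long name is an assumption
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["chat", "--provider", "nvidia",
         "--model", "nvidia/nemotron-3-super-120b-a12b", "-q", "test"]
    )
    print(args.provider, args.model)
```

In this sketch, dropping 'nvidia' from `PROVIDER_CHOICES` reproduces the pre-fix behaviour: the command is rejected at parse time and never reaches the provider.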
Attribution
Original author: @anniesurla (asurla@nvidia.com). Her commit is preserved with `git cherry-pick` + rebase-merge.