
fix(hermes): set model.api_key placeholder so LiteLLM accepts inference.local #4718

Merged: cv merged 1 commit into main from fix/4711-hermes-litellm-api-key on Jun 3, 2026

@laitingsheng laitingsheng commented Jun 3, 2026

Contributor

Summary

Hermes buildHermesConfig never emitted model.api_key, so LiteLLM fell back to its no-key-required placeholder. LiteLLM's Virtual Key validator rejected that synchronously with HTTP 401 before any network call, breaking every compatible-endpoint (custom-provider) onboard on Hermes. Setting an sk- prefixed placeholder satisfies the validator; OpenShell's L7 inference router strips the client Authorization header and injects the cluster-route credential at egress, so the placeholder is never seen by the upstream.
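To make the egress behaviour described above concrete, here is a minimal sketch of a header rewrite that drops the client's Authorization value and injects a route credential. Everything in it is an assumption for illustration: `rewriteAuthorization`, `HeaderMap`, the `Bearer` scheme, and the stand-in credential string are hypothetical and are not taken from OpenShell's actual router.

```typescript
// Hypothetical sketch only: these names are illustrative, not OpenShell's real API.
type HeaderMap = Record<string, string>;

// Placeholder value emitted by this PR into the Hermes config.
const PLACEHOLDER_API_KEY = "sk-OPENSHELL-PROXY-REWRITE";

// Copy every header except Authorization (case-insensitive), then inject the
// route credential. The Bearer scheme is an assumption made for this sketch.
function rewriteAuthorization(clientHeaders: HeaderMap, routeCredential: string): HeaderMap {
  const outgoing: HeaderMap = {};
  for (const [headerName, value] of Object.entries(clientHeaders)) {
    if (headerName.toLowerCase() !== "authorization") {
      outgoing[headerName] = value;
    }
  }
  outgoing["Authorization"] = `Bearer ${routeCredential}`;
  return outgoing;
}

const upstreamHeaders = rewriteAuthorization(
  { "Content-Type": "application/json", authorization: `Bearer ${PLACEHOLDER_API_KEY}` },
  "cluster-route-credential", // stand-in string, not a real secret
);
console.log(upstreamHeaders);
```

Under these assumptions the placeholder only has to satisfy the client-side check; nothing downstream of the rewrite ever receives it.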

Related Issue

Fixes #4711

Changes

  • agents/hermes/config/hermes-config.ts: emit model.api_key: "sk-OPENSHELL-PROXY-REWRITE" so LiteLLM passes its sk- prefix gate.
  • test/generate-hermes-config.test.ts: assert the placeholder field in default + Kimi-compat paths, plus a dedicated contract test for the LiteLLM gate.
  • test/e2e/test-hermes-inference-switch.sh, test/e2e/test-bedrock-runtime-compatible-anthropic.sh: assert_hermes_config validates that model.api_key is present and sk- prefixed.
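
As a rough illustration of the config change listed above, the sketch below builds a model block that always carries the placeholder key. It is not the real buildHermesConfig: `buildModelBlock`, the `HermesModelBlock` interface, and the example argument values (model name, base URL, provider) are hypothetical. Only the field names and the placeholder string come from this PR.

```typescript
// Hypothetical shape: the real builder in hermes-config.ts likely has more fields.
interface HermesModelBlock {
  default: string;
  base_url: string;
  provider: string;
  api_key: string;
}

// Non-secret placeholder from this PR; it exists only to pass the sk- prefix gate.
const OPENSHELL_PROXY_API_KEY = "sk-OPENSHELL-PROXY-REWRITE";

// Illustrative helper (not the PR's function): always emits api_key.
function buildModelBlock(model: string, baseUrl: string, provider: string): HermesModelBlock {
  return {
    default: model,
    base_url: baseUrl,
    provider,
    api_key: OPENSHELL_PROXY_API_KEY,
  };
}

// Example values are assumptions, not taken from the repository.
const modelBlock = buildModelBlock("example-model", "https://inference.local/v1", "custom");
console.log(modelBlock.api_key.startsWith("sk-"));
```

Keeping the placeholder in a single named constant would let the unit tests and the builder agree on one value, though whether the PR does so is not stated here.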

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

Summary by CodeRabbit

  • Tests

    • Enhanced configuration validation across end-to-end test suites to verify proper initialization in multiple deployment scenarios.
    • Added test coverage for configuration settings verification.
  • Chores

    • Updated configuration initialization and validation logic.

…ce.local

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot commented Jun 3, 2026

Contributor


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2fdd3f15-872a-4cfe-b29d-4500f83590fe

📥 Commits

Reviewing files that changed from the base of the PR and between 02cd77a and c9b542b.

📒 Files selected for processing (4)
  • agents/hermes/config/hermes-config.ts
  • test/e2e/test-bedrock-runtime-compatible-anthropic.sh
  • test/e2e/test-hermes-inference-switch.sh
  • test/generate-hermes-config.test.ts

📝 Walkthrough


Hermes configuration now explicitly sets a model.api_key field to a proxy rewrite placeholder value (sk-OPENSHELL-PROXY-REWRITE), with corresponding unit test assertions and E2E validation checks to ensure the field is present and correctly formatted.

Changes

Hermes API Key Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Configuration initialization: agents/hermes/config/hermes-config.ts | Hermes config builder sets model.api_key to the placeholder value "sk-OPENSHELL-PROXY-REWRITE". |
| Test coverage and E2E validation: test/generate-hermes-config.test.ts, test/e2e/test-bedrock-runtime-compatible-anthropic.sh, test/e2e/test-hermes-inference-switch.sh | Unit tests assert that the generated config includes api_key set to the OpenShell proxy rewrite placeholder (validated to be a string starting with sk- and matching OPENSHELL). E2E scripts extract and validate model.api_key from the written config alongside existing checks for model.default, model.base_url, and model.provider. |
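
The checks summarized in the table can be sketched as a small validator. This is an illustrative TypeScript stand-in, not the assert_hermes_config shell function or the PR's unit tests: `findModelBlockProblems` and its messages are hypothetical, and only the checked fields and the sk- / OPENSHELL conditions come from the summary above.

```typescript
// Hypothetical validator mirroring the checks described in the walkthrough.
function findModelBlockProblems(config: unknown): string[] {
  const model = (config as { model?: Record<string, unknown> } | null)?.model;
  if (typeof model !== "object" || model === null) {
    return ["model block is missing"];
  }

  const problems: string[] = [];

  // Pre-existing checks: these fields must be non-empty strings.
  for (const field of ["default", "base_url", "provider"]) {
    const value = model[field];
    if (typeof value !== "string" || value === "") {
      problems.push(`model.${field} is missing`);
    }
  }

  // New check from this PR: api_key must be the sk- prefixed OpenShell placeholder.
  const apiKey = model["api_key"];
  if (typeof apiKey !== "string") {
    problems.push("model.api_key is missing");
  } else {
    if (!apiKey.startsWith("sk-")) {
      problems.push("model.api_key lacks the sk- prefix");
    }
    if (!/OPENSHELL/.test(apiKey)) {
      problems.push("model.api_key is not the OpenShell placeholder");
    }
  }
  return problems;
}

// Example input; the model, URL, and provider values are made up for the sketch.
console.log(
  findModelBlockProblems({
    model: {
      default: "example-model",
      base_url: "https://inference.local/v1",
      provider: "custom",
      api_key: "sk-OPENSHELL-PROXY-REWRITE",
    },
  }),
);
```

Returning a list of problems rather than throwing on the first one is a design choice for this sketch, so a failing run reports every missing field at once.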

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A key arrives in plaintext place,
sk-OPENSHELL now graces the base.
Tests hop to validate it with care,
No more 401s haunting the air! 🔑

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately reflects the main fix: setting model.api_key placeholder to enable LiteLLM compatibility for inference.local flows in Hermes configuration. |
| Linked Issues check | ✅ Passed | The PR successfully addresses the coding requirements from issue #4711: emits sk-prefixed model.api_key placeholder in Hermes config to pass LiteLLM validation and prevents HTTP 401 fallback errors. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing the model.api_key fix: config emission, E2E test assertions, and unit test coverage. No unrelated alterations present. |


@laitingsheng added the integration: hermes (Hermes integration behavior), area: inference (Inference routing, serving, model selection, or outputs), and bug-fix (PR fixes a bug or regression) labels on Jun 3, 2026

github-actions Bot commented Jun 3, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.


github-actions Bot commented Jun 3, 2026

Contributor

E2E Advisor Recommendation

Required E2E: hermes-inference-switch-e2e, bedrock-runtime-compatible-anthropic-e2e, hermes-e2e
Optional E2E: hermes-onboard-security-posture-e2e

Dispatch hint: hermes-inference-switch-e2e,bedrock-runtime-compatible-anthropic-e2e,hermes-e2e

Auto-dispatched E2E: hermes-inference-switch-e2e, bedrock-runtime-compatible-anthropic-e2e, hermes-e2e via nightly-e2e.yaml at c9b542bca96e7df098a552291ff98fee9c595f71 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • hermes-inference-switch-e2e (high): Directly covers the changed Hermes config shape after nemohermes inference set, including config.yaml model block, hashes, inference.local chat completion, and Hermes API chat. This is the closest required gate for the new sk-prefixed api_key placeholder.
  • bedrock-runtime-compatible-anthropic-e2e (high): Directly changed by the PR and validates the Bedrock-compatible Anthropic provider path for both OpenClaw and Hermes, including Hermes config.yaml, inference.local routing, live agent/API turns, and token/hostname leak checks.
  • hermes-e2e (high): The source change affects baseline Hermes onboarding config generation. Run the standard Hermes install/onboard/health/live-inference flow to catch startup or LiteLLM authentication-gate regressions outside the inference-switch-specific path.

Optional E2E

  • hermes-onboard-security-posture-e2e (high): Optional defense-in-depth for config and credential posture: validates Hermes onboarding on a non-root host with runtime guard assertions, useful because the change adds a placeholder that must remain non-secret.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: hermes-inference-switch-e2e,bedrock-runtime-compatible-anthropic-e2e,hermes-e2e


github-actions Bot commented Jun 3, 2026

Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-hermes
Optional scenario E2E: ubuntu-repo-cloud-hermes-discord, ubuntu-repo-cloud-hermes-slack

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • ubuntu-repo-cloud-hermes: Hermes config generation changed the model block by adding an api_key placeholder. The Ubuntu repo cloud Hermes scenario directly exercises current-branch Hermes onboarding, inference.local routing, and Hermes health/config behavior.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes

Optional scenario E2E

  • ubuntu-repo-cloud-hermes-discord: Optional adjacent Hermes onboarding path with Discord messaging; useful if you want to confirm the shared Hermes config change does not regress messaging-enabled Hermes setup.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord
  • ubuntu-repo-cloud-hermes-slack: Optional adjacent Hermes onboarding path with Slack messaging; useful if you want additional coverage for the shared Hermes config change under a messaging-enabled profile.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack

Relevant changed files

  • agents/hermes/config/hermes-config.ts


github-actions Bot commented Jun 3, 2026

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26902203642
Target ref: c9b542bca96e7df098a552291ff98fee9c595f71
Workflow ref: main
Requested jobs: hermes-inference-switch-e2e,bedrock-runtime-compatible-anthropic-e2e,hermes-e2e
Summary: 3 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| bedrock-runtime-compatible-anthropic-e2e | ✅ success |
| hermes-e2e | ✅ success |
| hermes-inference-switch-e2e | ✅ success |

@cv added the v0.0.58 (Release target) label on Jun 3, 2026
@cv self-assigned this on Jun 3, 2026


Development

Successfully merging this pull request may close these issues.

[Ubuntu 24.04][Onboard][GitHub Issue #4564] non-interactive custom-provider onboard does not start model-router; sandbox cannot reach inference.local
