Skip to content

inference set: switching to nvidia-prod/nvidia-nim/gemini-api inherits a stale openai-responses API and 404s every request #5239

@latenighthackathon

Description

@latenighthackathon

Description

nemoclaw <sandbox> inference set --provider nvidia-prod <model> (switching from a different provider) can configure the sandbox with api: "openai-responses" even though NVIDIA Build / NIM / Gemini do not expose /v1/responses. The agent then gets a hard 404 ("page not found") on every turn. The switch path runs no validation probe, so the misconfiguration is silent until first use.

Root cause

On a provider switch, resolveRuntimeInferenceApi (src/lib/actions/inference-route-api.ts) returns null because both its session and config branches are gated on currentProvider === provider. The caller runInferenceSet (src/lib/actions/inference-set.ts) then falls back to getPreferredInferenceApi(config), which reads models.providers.inference.api — the shared managed-provider block. If the previously-active provider was a compatible endpoint validated as openai-responses, that value is still there and is carried into the switch.

getSandboxInferenceConfig (src/lib/inference/config.ts) does not reset inferenceApi for nvidia-prod / nvidia-nim / gemini-api (unlike anthropic-prod, which forces anthropic-messages), so buildProviderConfig writes api: "openai-responses" for a provider that shouldSkipResponsesProbe (src/lib/validation.ts) already documents has no /v1/responses.

Steps to reproduce

  1. Configure a sandbox with a compatible endpoint that validates as the Responses API (so models.providers.inference.api becomes openai-responses).
  2. nemoclaw <sandbox> inference set --provider nvidia-prod <model>.
  3. Inspect the generated openclaw.json: channels/models.providers.inference.api is openai-responses.
  4. Send a turn: every request 404s against NVIDIA Build, which has no /v1/responses.

Expected

Switching to a provider without a /v1/responses endpoint always configures openai-completions, regardless of the prior provider's persisted api.

Fix

In getSandboxInferenceConfig, force inferenceApi = "openai-completions" when shouldSkipResponsesProbe(provider) is true (the canonical no-responses set: nvidia-prod, nvidia-nim, gemini-api), mirroring how anthropic-prod forces its api.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: cliCommand line interface, flags, terminal UX, or outputarea: inferenceInference routing, serving, model selection, or outputsarea: routingRequest routing, policy routing, model selection, or fallback logicprovider: nvidiaNVIDIA inference endpoint, NIM, or NVIDIA provider behavior

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions