fix(onboard): force chat completions for compatible-endpoint providers #1984
📝 Walkthrough
When validating custom OpenAI‑compatible endpoints, onboarding reads NEMOCLAW_PREFERRED_API and, unless that env var is explicitly one of
Sequence Diagram(s)
sequenceDiagram
autonumber
participant CLI as "CLI (nemoclaw)"
participant Wizard as "Onboard Wizard\n(src/lib/onboard.ts)"
participant Endpoint as "Compatible Endpoint\n(/v1/responses)"
participant Env as "Env / Config\n(NEMOCLAW_PREFERRED_API)"
CLI->>Wizard: start onboard
Wizard->>Endpoint: probe /v1/responses (validation)
Endpoint-->>Wizard: returns valid response (validation.api)
Wizard->>Env: read NEMOCLAW_PREFERRED_API
alt env is "openai-completions" or "chat-completions"
Wizard->>Wizard: preserve validation.api as preferredInferenceApi
Wizard-->>CLI: proceed with validation.api
else env unset or other value
Wizard->>Wizard: set preferredInferenceApi = "openai-completions"
alt validation.api != "openai-completions"
Wizard-->>CLI: log informational message about forcing completions
end
end
Wizard-->>CLI: output preferredInferenceApi in wizard output
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 3343-3351: The comment claims users can override the chosen API
with NEMOCLAW_PREFERRED_API but the code unconditionally sets
preferredInferenceApi to "openai-completions"; update the assignment to honor
the env var: check process.env.NEMOCLAW_PREFERRED_API and, if present, use that
value for preferredInferenceApi (e.g., "openai-responses"), otherwise fall back
to "openai-completions"; keep the existing validation.api check and log message
around it (symbols: preferredInferenceApi, validation.api,
NEMOCLAW_PREFERRED_API).
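The selection rule this comment asks for can be sketched as a small pure function. This is a minimal illustration, not the code in `src/lib/onboard.ts`: the function name `choosePreferredInferenceApi`, the `ApiChoice` shape, and the wording of the notice are hypothetical, and the exact string comparison against `openai-responses` is an assumption based on the later commit messages.

```typescript
// Hypothetical sketch. Only NEMOCLAW_PREFERRED_API, preferredInferenceApi,
// and the "openai-completions" / "openai-responses" values come from the PR.
type Env = Record<string, string | undefined>;

interface ApiChoice {
  preferredInferenceApi: string;
  // Informational message to log when the probe result was overridden.
  notice?: string;
}

function choosePreferredInferenceApi(validationApi: string, env: Env): ApiChoice {
  const requested = env["NEMOCLAW_PREFERRED_API"]?.trim();

  // Explicit opt-in: the user asserts that the backend handles the
  // Responses API, so keep whatever the validation probe selected.
  if (requested === "openai-responses") {
    return { preferredInferenceApi: validationApi };
  }

  // Default: force chat completions, and emit a notice only when that
  // actually changes the outcome of the probe.
  const forced = "openai-completions";
  if (validationApi !== forced) {
    return {
      preferredInferenceApi: forced,
      notice: `Probe selected ${validationApi}; using ${forced} for compatible endpoints.`,
    };
  }
  return { preferredInferenceApi: forced };
}
```

In a Node.js CLI, `process.env` could be passed as the second argument. Taking the environment as a parameter keeps the rule easy to test for both the default and the opt-in path.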
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: e4f38e50-d1b1-48b0-b0e7-03c7a4568fa1
📒 Files selected for processing (2)
src/lib/onboard.ts
test/onboard-selection.test.ts
The compatible-endpoint code path accepted whatever API mode the validation probe returned. When backends like Ollama (v0.20+) expose a working /v1/responses endpoint, the wizard selected openai-responses mode. The Responses API sends the system prompt using the `developer` role, which many OpenAI-compatible backends (Ollama, vLLM, LiteLLM) either silently drop or handle incorrectly, causing the model to receive no tool definitions and no system prompt.

The ollama-local and vllm-local paths already forced openai-completions. This commit applies the same override to compatible-endpoint.

Forcing openai-completions is safe for compatible-endpoint because:
- Users pointing at actual OpenAI would use the dedicated openai-api provider, not compatible-endpoint
- Every local/proxy backend tested (Ollama, vLLM, LiteLLM, NIM) has Responses API issues, while openai-completions works universally
- The NEMOCLAW_PREFERRED_API=openai-responses env var override remains available for users who explicitly need the Responses API

Closes #1932
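To make the failure mode concrete, here is a rough sketch of how the same system prompt travels in each API mode and what a backend that ignores the `developer` role would see. The builder functions and `dropUnknownRoles` are hypothetical illustrations, not code from this repository or from any particular backend; only the `system` and `developer` roles and the two API modes come from the commit message.

```typescript
// Hypothetical sketch of the two request shapes discussed above.
interface Message {
  role: string;
  content: string;
}

// Chat completions mode: the system prompt uses the "system" role.
function buildChatCompletionsRequest(model: string, systemPrompt: string, userText: string) {
  return {
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userText },
    ] as Message[],
  };
}

// Responses mode: the system prompt uses the "developer" role.
function buildResponsesRequest(model: string, systemPrompt: string, userText: string) {
  return {
    model,
    input: [
      { role: "developer", content: systemPrompt },
      { role: "user", content: userText },
    ] as Message[],
  };
}

// Stand-in for a backend that silently discards roles it does not know.
function dropUnknownRoles(messages: Message[]): Message[] {
  const known = new Set(["system", "user", "assistant"]);
  return messages.filter((m) => known.has(m.role));
}

const chat = buildChatCompletionsRequest("example-model", "You may call tools.", "hello");
const responses = buildResponsesRequest("example-model", "You may call tools.", "hello");

// The chat completions request keeps its system prompt after filtering;
// the Responses request is left with only the user message.
const survivingChat = dropUnknownRoles(chat.messages);
const survivingResponses = dropUnknownRoles(responses.input);
```

Under this assumed backend behavior, `survivingChat` still contains the system prompt while `survivingResponses` does not, which matches the reported symptom of a model that never sees its instructions.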
…dpoint
Address CodeRabbit review: the previous commit claimed users could override via NEMOCLAW_PREFERRED_API=openai-responses, but the code unconditionally forced openai-completions. The code now checks the env var and respects an explicit user preference while still defaulting to openai-completions for safety.
…ible-endpoint
Verifies that setting NEMOCLAW_PREFERRED_API=openai-responses bypasses the forced-completions override, proving the escape hatch works for users who know their backend supports the Responses API.
f868bdd to
4584292
Compare
ericksoa
left a comment
Clean fix — forces chat completions for compatible-endpoint providers where the Responses API developer role is unreliable, with a sensible env var escape hatch. Good test coverage for both paths. LGTM.
Refresh user-facing docs against the 34 commits merged between v0.0.17 and v0.0.18.

Highlights:
- Replace the Ollama 0.0.0.0 binding guidance with the new authenticated reverse proxy on 127.0.0.1:11435 (#1922).
- Document the compatible-endpoint provider defaulting to /v1/chat/completions and the NEMOCLAW_PREFERRED_API=openai-responses opt-in (#1984).
- Add the new nemoclaw upgrade-sandboxes command with --check, --auto, and --yes flags (#1943).
- Note the cross-sandbox messaging overlap warning and 409 detection in nemoclaw <name> status (#1953).
- Document the messaging-token rotation auto-rebuild flow (#1967).
- Cover new troubleshooting entries for the Ollama auth proxy, IPv6 localhost resolution, orphan SSH port-forward cleanup on re-onboard, and rotated messaging credentials (#1978, #1950).
- Note tar failure exit code for nemoclaw debug --output (#1770) and the orphaned openshell process cleanup in nemoclaw uninstall (#1940).

Also:
- Extend docs/.docs-skip to exclude the experimental sandbox-mgmt shields and config commands (#1976).
- Fix a sphinx-autobuild infinite rebuild loop in docs/conf.py by writing docs/project.json only when its contents change.
- Bump the docs version switcher preferred entry to 0.0.18.
- Regenerate nemoclaw-user-* agent skills from docs/.

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Made-with: Cursor
## Summary

Refresh user-facing documentation against the 34 commits merged between v0.0.17 and v0.0.18, bump the docs version switcher to v0.0.18, and fix a `sphinx-autobuild` infinite-rebuild loop triggered by `docs/conf.py`.

## Changes

- **Ollama authenticated reverse proxy** (#1922): Replace the `0.0.0.0:11434` guidance in `docs/inference/use-local-inference.md` with the new token-gated proxy on `127.0.0.1:11435`, including persisted token, health-check exemption, and sandbox provider wiring. Replace the matching troubleshooting entry in `docs/reference/troubleshooting.md`.
- **Compatible-endpoint default API path** (#1984): Document that the compatible-endpoint provider now defaults to `/v1/chat/completions` and update `NEMOCLAW_PREFERRED_API` to describe `openai-responses` as the opt-in instead of `openai-completions`. Updates in `use-local-inference.md`, `switch-inference-providers.md`, and `troubleshooting.md`.
- **`nemoclaw upgrade-sandboxes` command** (#1943): Add a new reference entry in `docs/reference/commands.md` covering `--check`, `--auto`, and `--yes` flags.
- **Messaging token rotation auto-rebuild** (#1967, #1953): Note the automatic rebuild behavior and cross-sandbox overlap warning in `docs/deployment/set-up-telegram-bridge.md`, `commands.md`, and `troubleshooting.md`.
- **Other troubleshooting additions**:
  - `localhost` → `127.0.0.1` IPv6 note (#1978)
  - Orphan SSH port-forward cleanup on re-onboard (#1950)
  - Orphan `openshell` process cleanup in `nemoclaw uninstall` (#1940)
  - Non-zero exit on tar failure in `nemoclaw debug --output` (#1770)
- **Skip list**: Extend `docs/.docs-skip` to exclude the experimental sandbox-mgmt shields and config commands feature (#1976), which was explicitly merged as not-yet-documented.
- **Build stability**: `docs/conf.py` now writes `docs/project.json` only when contents change, so `make docs-live` / `sphinx-autobuild` no longer detects its own generated file as a source change and enters an infinite rebuild loop.
- **Version switcher**: Bump `docs/versions1.json` and `docs/project.json` preferred entry to v0.0.18 so this refresh renders under the new version.
- **Agent skills**: Regenerate `nemoclaw-user-*` skills from `docs/` with `scripts/docs-to-skills.py`.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes (ran via pre-commit hook on staged files)
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure

- [x] AI-assisted — tool: Cursor

---

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * Added `nemoclaw upgrade-sandboxes` command to rebuild sandboxes when base-image digests change.
  * Introduced authenticated reverse proxy for local Ollama inference with token-based access control.
  * Automatic sandbox backup, recreation, and restore when messaging credentials are updated.
  * Cross-sandbox messaging token overlap detection with status warnings.
* **Improvements**
  * Compatible-endpoint provider now defaults to `/v1/chat/completions` API path.
  * Enhanced troubleshooting documentation with new diagnostics sections.
* **Documentation**
  * Updated onboarding and configuration guides.
  * Expanded version documentation to 0.0.18.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
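The build-stability fix noted under Changes rests on a write-only-if-changed pattern. The real change lives in the Python file `docs/conf.py`; the sketch below only illustrates the same idea in TypeScript, and the function name `writeJsonIfChanged` and the serialization format are assumptions.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: serialize `data` and write it to `path` only when the
// bytes on disk would differ. Returns true when a write actually happened.
function writeJsonIfChanged(path: string, data: unknown): boolean {
  const next = JSON.stringify(data, null, 2) + "\n";

  let current: string | null = null;
  try {
    current = fs.readFileSync(path, "utf8");
  } catch {
    // A missing or unreadable file is treated as "changed".
    current = null;
  }

  if (current === next) {
    // Skipping the write leaves the file's mtime untouched, so a file
    // watcher has no new change event to react to.
    return false;
  }

  fs.writeFileSync(path, next);
  return true;
}
```

Because an unchanged file is never rewritten, a watcher that rebuilds on file changes is not retriggered by output that the build itself generates.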
Bug

When a user selects "Other OpenAI-compatible endpoint" and points it at Ollama (v0.20+), the onboard wizard probes `/v1/responses`, finds it working, and selects `openai-responses` mode. That mode sends the system prompt using the `developer` role, which many backends silently drop: the model receives no tool definitions and no system prompt, and all tool use fails silently.

Fix

Force `openai-completions` API mode for the `compatible-endpoint` provider path during onboard, matching the existing behavior of `ollama-local` and `vllm-local`. Honor `NEMOCLAW_PREFERRED_API=openai-responses` as an explicit opt-in for users who know their backend supports it.

Why forcing `openai-completions` is safe for all compatible endpoints

- Users pointing at actual OpenAI would use the dedicated `openai-api` provider, not `compatible-endpoint`
- Every local/proxy backend tested (Ollama, vLLM, LiteLLM, NIM) has Responses API issues, while `openai-completions` works universally with the `system` role
- The `NEMOCLAW_PREFERRED_API=openai-responses` env var override remains available for users who explicitly need the Responses API

Reproduction

Confirmed on DGX Spark (Ollama 0.20.7, nemotron-3-super:120b) using an isolated Docker-in-Docker container:

- The `/v1/responses` endpoint responds successfully, so the wizard probe passes
- The wizard selects `openai-responses` mode for `compatible-endpoint`
- The `ollama-local` path correctly forces `openai-completions`, but `compatible-endpoint` did not

Changes

- `src/lib/onboard.ts`: Override `preferredInferenceApi` to `openai-completions` in the `compatible-endpoint` validation block, with an informational log message when the override fires. Honor the `NEMOCLAW_PREFERRED_API` env var as an explicit opt-in escape hatch.
- `test/onboard-selection.test.ts`: Two new tests, one verifying the forced-completions default and one verifying the env var override path.

Test plan

- `npx vitest run --project cli test/onboard-selection.test.ts`: 32/32 pass
- `compatible-endpoint` → Ollama confirms `openai-completions` selected

Closes #1932