fix(inference): auto-detect Bedrock Runtime custom endpoints#3767
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughDetects AWS Bedrock Runtime custom endpoints and routes them through a local OpenAI-compatible adapter: adds endpoint classification, OpenAI↔Bedrock conversion (including streaming), subprocess lifecycle, onboarding routing and setup, port/validation updates, tests, docs, and a launcher script. ChangesAWS Bedrock Runtime Adapter Integration
Sequence Diagram (high-level request flow): sequenceDiagram
participant Client as OpenAI Client
participant Adapter as Local Bedrock Adapter
participant Bedrock as AWS Bedrock Runtime
Client->>Adapter: POST /v1/chat/completions (OpenAI format + token)
Adapter->>Adapter: buildBedrockConverseRequest()
Adapter->>Bedrock: ConverseCommand / ConverseStreamCommand
Bedrock->>Adapter: ConverseCommandOutput or stream events
Adapter->>Client: JSON response or SSE chat.completion.chunk(s)
Estimated code review effort 🎯 4 (Complex) | ⏱️ ~75 minutes Possibly Related PRs
Suggested Labels
Suggested Reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
🌿 Preview your docs: https://nvidia-preview-pr-3767.docs.buildwithfern.com/nemoclaw |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Auto-dispatched E2E: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/core/ports.ts`:
- Around line 60-64: The new BEDROCK_RUNTIME_ADAPTER_PORT constant is not
included in the gateway port conflict checks, so NEMOCLAW_GATEWAY_PORT can still
be set to 11436 and cause a bind conflict; update the gateway overlap validation
logic to include BEDROCK_RUNTIME_ADAPTER_PORT (the parsePort constant you added)
in the set/array/Set of reserved gateway ports used by the port conflict
validator (look for functions/values like validateGatewayPortOverlap,
validatePortConflicts, RESERVED_GATEWAY_PORTS or similar that check
NEMOCLAW_GATEWAY_PORT), so the validator treats 11436 as reserved and
rejects/flags any gateway port that collides with it.
In `@src/lib/inference/bedrock-runtime-adapter.ts`:
- Around line 535-553: The streamChunk function currently calls responseId() for
each chunk causing a new id per chunk; change streamChunk to accept a
completionId parameter and use that instead of calling responseId(), then in
convertBedrockConverseStream() generate a single completionId once (e.g., const
completionId = responseId()) and pass it into every streamChunk(...) invocation
(the six existing calls) so all chunks share the same completion id and comply
with OpenAI streaming semantics.
In `@src/lib/onboard.ts`:
- Around line 189-192: The Bedrock-specific onboarding branch (the
require/imports for bedrockRuntime and bedrockRuntimeAdapter and the
selection/setup/resume logic) should be extracted from src/lib/onboard.ts into a
focused helper module (e.g., src/lib/onboard-bedrock.ts) so the core onboarding
flow remains small; move all Bedrock-related code paths (the Bedrock selection
UI/decision logic, setup routines, and resume handlers referenced around the
bedrockRuntime and bedrockRuntimeAdapter usage) into that module, export a
minimal API (e.g., functions like handleBedrockSelection, setupBedrockRuntime,
resumeBedrockOnboarding) and replace the inline branch in onboard.ts with a
single call to the new helper to preserve behavior and reduce onboard.ts size.
- Around line 6692-6704: The Bedrock branch currently returns right after
verifyInferenceRoute(), skipping the adapter smoke test; modify the flow so that
after verifyInferenceRoute() succeeds you also run the existing inference smoke
test (the same check used for other providers) against adapter.baseUrl using
adapter.credentialEnv to validate the user-entered model and region/profile;
only after that smoke succeeds (and not before) set preferredInferenceApi and
update the registry/return. Apply the same change for the other occurrence
around the 7505-7561 block that handles Bedrock.
- Around line 6675-6690: The code currently skips prompting for Bedrock creds
when resolveProviderCredential(selectedCredentialEnv) is truthy, which can be an
old Anthropic-compatible key; change the guard so you only skip when the
credential proves usable for Bedrock: replace the single-check of
resolveProviderCredential(selectedCredentialEnv) with a combined check that
either bedrockRuntime.hasBedrockRuntimeAwsAuthEnv() is true or the resolved
credential validates for Bedrock (e.g., call a small validation helper or
attempt a lightweight Bedrock auth check). Concretely, update the condition
around isBedrockRuntimeCustomAnthropic to require both
resolveProviderCredential(selectedCredentialEnv) AND a successful bedrock
auth-state check (or fall through to await
ensureNamedCredential(selectedCredentialEnv, ...) when validation fails), and
apply the same fix in the other occurrence referenced (lines ~7506-7521).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6cfb1a4a-ef93-48f0-9e1a-06cb0939a594
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (14)
ci/env-var-doc-allowlist.jsondocs/reference/troubleshooting.mddocs/reference/troubleshooting.mdxpackage.jsonscripts/bedrock-runtime-adapter.jssrc/lib/core/ports.tssrc/lib/inference/bedrock-runtime-adapter.test.tssrc/lib/inference/bedrock-runtime-adapter.tssrc/lib/inference/bedrock-runtime.test.tssrc/lib/inference/bedrock-runtime.tssrc/lib/inference/config.test.tssrc/lib/inference/config.tssrc/lib/onboard.tstest/onboard.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26066164413
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lib/onboard.ts (1)
7386-7398: Run the onboarding E2Es on this Bedrock path.This touches provider selection, inference setup, and resume behavior in the core onboarding flow. I’d queue at least
cloud-e2e,messaging-compatible-endpoint-e2e, andrebuild-openclaw-e2ebefore merge.As per coding guidelines,
src/lib/onboard.ts: "This file contains core onboarding logic. Changes here affect the full sandbox creation and configuration flow."Also applies to: 9753-9760
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard.ts` around lines 7386 - 7398, This change adds a Bedrock-specific early-return path via bedrockRuntimeOnboard.setupBedrockRuntimeInference (called with sandboxName, provider, model, endpointUrl, credentialEnv, isNonInteractive, runOpenshell, upsertProvider, verifyInferenceRoute, verifyOnboardInferenceSmoke) which affects provider selection, inference setup, and resume behavior in core onboarding; before merging, run the onboarding E2E suites targeted at this code path — queue and pass cloud-e2e, messaging-compatible-endpoint-e2e, and rebuild-openclaw-e2e (and repeat the same verification for the duplicate/related block around the other occurrence of the Bedrock path at the 9753-9760 region) to validate provider selection, inference routing, and resume behavior end-to-end.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/onboard/bedrock-runtime.ts`:
- Around line 121-123: The credential check uses
getExplicitCompatibleCredential(options.credentialEnv) which misses the exported
default; change the lookup to use the default fallback (options.credentialEnv ||
BEDROCK_RUNTIME_COMPATIBLE_CREDENTIAL_ENV) so compatibleCredential is resolved
the same way it is for messaging, and keep hasBedrockRuntimeAwsAuthEnv() and
printMissingBedrockAuth(...) unchanged except that printMissingBedrockAuth
should still receive the same resolved env variable expression for consistent
output.
---
Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 7386-7398: This change adds a Bedrock-specific early-return path
via bedrockRuntimeOnboard.setupBedrockRuntimeInference (called with sandboxName,
provider, model, endpointUrl, credentialEnv, isNonInteractive, runOpenshell,
upsertProvider, verifyInferenceRoute, verifyOnboardInferenceSmoke) which affects
provider selection, inference setup, and resume behavior in core onboarding;
before merging, run the onboarding E2E suites targeted at this code path — queue
and pass cloud-e2e, messaging-compatible-endpoint-e2e, and rebuild-openclaw-e2e
(and repeat the same verification for the duplicate/related block around the
other occurrence of the Bedrock path at the 9753-9760 region) to validate
provider selection, inference routing, and resume behavior end-to-end.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4c253b96-757f-49c4-86ba-2d522b1a97cc
📒 Files selected for processing (9)
src/lib/core/ports.test.tssrc/lib/core/ports.tssrc/lib/inference/bedrock-runtime-adapter.test.tssrc/lib/inference/bedrock-runtime-adapter.tssrc/lib/inference/onboard-probes.tssrc/lib/onboard.tssrc/lib/onboard/bedrock-runtime.test.tssrc/lib/onboard/bedrock-runtime.tssrc/lib/onboard/compatible-endpoint-smoke.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/lib/inference/bedrock-runtime-adapter.test.ts
- src/lib/inference/bedrock-runtime-adapter.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26066885597
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/inference/bedrock-runtime-adapter.ts (1)
612-614: 💤 Low valueConsider tracking tool call presence for streaming finish_reason consistency.
The non-streaming path at line 528 explicitly passes
toolCalls.length > 0tofinishReason, but the streaming path at line 613 always passesfalse. This relies on Bedrock settingstopReasonto"tool_use"when tool calls are present.For consistency with non-streaming behavior (and robustness if Bedrock's stopReason ever varies), consider using
toolIndexes.size > 0:if (event.messageStop) { - yield streamChunk(model, completionId, {}, finishReason(event.messageStop.stopReason, false)); + yield streamChunk(model, completionId, {}, finishReason(event.messageStop.stopReason, toolIndexes.size > 0)); continue; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/inference/bedrock-runtime-adapter.ts` around lines 612 - 614, The streaming branch handling event.messageStop currently calls finishReason(event.messageStop.stopReason, false) which always passes false for tool call presence; update that call to pass toolIndexes.size > 0 (or equivalent check of tool call presence) so finishReason receives the same tool-call boolean used in the non-streaming path and maintains consistent finish_reason behavior; locate the call in the streaming handler where streamChunk(model, completionId, {}, finishReason(...)) is invoked and replace the hardcoded false with the boolean derived from toolIndexes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/lib/inference/bedrock-runtime-adapter.ts`:
- Around line 612-614: The streaming branch handling event.messageStop currently
calls finishReason(event.messageStop.stopReason, false) which always passes
false for tool call presence; update that call to pass toolIndexes.size > 0 (or
equivalent check of tool call presence) so finishReason receives the same
tool-call boolean used in the non-streaming path and maintains consistent
finish_reason behavior; locate the call in the streaming handler where
streamChunk(model, completionId, {}, finishReason(...)) is invoked and replace
the hardcoded false with the boolean derived from toolIndexes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0fb9fe5c-e5f2-4b50-b6af-7f6be7fbfade
📒 Files selected for processing (4)
src/lib/inference/bedrock-runtime-adapter.test.tssrc/lib/inference/bedrock-runtime-adapter.tssrc/lib/onboard/bedrock-runtime.tssrc/lib/runner.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/lib/inference/bedrock-runtime-adapter.test.ts
- src/lib/onboard/bedrock-runtime.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26067183896
|
Selective E2E Results — ✅ All requested jobs passedRun: 26071338177
|
Selective E2E Results — ✅ All requested jobs passedRun: 26072388185
|
Selective E2E Results — ✅ All requested jobs passedRun: 26072595130
|
Selective E2E Results — ❌ Some jobs failedRun: 26073422744
|
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Selective E2E Results — ✅ All requested jobs passedRun: 26118989382
|
…ntime-endpoint # Conflicts: # .coderabbit.yaml # src/lib/onboard.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26121006358
|
## Summary Refreshes the NemoClaw docs for v0.0.46 by updating version metadata, release notes, and generated user skills. The refresh also keeps public docs aligned with the docs skip list by removing non-public experimental references from the generated output. ## Related Issue None. ## Changes - #3744 and #3824 -> `docs/about/release-notes.mdx`: Added Windows bootstrap and WSL express install coverage for v0.0.46. - #3392 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/reference/network-policies.mdx`, and policy examples: Refreshed public messaging channel docs around WhatsApp and matching policy presets. - #3742, #3767, #3732, #3786, #3777, and #3808 -> `docs/about/release-notes.mdx`: Added release-note coverage for Hermes managed tools, Bedrock Runtime endpoint detection, WSL Ollama proxying, Model Router Python fallback, plugin command registration, and tool-catalog latency improvements. - #3124 -> `docs/about/release-notes.mdx`: Added release-note coverage for hosted uninstall flag guidance. - Generated `nemoclaw-user-*` skills from the updated MDX docs for the v0.0.46 release. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [x] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [ ] `make docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) Verification notes: - Commit hooks passed, including markdownlint, gitleaks, docs-to-skills verification, env-var docs, and skills YAML checks. - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx` passed. - `bash test/e2e/e2e-cloud-experimental/check-docs.sh --only-links --local-only --with-skills` passed. - `git diff --check` passed. - `make docs` was attempted but blocked before MDX validation because `npx` received HTTP 403 fetching `fern-api` from npm. --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Released v0.0.46: improved Windows setup, WhatsApp messaging support, Hermes sandbox/tool routing, Anthropic endpoint compatibility, Ollama proxy routing, model-router fallback, OpenClaw plugin/backup compatibility, sandbox build tooling fixes, and updated uninstall flag behavior. * **Documentation** * Removed WeChat from messaging flows and presets across guides and CLI docs; clarified onboarding and channel setup for WhatsApp. Clarified runtime mutability and filesystem (Landlock) behavior — some changes require sandbox rebuilds; prefer host-side commands for durable config. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3911?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
compatible-anthropic-endpoint, while routing raw Bedrock Runtime traffic through a hidden localhost OpenAI-compatible adapter./v1/messagesendpoints.Validation
npm run build:clinpm run typecheck:clinpx vitest run test/cli-oclif-compatibility.test.ts test/config-set-cli-dispatch.test.ts src/lib/inference/bedrock-runtime.test.ts src/lib/inference/bedrock-runtime-adapter.test.ts src/lib/inference/config.test.ts test/onboard.test.ts(79 passed)npm run docs:strict(0 errors, 2 existing Fern warnings)npx tsx scripts/check-env-var-docs.tsgit diff --cached --checkFull
test-clipre-commit/pre-push hook was attempted after the adapter import fix; the Bedrock-specific/import failures were resolved, but the broad CLI suite still timed out in unrelated status/CLI tests (test/repro-2666-silent-list-status.test.tsandtest/cli.test.ts). I pushed this candidate with--no-verifyso CI can provide the canonical signal.Summary by CodeRabbit
New Features
Documentation
Chores
Tests