feat(observability): add Prometheus metrics endpoint by HOYALIM · Pull Request #2494 · NVIDIA/NemoClaw

HOYALIM · 2026-04-27T00:04:29Z

Summary

Adds an opt-in, zero-dependency Prometheus metrics endpoint for NemoClaw plugin observability. The metrics path is disabled by default and records blueprint execution, endpoint validation, and sandbox lifecycle timings only when NEMOCLAW_METRICS_ENABLED=true.

Related Issue

Closes #233

Changes

Add a lightweight MetricsRegistry with counters, cumulative histograms, Prometheus exposition formatting, and observeOperation helpers.
Register a /metrics HTTP service from the plugin when metrics are enabled, with NEMOCLAW_METRICS_PORT and NEMOCLAW_METRICS_HOST overrides.
Instrument blueprint plan / apply, endpoint validation, and sandbox create lifecycle paths without changing default behavior.
Add plugin tests for registry behavior, HTTP request handling, server lifecycle, plugin registration, and runner instrumentation.
Update monitoring docs and regenerated agent skill content for the new metrics workflow.
Keep local npm test stable by registering installer and Brev projects only when their CI/credential gates are present, and by making existing network-listener tests tolerate sandboxed environments that reject loopback listeners.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

AI Disclosure

AI-assisted — tool: OpenAI Codex

Signed-off-by: Ho Lim subhoya@gmail.com

Summary by CodeRabbit

New Features
- Prometheus-format /metrics endpoint (disabled by default). Enable via NEMOCLAW_METRICS_ENABLED; host/port configurable (default 127.0.0.1:9090). Endpoint is unauthenticated; plugin conditionally exposes a metrics service and records blueprint, API validation, operation, and sandbox lifecycle metrics.
Documentation
- Expanded monitoring guides with enablement steps, curl example, sample metrics, security warning about binding beyond loopback, and renumbered tutorial steps.
Tests
- Added comprehensive tests for metrics registry, server, startup/shutdown, and integration.

copy-pr-bot · 2026-04-27T00:04:33Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-04-27T00:04:40Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds an optional in-process Prometheus-compatible metrics system (registry, exporter, and HTTP /metrics server), instruments blueprint execution, API validation, and sandbox lifecycle, conditionally registers a plugin service when enabled, and updates tests and docs to cover behavior and security caveats. (≤50 words)

Changes

Cohort / File(s)	Summary
Metrics Core `nemoclaw/src/observability/metrics.ts`, `nemoclaw/src/observability/metrics.test.ts`	New in-memory Prometheus-style registry: types, validation, deterministic series keys, counters & histograms, `observeOperation()` (duration + status), `renderPrometheus()`, env gate `isMetricsEnabled()`, and comprehensive unit tests.
Metrics Server `nemoclaw/src/observability/server.ts`, `nemoclaw/src/observability/server-start.test.ts`, `nemoclaw/src/observability/server.test.ts`	New minimal HTTP server layer: host/port resolution, `createMetricsRequestHandler()` (only `GET /metrics`), `startMetricsServer()` returning `{host, port, close()}` and lifecycle/error handling; tests for binding, request semantics, and error paths.
Blueprint Instrumentation `nemoclaw/src/blueprint/runner.ts`, `nemoclaw/src/blueprint/runner.test.ts`	Wraps `actionPlan`/`actionApply` with `metrics.observeOperation("blueprint_execution", ...)`, adds `validateEndpointForMetrics` for API validation metrics, splits implementations into internal `*Impl` functions, and updates tests to assert emitted metric series (success/error).
Plugin Integration `nemoclaw/src/index.ts`, `nemoclaw/src/register.test.ts`	Plugin conditionally registers `nemoclaw-metrics` service when enabled; service `start` asynchronously starts metrics server and stores handle; `stop` closes server with warnings on failure. Tests mock server lifecycle and validate registration/cleanup.
Docs / Tutorials `.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md`, `docs/monitoring/monitor-sandbox-activity.md`	Documentation updates: describe `NEMOCLAW_METRICS_ENABLED`, default bind `127.0.0.1:9090`, unauthenticated `/metrics` gotcha and security warning about exposing host, `curl` example, sample metric output, and tutorial step renumbering.
Tests / Harness & Helpers `nemoclaw/src/observability/server.test.ts`, `nemoclaw/src/observability/server-start.test.ts`, `nemoclaw/src/register.test.ts`	New and extended tests covering handler semantics (405/400/404/500), render errors logging, host/port parsing, lifecycle close behavior, and plugin registration when enabled.
Tooling, Config & Tests tweaks `nemoclaw/eslint.config.mjs`, `vitest.config.ts`, `test/legacy-path-guard.test.ts`, `src/lib/preflight.test.ts`, `test/*.{ts}`	ESLint header and TS parser tweak; conditional Vitest projects via env flags; various test robustness improvements (loopback listen guards, timeouts, spawn invocation changes), and small formatting/fixture edits across tests.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Plugin as "NemoClaw Plugin"
  participant Registry as "MetricsRegistry (in-process)"
  participant Server as "Metrics HTTP Server"

  Client->>Plugin: perform action (plan/apply) or call API
  Plugin->>Registry: observeOperation("blueprint_execution", labels, operation)
  Registry->>Registry: increment counters / observe histograms (duration)
  Plugin->>Registry: validateEndpointForMetrics(...)\nrecord api_validation
  Client->>Server: GET /metrics
  Server->>Registry: registry.renderPrometheus()
  Registry-->>Server: Prometheus text payload
  Server-->>Client: 200 + text/plain body

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested labels

enhancement: feature

Suggested reviewers

cv
ericksoa

Poem

🐰 I stitched counters with twitching paws,
Histograms hum with tidy laws,
Blueprints tick and sandboxes sing,
A tiny /metrics bell I bring.
Hop, scrape, watch — observability applause!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding a Prometheus metrics endpoint as an observability feature.
Linked Issues check	✅ Passed	The PR implementation fully addresses issue `#233` objectives: opt-in metrics via NEMOCLAW_METRICS_ENABLED, /metrics HTTP endpoint, counters and histograms for key operations, zero overhead when disabled, zero external dependencies, Prometheus text format compatibility, and sandbox lifecycle instrumentation.
Out of Scope Changes check	✅ Passed	All changes directly support the core metrics feature. Test refactoring (loopback listeners, vitest config, CLI/e2e timeouts) improves test stability for metrics testing and CI environments without introducing unrelated functionality.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

nemoclaw/eslint.config.mjs (1)

1-4: ⚠️ Potential issue | 🟠 Major

Add the required SPDX header to this source file.

This file is missing the repository-required license header block.

✅ Suggested patch

+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+// SPDX-License-Identifier: Apache-2.0
+
 import tseslint from "@typescript-eslint/eslint-plugin";
 import tsparser from "@typescript-eslint/parser";
 import prettier from "eslint-config-prettier";

As per coding guidelines **/*.{js,mjs,ts,tsx,sh,md}: Every source file must include an SPDX license header.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/eslint.config.mjs` around lines 1 - 4, Add the required repository
SPDX license header at the top of this module (above the import statements) so
the file includes an SPDX-License-Identifier line and any required
copyright/owner line; update the top of eslint.config.mjs (the module containing
the import statements tseslint, tsparser, and prettier) to include the standard
project header block exactly as used by other source files.

🧹 Nitpick comments (1)

nemoclaw/src/observability/server.ts (1)

38-53: Consider restricting to GET method for the /metrics endpoint.

The handler currently serves /metrics regardless of HTTP method. While Prometheus typically uses GET, other methods (POST, DELETE, etc.) would also receive the metrics response. This is minor since it's an internal metrics endpoint, but restricting to GET is more semantically correct.

♻️ Optional: Restrict to GET method

 export function createMetricsRequestHandler(registry: MetricsRegistry) {
   return (req: IncomingMessage, res: ServerResponse): void => {
     const url = new URL(req.url ?? "/", `http://${req.headers.host ?? "localhost"}`);
-    if (url.pathname !== "/metrics") {
+    if (req.method !== "GET" || url.pathname !== "/metrics") {
       res.writeHead(404, { "Content-Type": "text/plain; charset=utf-8" });
       res.end("not found\n");
       return;
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/observability/server.ts` around lines 38 - 53, The
createMetricsRequestHandler currently serves metrics for any HTTP method; update
the returned handler to only allow GET: inspect req.method at the top of the
function and if it's not "GET" respond with 405 Method Not Allowed (include an
"Allow: GET" header and a short plain-text body like "method not allowed\n"),
otherwise proceed to validate url.pathname and call registry.renderPrometheus()
as before; modify the handler returned by createMetricsRequestHandler (which
receives IncomingMessage and ServerResponse) to enforce this method check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/monitoring/monitor-sandbox-activity.md`:
- Around line 84-85: Add a MyST warning admonition (:::{warning}) immediately
before the sentence that recommends changing NEMOCLAW_METRICS_HOST to enable
remote scraping, stating that the /metrics endpoint is unauthenticated and
exposing it beyond localhost may leak operational metadata; include scope (which
services/hosts or networks may be affected), risk (what kind of data could be
exposed), and remediation guidance (prefer scraping over a secure network, use
firewall rules, or bind to loopback and proxy/secure the endpoint) and reference
the environment variables NEMOCLAW_METRICS_HOST and NEMOCLAW_METRICS_PORT and
the /metrics path so readers know exactly which setting and endpoint the warning
applies to.

---

Outside diff comments:
In `@nemoclaw/eslint.config.mjs`:
- Around line 1-4: Add the required repository SPDX license header at the top of
this module (above the import statements) so the file includes an
SPDX-License-Identifier line and any required copyright/owner line; update the
top of eslint.config.mjs (the module containing the import statements tseslint,
tsparser, and prettier) to include the standard project header block exactly as
used by other source files.

---

Nitpick comments:
In `@nemoclaw/src/observability/server.ts`:
- Around line 38-53: The createMetricsRequestHandler currently serves metrics
for any HTTP method; update the returned handler to only allow GET: inspect
req.method at the top of the function and if it's not "GET" respond with 405
Method Not Allowed (include an "Allow: GET" header and a short plain-text body
like "method not allowed\n"), otherwise proceed to validate url.pathname and
call registry.renderPrometheus() as before; modify the handler returned by
createMetricsRequestHandler (which receives IncomingMessage and ServerResponse)
to enforce this method check.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ff0e53b1-a33c-4999-95bb-ecdb116461cf

📥 Commits

Reviewing files that changed from the base of the PR and between 1f615e2 and 6dc2b5f.

📒 Files selected for processing (15)

.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
docs/monitoring/monitor-sandbox-activity.md
nemoclaw/eslint.config.mjs
nemoclaw/src/blueprint/runner.test.ts
nemoclaw/src/blueprint/runner.ts
nemoclaw/src/index.ts
nemoclaw/src/observability/metrics.test.ts
nemoclaw/src/observability/metrics.ts
nemoclaw/src/observability/server-start.test.ts
nemoclaw/src/observability/server.test.ts
nemoclaw/src/observability/server.ts
nemoclaw/src/register.test.ts
src/lib/preflight.test.ts
test/legacy-path-guard.test.ts
vitest.config.ts

Copilot

Pull request overview

This PR adds an opt-in Prometheus-compatible metrics endpoint to the NemoClaw plugin to improve observability of blueprint execution, endpoint validation, and sandbox lifecycle operations, while keeping metrics disabled by default.

Changes:

Introduces a zero-dependency MetricsRegistry (counters + cumulative histograms) and Prometheus text exposition.
Adds an optional /metrics HTTP server (host/port configurable via env vars) and wires it into plugin registration.
Instruments blueprint plan/apply, endpoint SSRF validation, and sandbox creation; updates tests and monitoring docs accordingly.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`vitest.config.ts`	Gates installer/Brev test projects behind CI/credentials to keep local test runs stable.
`test/legacy-path-guard.test.ts`	Runs the TS guard script via `node --import` with the TSX loader for portability.
`src/lib/preflight.test.ts`	Makes real port-occupation tests tolerant of restricted environments that block loopback listeners.
`nemoclaw/src/register.test.ts`	Adds coverage for conditional metrics service registration and ensures env/registry cleanup between tests.
`nemoclaw/src/observability/metrics.ts`	Implements metrics registry, operation timing helpers, and Prometheus rendering.
`nemoclaw/src/observability/server.ts`	Adds the `/metrics` HTTP server and env-based bind resolution.
`nemoclaw/src/observability/server*.test.ts`	Covers request handling, bind resolution, startup/shutdown, and error paths.
`nemoclaw/src/index.ts`	Registers the optional metrics service when enabled.
`nemoclaw/src/blueprint/runner.ts`	Instruments blueprint execution, endpoint validation, and sandbox create lifecycle with metrics.
`nemoclaw/src/blueprint/runner.test.ts`	Adds assertions that runner paths emit expected metrics when enabled.
`nemoclaw/eslint.config.mjs`	Adjusts TypeScript ESLint project-service settings to keep lint viable with colocated tests.
`docs/monitoring/monitor-sandbox-activity.md`	Documents enabling/scraping the Prometheus metrics endpoint and example metrics.
`.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md`	Regenerates agent skill content to include the new metrics workflow.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

coderabbitai

🧹 Nitpick comments (1)

nemoclaw/src/observability/server.ts (1)
21-31: Port validation allows privileged ports (0-1023).

This implementation accepts ports 0-65535, while nemoclaw/src/lib/ports.ts:parsePort() restricts to 1024-65535. The difference is likely intentional here: port 0 enables OS-assigned ephemeral ports (useful for testing), and the 9090 default is unprivileged. However, if consistency with the existing port utility is desired, consider documenting why this differs or aligning the ranges.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/observability/server.ts` around lines 21 - 31,
resolveMetricsPort currently validates ports 0–65535 which permits privileged
ports; update its validation to accept either 0 (for OS-assigned ephemeral ports
used in tests) or the same unprivileged range used by parsePort (1024–65535) to
keep behavior consistent; modify the check in resolveMetricsPort to allow parsed
=== 0 || (Number.isInteger(parsed) && parsed >= 1024 && parsed <= 65535), update
the thrown Error message to reflect the new allowed range, and add a brief
comment referencing parsePort and the reason 0 is allowed for clarity.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@nemoclaw/src/observability/server.ts`:
- Around line 21-31: resolveMetricsPort currently validates ports 0–65535 which
permits privileged ports; update its validation to accept either 0 (for
OS-assigned ephemeral ports used in tests) or the same unprivileged range used
by parsePort (1024–65535) to keep behavior consistent; modify the check in
resolveMetricsPort to allow parsed === 0 || (Number.isInteger(parsed) && parsed
>= 1024 && parsed <= 65535), update the thrown Error message to reflect the new
allowed range, and add a brief comment referencing parsePort and the reason 0 is
allowed for clarity.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6331d7f7-45f1-4989-be5e-24d5d6f38e27

📥 Commits

Reviewing files that changed from the base of the PR and between 6dc2b5f and 0fa4868.

📒 Files selected for processing (9)

.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
docs/monitoring/monitor-sandbox-activity.md
nemoclaw/eslint.config.mjs
nemoclaw/src/index.ts
nemoclaw/src/observability/metrics.test.ts
nemoclaw/src/observability/metrics.ts
nemoclaw/src/observability/server.test.ts
nemoclaw/src/observability/server.ts
nemoclaw/src/register.test.ts

✅ Files skipped from review due to trivial changes (1)

docs/monitoring/monitor-sandbox-activity.md

🚧 Files skipped from review as they are similar to previous changes (3)

.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
nemoclaw/src/observability/server.test.ts
nemoclaw/src/register.test.ts

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

nemoclaw/src/index.ts (1)
335-369: Extract the metrics-service branch out of register().

This block is self-contained, but keeping it inline makes register() even harder to follow alongside provider probing and the before_tool_call hook wiring. Moving it to a small registerMetricsService(api) helper would keep the entrypoint under the repo’s complexity budget and make future observability changes safer to evolve.

As per coding guidelines, TypeScript files must maintain cyclomatic complexity limit of 20, ratcheting down to 15.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/index.ts` around lines 335 - 369, The metrics-service block
inside register() is self-contained and should be moved to a small helper to
reduce cyclomatic complexity: create a function registerMetricsService(api: Api)
(or similar) that checks isMetricsEnabled(), calls api.registerService with the
same id "nemoclaw-metrics" and the existing start/stop logic that uses
startMetricsServer, MetricsServer and the logger, and then call that helper from
register(); preserve error handling and the metricsServer variable scoping by
keeping it internal to the helper so behavior is unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemoclaw/src/observability/metrics.ts`:
- Around line 210-216: The HELP text is inserted verbatim in renderPrometheus
which can produce invalid Prometheus exposition when metadata.help contains
backslashes or newlines; update renderPrometheus to escape HELP text by
replacing backslashes with double-backslashes and newlines with '\n' (or
otherwise escaping sequences Prometheus expects) before pushing `# HELP ${name}
${metadata.help}` so that the HELP line uses the escaped help string; target the
renderPrometheus method and the usage of metadata.help when building the lines
array to apply the escaping transformation.

In `@nemoclaw/src/observability/server.ts`:
- Around line 62-66: Wrap the call to registry.renderPrometheus() in a try/catch
inside the request handler so a thrown error does not escape: call
registry.renderPrometheus() in the try, send 200 headers and the body on
success, and on catch log the error (use the project logger like processLogger
or console.error) then respond with a 500 status and a safe plain-text error
body; ensure you don't send headers twice (i.e., only call res.writeHead/res.end
in the try or in the catch branch).

In `@test/legacy-path-guard.test.ts`:
- Line 12: The constant TSX_LOADER currently points to an internal file path;
change its value to the documented public module specifier by replacing the
path.join(REPO_ROOT, "node_modules", "tsx", "dist", "loader.mjs") assignment so
TSX_LOADER is simply "tsx" (update any code that imports/uses TSX_LOADER
accordingly), ensuring the module is resolved via package exports rather than a
hard-coded internal file path.

---

Nitpick comments:
In `@nemoclaw/src/index.ts`:
- Around line 335-369: The metrics-service block inside register() is
self-contained and should be moved to a small helper to reduce cyclomatic
complexity: create a function registerMetricsService(api: Api) (or similar) that
checks isMetricsEnabled(), calls api.registerService with the same id
"nemoclaw-metrics" and the existing start/stop logic that uses
startMetricsServer, MetricsServer and the logger, and then call that helper from
register(); preserve error handling and the metricsServer variable scoping by
keeping it internal to the helper so behavior is unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1a13a60d-cc91-443f-bd41-2feccd6e91d5

📥 Commits

Reviewing files that changed from the base of the PR and between 0fa4868 and eb8d3fd.

📒 Files selected for processing (15)

.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
docs/monitoring/monitor-sandbox-activity.md
nemoclaw/eslint.config.mjs
nemoclaw/src/blueprint/runner.test.ts
nemoclaw/src/blueprint/runner.ts
nemoclaw/src/index.ts
nemoclaw/src/observability/metrics.test.ts
nemoclaw/src/observability/metrics.ts
nemoclaw/src/observability/server-start.test.ts
nemoclaw/src/observability/server.test.ts
nemoclaw/src/observability/server.ts
nemoclaw/src/register.test.ts
src/lib/preflight.test.ts
test/legacy-path-guard.test.ts
vitest.config.ts

✅ Files skipped from review due to trivial changes (4)

nemoclaw/eslint.config.mjs
nemoclaw/src/observability/server.test.ts
docs/monitoring/monitor-sandbox-activity.md
.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md

🚧 Files skipped from review as they are similar to previous changes (4)

nemoclaw/src/observability/metrics.test.ts
vitest.config.ts
src/lib/preflight.test.ts
nemoclaw/src/blueprint/runner.ts

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

nemoclaw/src/blueprint/runner.ts (1)

293-307: ⚠️ Potential issue | 🟡 Minor

Double validation with misleading source label when CLI override is used.

When --endpoint-url is provided, the endpoint is validated twice:

Line 294: with source="override" ✓
Line 304: with source="blueprint" — but this validates the same overridden URL, not the blueprint's original endpoint

This results in misleading metrics and redundant validation work.

♻️ Suggested fix: skip second validation when override was applied

   let inferenceCfg = { ...inferenceProfiles[profile] };
   if (endpointUrl) {
     const validated = await validateEndpointForMetrics(endpointUrl, "override");
     // Use DNS-pinned URL for HTTP (full SSRF/rebinding protection). For HTTPS,
     // keep the original hostname — TLS certificate validation prevents rebinding
     // since the attacker cannot present a valid cert for the target.
     const safe = endpointUrl.startsWith("https:") ? validated.url : validated.pinnedUrl;
     inferenceCfg = { ...inferenceCfg, endpoint: safe };
-  }
-
-  // Validate the final endpoint (whether from CLI override or blueprint profile)
-  if (inferenceCfg.endpoint) {
+  } else if (inferenceCfg.endpoint) {
+    // Validate the blueprint endpoint (skip if CLI override already validated)
     const validated = await validateEndpointForMetrics(inferenceCfg.endpoint, "blueprint");
     const safe = inferenceCfg.endpoint.startsWith("https:") ? validated.url : validated.pinnedUrl;
     inferenceCfg = { ...inferenceCfg, endpoint: safe };
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/blueprint/runner.ts` around lines 293 - 307, The code performs
validateEndpointForMetrics twice when a CLI override (endpointUrl) is supplied,
causing redundant validation and a misleading "blueprint" source; modify the
logic around validateEndpointForMetrics and inferenceCfg so that when
endpointUrl is provided and used to set inferenceCfg.endpoint you skip the
subsequent "blueprint" validation—either by changing the second conditional to
only run when no CLI override was applied (e.g., if (!endpointUrl &&
inferenceCfg.endpoint)) or by using a boolean flag (e.g., overrideApplied) set
after validating endpointUrl; ensure references to validateEndpointForMetrics,
inferenceCfg, endpointUrl, and the "override"/"blueprint" source labels are
adjusted so metrics reflect the true source and the endpoint is validated only
once.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@vitest.config.ts`:
- Around line 10-11: The current boolean runBrevE2e var is too permissive;
change its logic to require both the secret and an explicit opt-in (or CI) so
cloud E2E won't run just because the token exists—replace the definition of
runBrevE2e with a compound check like: token present AND (explicit opt-in env
var e.g. BREV_E2E === '1' OR process.env.CI === 'true'), and apply the same
stricter gating to the other related checks around the e2e-brev suite referenced
in the block at lines 49-59 (update any uses of runBrevE2e or duplicate token
checks to use this new opt-in+token logic).

---

Outside diff comments:
In `@nemoclaw/src/blueprint/runner.ts`:
- Around line 293-307: The code performs validateEndpointForMetrics twice when a
CLI override (endpointUrl) is supplied, causing redundant validation and a
misleading "blueprint" source; modify the logic around
validateEndpointForMetrics and inferenceCfg so that when endpointUrl is provided
and used to set inferenceCfg.endpoint you skip the subsequent "blueprint"
validation—either by changing the second conditional to only run when no CLI
override was applied (e.g., if (!endpointUrl && inferenceCfg.endpoint)) or by
using a boolean flag (e.g., overrideApplied) set after validating endpointUrl;
ensure references to validateEndpointForMetrics, inferenceCfg, endpointUrl, and
the "override"/"blueprint" source labels are adjusted so metrics reflect the
true source and the endpoint is validated only once.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dfc5eb79-5961-4739-8586-7ec4c528008c

📥 Commits

Reviewing files that changed from the base of the PR and between eb8d3fd and df0181b.

📒 Files selected for processing (15)

.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
docs/monitoring/monitor-sandbox-activity.md
nemoclaw/eslint.config.mjs
nemoclaw/src/blueprint/runner.test.ts
nemoclaw/src/blueprint/runner.ts
nemoclaw/src/index.ts
nemoclaw/src/observability/metrics.test.ts
nemoclaw/src/observability/metrics.ts
nemoclaw/src/observability/server-start.test.ts
nemoclaw/src/observability/server.test.ts
nemoclaw/src/observability/server.ts
nemoclaw/src/register.test.ts
src/lib/preflight.test.ts
test/legacy-path-guard.test.ts
vitest.config.ts

✅ Files skipped from review due to trivial changes (3)

docs/monitoring/monitor-sandbox-activity.md
nemoclaw/src/register.test.ts
nemoclaw/src/observability/metrics.ts

🚧 Files skipped from review as they are similar to previous changes (3)

nemoclaw/src/observability/server.test.ts
nemoclaw/src/blueprint/runner.test.ts
nemoclaw/src/observability/server-start.test.ts

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

nemoclaw/src/blueprint/runner.ts (1)

293-305: ⚠️ Potential issue | 🟠 Major

Don’t count an override endpoint as blueprint validation.

With --endpoint-url, Line 294 records source="override", then Lines 303-305 immediately revalidate the same effective endpoint as source="blueprint". That doubles api_validation and mislabels override traffic, which will skew the success-rate metrics this PR is adding.

Suggested fix

   let inferenceCfg = { ...inferenceProfiles[profile] };
   if (endpointUrl) {
     const validated = await validateEndpointForMetrics(endpointUrl, "override");
     // Use DNS-pinned URL for HTTP (full SSRF/rebinding protection). For HTTPS,
     // keep the original hostname — TLS certificate validation prevents rebinding
     // since the attacker cannot present a valid cert for the target.
     const safe = endpointUrl.startsWith("https:") ? validated.url : validated.pinnedUrl;
     inferenceCfg = { ...inferenceCfg, endpoint: safe };
-  }
-
-  // Validate the final endpoint (whether from CLI override or blueprint profile)
-  if (inferenceCfg.endpoint) {
+  } else if (inferenceCfg.endpoint) {
     const validated = await validateEndpointForMetrics(inferenceCfg.endpoint, "blueprint");
     const safe = inferenceCfg.endpoint.startsWith("https:") ? validated.url : validated.pinnedUrl;
     inferenceCfg = { ...inferenceCfg, endpoint: safe };
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/blueprint/runner.ts` around lines 293 - 305, The code is
revalidating an endpoint override as if it came from a blueprint, causing
duplicate/mislabelled metrics; update the block that calls
validateEndpointForMetrics for the final endpoint so it only runs when there was
no CLI override (i.e., only when endpointUrl is falsy). Concretely, in runner.ts
adjust the condition around validateEndpointForMetrics/inferenceCfg.endpoint so
that if endpointUrl is provided you skip the second call (or otherwise keep
using the already-validated value), referencing validateEndpointForMetrics,
endpointUrl, and inferenceCfg.endpoint to locate and change the logic.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemoclaw/src/blueprint/runner.ts`:
- Around line 339-340: The metrics call in metrics.observeOperation that creates
the "blueprint_execution" series is using the user-controlled label profile
which causes unbounded cardinality; remove profile from the label set passed to
metrics.observeOperation (keep only bounded labels like action and status) where
observeOperation("blueprint_execution", { action: "plan", profile }, () =>
actionPlanImpl(...)) is invoked, and instead ensure the profile value remains
recorded in logs or the plan output (e.g., add it to existing logging around
actionPlanImpl or the returned plan) so you do not lose the profile context.

In `@test/cli.test.ts`:
- Line 310: The per-test Jest timeout is still 15s while runWithEnv("debug
--quick 2>&1", { HOME: home }, 30000) (and the similar call at the other
location) allows the subprocess 30s; update the enclosing tests to increase
their Jest timeout to at least 30000ms (either by adding a timeout argument to
the test(...) call or calling jest.setTimeout(30000) for these tests) so the
test harness won't abort early when runWithEnv takes up to 30s.

In `@test/http-proxy-fix-e2e.test.ts`:
- Around line 274-275: Remove the invalid eslint-disable comments and eliminate
the non-null assertions by adding explicit null checks for the mock object
before accessing its properties (replace occurrences of "mock!.port" and other
"mock!" usages). For each place that currently uses mock! (referenced as mock
and its port property), remove the eslint comment and add a guard such as if
(!mock) throw new Error("expected mock to be initialized") or use a test
assertion like expect(mock).toBeDefined() before accessing mock.port so the code
no longer relies on the non-null assertion operator.

---

Outside diff comments:
In `@nemoclaw/src/blueprint/runner.ts`:
- Around line 293-305: The code is revalidating an endpoint override as if it
came from a blueprint, causing duplicate/mislabelled metrics; update the block
that calls validateEndpointForMetrics for the final endpoint so it only runs
when there was no CLI override (i.e., only when endpointUrl is falsy).
Concretely, in runner.ts adjust the condition around
validateEndpointForMetrics/inferenceCfg.endpoint so that if endpointUrl is
provided you skip the second call (or otherwise keep using the already-validated
value), referencing validateEndpointForMetrics, endpointUrl, and
inferenceCfg.endpoint to locate and change the logic.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9898b24e-fbdb-461b-80b4-c9b663f06837

📥 Commits

Reviewing files that changed from the base of the PR and between df0181b and ba186cf.

📒 Files selected for processing (18)

.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
docs/monitoring/monitor-sandbox-activity.md
nemoclaw/eslint.config.mjs
nemoclaw/src/blueprint/runner.test.ts
nemoclaw/src/blueprint/runner.ts
nemoclaw/src/index.ts
nemoclaw/src/observability/metrics.test.ts
nemoclaw/src/observability/metrics.ts
nemoclaw/src/observability/server-start.test.ts
nemoclaw/src/observability/server.test.ts
nemoclaw/src/observability/server.ts
nemoclaw/src/register.test.ts
src/lib/preflight.test.ts
test/cli.test.ts
test/http-proxy-fix-e2e.test.ts
test/legacy-path-guard.test.ts
test/runtime-shell.test.ts
vitest.config.ts

✅ Files skipped from review due to trivial changes (3)

nemoclaw/eslint.config.mjs
docs/monitoring/monitor-sandbox-activity.md
src/lib/preflight.test.ts

🚧 Files skipped from review as they are similar to previous changes (4)

nemoclaw/src/observability/server.test.ts
nemoclaw/src/observability/server-start.test.ts
.agents/skills/nemoclaw-user-monitor-sandbox/SKILL.md
vitest.config.ts

Signed-off-by: Ho Lim <subhoya@gmail.com>

wscurran · 2026-04-27T15:06:36Z

✨ Thanks for submitting this pull request that proposes a way to add observability metrics to NemoClaw. This identifies an enhancement to the plugin's capabilities and proposes a change to include a Prometheus metrics endpoint.

Related open issues:

#233 Proposal: Lightweight observability and Prometheus-compatible metrics service

jyaunches · 2026-05-20T18:02:30Z

Thanks for the implementation. We do not think we want NemoClaw core to accumulate one-off integrations/exporters like Prometheus directly.

The direction we prefer is a generic observability plugin extension point where core emits stable lifecycle/operation events, and external plugins can provide adapters such as Prometheus, OpenTelemetry, Datadog, etc.

We are going to close this PR in favor of tracking that architecture explicitly. A Prometheus adapter would be a good first consumer once that extension point exists, but we do not think the Prometheus HTTP endpoint should live directly in the main repo/core path.

jyaunches · 2026-05-20T18:02:36Z

Closing in favor of #3915, which tracks the generic observability plugin support path.

Copilot AI review requested due to automatic review settings April 27, 2026 00:04

Copilot started reviewing on behalf of HOYALIM April 27, 2026 00:04 View session