security: redact secret patterns from CLI error output by BenediktSchackenberg · Pull Request #1247 · NVIDIA/NemoClaw

BenediktSchackenberg · 2026-04-01T10:35:25Z

Implements and finalizes the approach from #794 on top of current main.

Summary

add redactSecrets() in bin/lib/runner.js
redact known secret patterns in run(), runInteractive(), and thrown runCapture() errors
add focused coverage in test/runner-redact.test.js

Test plan

./node_modules/.bin/vitest run test/runner-redact.test.js test/runner.test.js

Refs #794

Summary by CodeRabbit

New Features
- Error messages, logs, and captured command output now automatically redact sensitive credentials and API tokens (including various token formats, quoted values, truncated tokens) when commands fail. Redaction is applied consistently to stderr/stdout and handles null/undefined inputs without throwing; repeated calls produce deterministic redaction.
Tests
- Added unit and integration tests validating redaction across secret formats, quoting styles, multiple secrets, truncation/regression cases, and error-capture scenarios.

coderabbitai · 2026-04-01T10:35:50Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a redactSecrets helper that coerces input to string and redacts credential-like substrings (KEY=... and nvapi-* tokens). Integrates redaction into runner logging and error paths; new tests exercise unit cases and integration scenarios using spawned Node processes.

Changes

Cohort / File(s)	Summary
Core Secret Redaction `bin/lib/runner.js`	Adds `redactSecrets(str)` and exports it. Redacts env-style `KEY=value` (unquoted, single- or double-quoted) for known keys and truncates bare `nvapi-*` tokens. Uses redacted output when printing failing commands (first 80 chars) in `run()`/`runInteractive()` and mutates thrown errors in `runCapture()` to sanitize `err.message`, `err.stderr`, `err.stdout`, and string chunks in `err.output` (unless `ignoreError`).
Test Coverage `test/runner-redact.test.js`	New Vitest suite for `redactSecrets`: unit tests for varied patterns (multiple keys, quoted/unquoted, multiple secrets, empty/null, determinism) and integration tests that spawn Node subprocesses exercising `run`, `runInteractive`, and `runCapture` error paths, asserting secrets are redacted (including truncated-quote regression).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I hop through logs to hide what’s sly,
I tuck the keys where prying eyes can’t spy.
Three stars I place where secrets used to be,
A quiet nibble, neat as can be.
Hop, redact—then nibble happily.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly and clearly identifies the main change: adding secret redaction functionality to CLI error output for security purposes.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR adds a small redaction layer to the CLI runner utilities to prevent accidental credential leakage in command failure output, and introduces focused Vitest coverage to validate the redaction behavior.

Changes:

Add redactSecrets() to bin/lib/runner.js and export it for reuse/tests.
Apply redaction to run() / runInteractive() failure messages and to runCapture() thrown error fields.
Add test/runner-redact.test.js to cover keyed secrets, bare nvapi- tokens, and an integration check for run() stderr.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`bin/lib/runner.js`	Introduces `redactSecrets()` and uses it when emitting runner error output / throwing `runCapture()` errors.
`test/runner-redact.test.js`	Adds unit + integration tests to validate redaction for several secret formats and usage paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

test/runner-redact.test.js (1)

100-117: Add regression tests for the other modified call paths.

This integration test validates run(), but this PR also changed runInteractive() and runCapture() redaction behavior. Add focused assertions for those two paths so future refactors don’t regress them.

Suggested additional tests

+  it("runInteractive() error output redacts secrets (integration)", () => {
+    const runnerPath = path.join(import.meta.dirname, "..", "bin", "lib", "runner");
+    const script = `
+      const { runInteractive } = require(${JSON.stringify(runnerPath)});
+      process.exit = () => {};
+      runInteractive("false GITHUB_TOKEN=ghp_realSecretValue123");
+    `;
+    const result = spawnSync("node", ["-e", script], {
+      cwd: path.join(import.meta.dirname, ".."),
+      encoding: "utf-8",
+      timeout: 10000,
+    });
+    expect(result.stderr).toContain("GITHUB_TOKEN=***");
+    expect(result.stderr).not.toContain("ghp_realSecretValue123");
+  });
+
+  it("runCapture() thrown error redacts message/stdout/stderr", () => {
+    const runnerPath = path.join(import.meta.dirname, "..", "bin", "lib", "runner");
+    const script = `
+      const { runCapture } = require(${JSON.stringify(runnerPath)});
+      try {
+        runCapture("bash -lc 'echo NVIDIA_API_KEY=nvapi-realSecret >&2; exit 1'");
+      } catch (err) {
+        process.stdout.write(String(err.message) + "\\n");
+        process.stdout.write(String(err.stderr || "") + "\\n");
+      }
+    `;
+    const result = spawnSync("node", ["-e", script], {
+      cwd: path.join(import.meta.dirname, ".."),
+      encoding: "utf-8",
+      timeout: 10000,
+    });
+    expect(result.stdout).toContain("NVIDIA_API_KEY=***");
+    expect(result.stdout).not.toContain("nvapi-realSecret");
+  });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/runner-redact.test.js` around lines 100 - 117, Add integration
assertions for the other changed call paths by adding tests similar to the
existing run() test that spawn a child requiring the same runner module (use the
same runnerPath) and invoke runInteractive(...) and runCapture(...) with
commands containing a secret token (e.g.,
"NVIDIA_API_KEY=nvapi-realSecretValue123"); capture the child's stderr/stdout
and assert that outputs contain the redacted form "NVIDIA_API_KEY=***" and do
not contain the raw secret "nvapi-realSecretValue123". Locate tests around the
existing "run() error output redacts secrets (integration)" test and create two
focused specs that call runInteractive and runCapture respectively, ensuring you
override process.exit in the child script as done now and keep the same
spawnSync options (cwd, encoding, timeout).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/runner.js`:
- Line 47: The command preview is truncated before redaction which can leak
partial secret tokens; change both occurrences that call
redactSecrets(cmd.slice(0, 80)) (the console.error on the reported lines) to
redactSecrets(cmd).slice(0, 80) so the full command is redacted first and then
truncated for display, updating both places where redactSecrets is used with
cmd.slice.
- Around line 19-26: The hardcoded allowlist keyedSecrets only covers specific
names and misses arbitrary credential keys (e.g., TEST_API_KEY), so update the
redaction logic to also match generic patterns like
/.*(_TOKEN|_API_KEY|_SECRET|_PASSWORD)$/i before falling back to the explicit
keyedSecrets array; locate the redaction usage around keyedSecrets in runner.js
(and any related functions that check for keys at lines near 29-31) and
implement pattern-based checks first, then fallback to the existing explicit
list to preserve strict matches.

---

Nitpick comments:
In `@test/runner-redact.test.js`:
- Around line 100-117: Add integration assertions for the other changed call
paths by adding tests similar to the existing run() test that spawn a child
requiring the same runner module (use the same runnerPath) and invoke
runInteractive(...) and runCapture(...) with commands containing a secret token
(e.g., "NVIDIA_API_KEY=nvapi-realSecretValue123"); capture the child's
stderr/stdout and assert that outputs contain the redacted form
"NVIDIA_API_KEY=***" and do not contain the raw secret
"nvapi-realSecretValue123". Locate tests around the existing "run() error output
redacts secrets (integration)" test and create two focused specs that call
runInteractive and runCapture respectively, ensuring you override process.exit
in the child script as done now and keep the same spawnSync options (cwd,
encoding, timeout).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8292af6c-919c-4c6a-ac8d-d8e32eb9eb0e

📥 Commits

Reviewing files that changed from the base of the PR and between e06c147 and de58663.

📒 Files selected for processing (2)

bin/lib/runner.js
test/runner-redact.test.js

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/runner-redact.test.js`:
- Around line 125-132: The test's intended long padding is not created because
`"x" * 5000` produces NaN; update the test to generate a long string correctly
by replacing the invalid multiplication with a real repeat operation (e.g., use
"x".repeat(5000) or Array(5001).join('x')) when assigning longPadding in
test/runner-redact.test.js so that the run(...) call constructs a truly long
command and triggers the truncation behavior for the run function.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ddc22390-3b06-428c-97d9-854722e44a51

📥 Commits

Reviewing files that changed from the base of the PR and between de58663 and d9d80b9.

📒 Files selected for processing (2)

bin/lib/runner.js
test/runner-redact.test.js

"x" * 5000 evaluates to NaN, which JSON.stringify converts to "null" (4 chars). Replace with "x".repeat(5000) to actually create a long padding string that triggers the truncation scenario. Fixes CodeRabbit review comment on NVIDIA#1247.

BenediktSchackenberg · 2026-04-01T11:38:47Z

Fixed — replaced "x" * 5000 (evaluates to NaN, JSON-serialized as "null") with "x".repeat(5000) so the test actually creates a long command that triggers the truncation scenario.

coderabbitai

🧹 Nitpick comments (1)

test/runner-redact.test.js (1)

106-156: Consider extracting a helper for repeated spawnSync integration setup.

The three integration tests duplicate the same process-launch scaffolding. A local helper would reduce maintenance cost and keep assertions focused.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/runner-redact.test.js` around lines 106 - 156, The three tests duplicate
spawnSync scaffolding—extract a small helper (e.g., spawnRunnerScript or
runChildWithScript) that takes the inline script (and optional timeout/args) and
returns spawnSync(...) result; have it construct runnerPath using
path.join(import.meta.dirname, "..", "bin", "lib", "runner"), set cwd to
path.join(import.meta.dirname, ".."), encoding: "utf-8", and timeout: 10000, and
then replace the repeated spawnSync blocks in the tests that call run(),
runInteractive(), and the truncated-command case to use this helper so
assertions remain focused.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/runner-redact.test.js`:
- Around line 106-156: The three tests duplicate spawnSync scaffolding—extract a
small helper (e.g., spawnRunnerScript or runChildWithScript) that takes the
inline script (and optional timeout/args) and returns spawnSync(...) result;
have it construct runnerPath using path.join(import.meta.dirname, "..", "bin",
"lib", "runner"), set cwd to path.join(import.meta.dirname, ".."), encoding:
"utf-8", and timeout: 10000, and then replace the repeated spawnSync blocks in
the tests that call run(), runInteractive(), and the truncated-command case to
use this helper so assertions remain focused.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 23fe97f2-2048-44ae-b35a-abf6814aaeff

📥 Commits

Reviewing files that changed from the base of the PR and between d9d80b9 and 52abf31.

📒 Files selected for processing (1)

test/runner-redact.test.js

Adds redactSecrets() to sanitize credential-like substrings from command strings and error fields before they appear in CLI output: - Hardcoded key allowlist (NVIDIA_API_KEY, GITHUB_TOKEN, etc.) - Generic env-style matcher for *_TOKEN, *_API_KEY, *_SECRET, etc. - Redacts nvapi- bearer tokens - Redacts before truncating command previews (fixes leak on long cmds) Tests cover unit cases, integration scenarios, and a regression test for the truncated-quoted-command case. Signed-off-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/runner.js`:
- Around line 88-93: The execSync error handling currently redacts err.message,
err.stderr and err.stdout but misses err.output; update the handling around the
execSync error (where redactSecrets is called) to also redact err.output by
checking if err.output exists and mapping/redacting each non-null element
(indexes 0..2) with redactSecrets (or converting buffers to strings first),
replacing err.output with the redacted array to avoid leaking secrets when the
full error object is logged or serialized.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5646a711-de26-408a-ab8e-179cd012522d

📥 Commits

Reviewing files that changed from the base of the PR and between 52abf31 and 79313ec.

📒 Files selected for processing (2)

bin/lib/runner.js
test/runner-redact.test.js

execSync errors include an output array [stdin, stdout, stderr] separate from the .stdout/.stderr properties. Callers that serialize the full error object could leak unredacted secrets via err.output[1/2]. Now maps over err.output and redacts each non-null element (skipping index 0 which is the stdin placeholder). Adds a regression test to verify the output array is redacted. Signed-off-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>

BenediktSchackenberg · 2026-04-01T19:45:59Z

Fixed — err.output array is now redacted in the runCapture error handler. Elements at index 0 (stdin placeholder) are skipped; stdout/stderr chunks (index 1/2) are passed through redactSecrets. Added a regression test to cover the output array case.

BenediktSchackenberg · 2026-04-01T19:47:31Z

@kjw3 — could you approve the CI run for this PR? GitHub is blocking Actions from fork PRs until a maintainer approves. Latest commit is 7054f83.

coderabbitai

🧹 Nitpick comments (1)

test/runner-redact.test.js (1)

170-183: Consider adding positive assertion for err.output redaction.

The test verifies secrets are removed from err.output but doesn't assert that the redacted placeholder (GITHUB_TOKEN=***) is present. Adding a positive assertion would strengthen the test by confirming redaction actually occurred rather than just secret removal.

💡 Suggested improvement

   it("runCapture() thrown error redacts err.output array", () => {
     try {
       runCapture("bash -lc 'echo GITHUB_TOKEN=ghp_secretOutputValue >&2; exit 1'");
       throw new Error("expected runCapture to throw");
     } catch (err) {
       if (Array.isArray(err.output)) {
+        const stderrChunk = err.output[2]; // stderr is at index 2
+        if (stderrChunk != null) {
+          expect(String(stderrChunk)).toContain("GITHUB_TOKEN=***");
+        }
         for (const chunk of err.output) {
           if (chunk != null) {
             expect(String(chunk)).not.toContain("ghp_secretOutputValue");
           }
         }
       }
     }
   });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/runner-redact.test.js` around lines 170 - 183, The test should not only
assert the secret is absent but also that redaction occurred; update the
it("runCapture() thrown error redacts err.output array"...) test to check
err.output for the redacted placeholder by verifying at least one output chunk
contains the string "GITHUB_TOKEN=***" (use runCapture and err.output as shown),
e.g., iterate err.output and set a foundRedaction flag when
String(chunk).includes("GITHUB_TOKEN=***"), then assert that flag is true so the
test positively confirms redaction occurred.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/runner-redact.test.js`:
- Around line 170-183: The test should not only assert the secret is absent but
also that redaction occurred; update the it("runCapture() thrown error redacts
err.output array"...) test to check err.output for the redacted placeholder by
verifying at least one output chunk contains the string "GITHUB_TOKEN=***" (use
runCapture and err.output as shown), e.g., iterate err.output and set a
foundRedaction flag when String(chunk).includes("GITHUB_TOKEN=***"), then assert
that flag is true so the test positively confirms redaction occurred.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a0906393-30fc-43b0-b2c8-37f7244ca048

📥 Commits

Reviewing files that changed from the base of the PR and between 79313ec and 7054f83.

📒 Files selected for processing (2)

bin/lib/runner.js
test/runner-redact.test.js

🚧 Files skipped from review as they are similar to previous changes (1)

bin/lib/runner.js

cv · 2026-04-02T05:21:55Z

Closing in favor of #1246 which covers the same redaction scope and more:

fix(security): redact secret patterns from CLI log and error output #1246 intercepts child output streams by changing stdio from inherit to pipe, then redacts before writing to the terminal. security: redact secret patterns from CLI error output #1247 only redacts error messages and command strings — child stdout/stderr with secrets goes directly to the terminal unredacted.
fix(security): redact secret patterns from CLI log and error output #1246 handles URL credentials (userinfo, auth query params). security: redact secret patterns from CLI error output #1247 does not.
fix(security): redact secret patterns from CLI log and error output #1246 has passing CI. security: redact secret patterns from CLI error output #1247 has a failure.
security: redact secret patterns from CLI error output #1247 has a broken test — the truncation regression test uses "x" * 5000 (produces NaN) instead of "x".repeat(5000), so the test passes vacuously without testing truncation.

Thanks @BenediktSchackenberg for writing both approaches — having them side by side made the decision straightforward.

Copilot AI review requested due to automatic review settings April 1, 2026 10:35

Copilot started reviewing on behalf of BenediktSchackenberg April 1, 2026 10:36 View session

Copilot AI reviewed Apr 1, 2026

View reviewed changes

Comment thread test/runner-redact.test.js

coderabbitai Bot reviewed Apr 1, 2026

View reviewed changes

Comment thread bin/lib/runner.js

Comment thread bin/lib/runner.js Outdated

coderabbitai Bot reviewed Apr 1, 2026

View reviewed changes

Comment thread test/runner-redact.test.js

coderabbitai Bot reviewed Apr 1, 2026

View reviewed changes

BenediktSchackenberg mentioned this pull request Apr 1, 2026

refactor(cli): shell consolidation, TypeScript migration & oclif #924

Open

This was referenced Apr 1, 2026

epic: structured memory and secret-safe persistence #1231

Open

security: block secret writes into persistent workspace memory #1233

Closed

BenediktSchackenberg force-pushed the fix/redact-secrets-794 branch from acdec44 to 79313ec Compare April 1, 2026 19:35

coderabbitai Bot reviewed Apr 1, 2026

View reviewed changes

Comment thread bin/lib/runner.js

coderabbitai Bot reviewed Apr 1, 2026

View reviewed changes

Merge branch 'main' into fix/redact-secrets-794

c644b49

wscurran added security Potential vulnerability, unsafe behavior, or access risk priority: high labels Apr 1, 2026

Merge branch 'main' into fix/redact-secrets-794

2997be9

cv closed this Apr 2, 2026

This was referenced Apr 2, 2026

fix(security): redact secret patterns from CLI log and error output #1246

Merged

security: redact secret patterns from CLI log and error output #794

Closed

wscurran added area: cli Command line interface, flags, terminal UX, or output and removed priority: high labels Jun 3, 2026

wscurran added bug-fix PR fixes a bug or regression and removed NemoClaw CLI labels Jun 3, 2026

Conversation

BenediktSchackenberg commented Apr 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

BenediktSchackenberg commented Apr 1, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

BenediktSchackenberg commented Apr 1, 2026

Uh oh!

BenediktSchackenberg commented Apr 1, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cv commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

BenediktSchackenberg commented Apr 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 1, 2026 •

edited

Loading