test: eliminate source-shape assertions#2829
Conversation
Add a source-shape detector and budget gate, then replace production-source-text assertions across the CLI/plugin test suite with behavior coverage. Behavior replacements use subprocess fakes, temp fixtures, shell snippet execution, emitted files, stdout/stderr/exit-code assertions, and direct helper behavior instead of asserting on implementation strings. Experiments: kept runs #1-#88 Metric: source_shape_cases 317 → 0 (-100%); source_shape_assertions 819 → 0; source_shape_files 23 → 0
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughIntroduces a source-shape scanner CLI, budget JSON, npm scripts and a pre-commit hook to enforce a test-budget; concurrently refactors many brittle text-inspection tests into runtime execution-based tests, and adds/updates several unit tests tightening credential and deploy behavior assertions. Changes
Sequence DiagramsequenceDiagram
participant Dev as Developer
participant Pre as Pre-commit Hook
participant NPM as npm
participant CLI as Scanner CLI
participant AST as AST / Heuristics
participant CFG as Budget Config
participant Exit as Process Exit
Dev->>Pre: git commit
Pre->>NPM: run `npm run source-shape:check`
NPM->>CLI: invoke scanner CLI
CLI->>AST: parse repo, analyze tests
AST->>AST: detect source-read vars & assertions
CLI->>CFG: read `ci/source-shape-test-budget.json`
CLI->>AST: compare detected cases to budget
alt budget exceeded
CLI->>Exit: set exit code = 1
Exit->>Pre: fail hook
Pre->>Dev: reject commit
else budget OK
CLI->>Exit: set exit code = 0
Exit->>Pre: pass hook
Pre->>Dev: commit proceeds
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (7)
test/fetch-guard-patch-regression.test.ts (2)
87-92: ⚡ Quick winRun the generated patch script in strict bash mode.
Line 88 writes a script without
set -euo pipefail; intermediate failures inside a multi-command block can be missed and still report success.Proposed change
- fs.writeFileSync(scriptPath, ["#!/usr/bin/env bash", command].join("\n"), { mode: 0o700 }); + fs.writeFileSync( + scriptPath, + ["#!/usr/bin/env bash", "set -euo pipefail", command].join("\n"), + { mode: 0o700 }, + );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/fetch-guard-patch-regression.test.ts` around lines 87 - 92, The generated test script is missing strict bash options, so update the script content written by fs.writeFileSync (the script created at scriptPath) to include a strict mode header (e.g., add "set -euo pipefail" immediately after the shebang) so any intermediate failures in multi-command blocks cause the script to fail; keep the shebang (#!/usr/bin/env bash) and then insert the strict flags before writing the command string that is later executed via spawnSync.
83-87: ⚡ Quick winUse
replaceAllfor the OpenClaw dist-path rewrite.Line 86 only rewrites the first occurrence. If the Dockerfile patch block gains another matching path, the test can hit the real absolute path unexpectedly.
Proposed change
- ).replace("/usr/local/lib/node_modules/openclaw/dist", dist); + ).replaceAll("/usr/local/lib/node_modules/openclaw/dist", dist);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/fetch-guard-patch-regression.test.ts` around lines 83 - 87, The test constructs the Docker command into the variable command using dockerRunCommandBetween and then calls .replace to rewrite the OpenClaw dist path, but .replace only replaces the first occurrence; change that call to use a global replace (e.g., replaceAll) so every instance of "/usr/local/lib/node_modules/openclaw/dist" is rewritten to the test-local dist value; update the expression that assigns command (the code calling .replace) to use replaceAll with the same search string and the dist variable so no stray absolute paths remain.test/sandbox-provisioning.test.ts (1)
166-170: 💤 Low valueConsider making whitespace normalization more robust.
The
.replace('python3 -c " ', 'python3 -c "')fix depends on exact Dockerfile formatting. If the Dockerfile's line continuation style changes, this could silently break the test.Consider normalizing whitespace more broadly in
dockerRunCommandBetween, or adding a comment explaining why this specific fix is needed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sandbox-provisioning.test.ts` around lines 166 - 170, The test's fragile post-processing uses a specific string replace after calling dockerRunCommandBetween, which will break if spacing/line-continuation in the Dockerfile changes; instead update dockerRunCommandBetween to normalize whitespace and line-continuations (collapse consecutive spaces, trim leading/trailing spaces, and join backslash-continuation lines) so returned command is stable, or at minimum replace the ad-hoc .replace call with a broader normalization (e.g., collapse multiple spaces around python3 -c and remove stray space after the opening quote) and add a short comment in the test referencing dockerRunCommandBetween to explain why normalization is required.test/service-env.test.ts (1)
202-238: 🏗️ Heavy liftThese tests still key off
nemoclaw-start.shcomment anchors.Both cases locate a block with
indexOf(...)and executesrc.slice(...). That means a benign reorder or comment rewrite in the entrypoint can fail the test without any behavioral regression, which keeps the suite coupled to source shape. Prefer driving the real script with controlled env/paths, or extracting these shell helpers into directly invokable files.Also applies to: 323-367
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/service-env.test.ts` around lines 202 - 238, The test currently relies on fragile string anchors (start/end, src.slice) and writes a temporary shell snippet (tmpFile) to assert SSL_CERT_FILE handling; instead, change the test to drive the real entrypoint or extracted helper directly by executing the actual nemoclaw-start.sh (or refactor the shell helper into a committed, invokable script) with controlled environment and temporary paths, replacing the indexOf/src.slice logic and temporary concatenation (writeFileSync/execFileSync) with a direct execFileSync call against the real script (passing a fake CA bundle path via env and asserting output), or load the shell helper file and invoke it as a standalone script so the test verifies behavior not source layout.test/sandbox-init.test.ts (1)
512-543: 🏗️ Heavy liftThis still depends on
nemoclaw-start.shsource layout.Locating the block via
_SANDBOX_INIT=and# Harden: limit process countmeans a harmless refactor of the entrypoint can break the test even if it still sourcessandbox-init.shcorrectly. That keeps the test coupled to production text shape; a more robust version would invoke the real script path (or a thin wrapper around it) and assert the sentinel effect without slicing by marker text.scripts/find-source-shape-tests.ts (1)
77-85: ⚡ Quick winUse
readdirSync(..., { withFileTypes: true })to avoid an extrastatSyncper entry.This script runs in local hooks/CI paths; cutting one syscall per directory entry improves scan latency with no behavior change.
Suggested refactor
- for (const entry of readdirSync(dir)) { - const abs = join(dir, entry); + for (const entry of readdirSync(dir, { withFileTypes: true })) { + const abs = join(dir, entry.name); if (isSkippedPath(abs)) continue; - const stats = statSync(abs); - if (stats.isDirectory()) { + if (entry.isDirectory()) { yield* walkFiles(abs); - } else if (stats.isFile()) { + } else if (entry.isFile()) { yield abs; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/find-source-shape-tests.ts` around lines 77 - 85, The loop in walkFiles uses readdirSync(dir) plus statSync(abs) per entry; replace it by calling readdirSync(dir, { withFileTypes: true }) and iterate Dirent objects so you can avoid statSync: compute abs with join(dir, dirent.name), continue when isSkippedPath(abs), then use dirent.isDirectory() to recurse with yield* walkFiles(abs) or dirent.isFile() to yield abs. Update references in scripts/find-source-shape-tests.ts (walkFiles) accordingly and remove the now-unnecessary statSync call.test/credential-exposure.test.ts (1)
145-167: ⚡ Quick winExtend the secret non-forwarding assertion to plugin env builder too.
Line 153 only verifies
buildCliSubprocessEnv(). If plugin env filtering regresses to includeNVIDIA_API_KEY, this test still passes. Please assert both builders here, similar to the TLS parity tests.Based on learnings: Security-sensitive code paths (credential sanitization, SSRF validation, policy enforcement) require extra test coverage to maintain security integrity.Suggested patch
try { process.env.NVIDIA_API_KEY = "nvapi-secret-should-not-leak"; process.env.PATH = `/tmp/nemoclaw-fake-bin:${process.env.PATH || ""}`; - const env = buildCliSubprocessEnv(); - expect(env.NVIDIA_API_KEY).toBeUndefined(); - expect(env.PATH).toContain("/tmp/nemoclaw-fake-bin"); + for (const buildSubprocessEnv of [buildCliSubprocessEnv, buildPluginSubprocessEnv]) { + const env = buildSubprocessEnv(); + expect(env.NVIDIA_API_KEY).toBeUndefined(); + expect(env.PATH).toContain("/tmp/nemoclaw-fake-bin"); + } } finally {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/credential-exposure.test.ts` around lines 145 - 167, The current test only checks buildCliSubprocessEnv for leaking NVIDIA_API_KEY; update the test to also call buildPluginSubprocessEnv() and assert the same expectations (expect(env.NVIDIA_API_KEY).toBeUndefined() and expect(env.PATH).toContain("/tmp/nemoclaw-fake-bin")) so both subprocess env builders are covered; locate the assertions around buildCliSubprocessEnv in the "subprocess env builder does not spread full process.env into subprocesses" test and duplicate the checks for buildPluginSubprocessEnv.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/find-source-shape-tests.ts`:
- Around line 324-337: The detector misses direct expect(readFileSync(...))
assertions because assertionFromCall only tests isReadFileCall on the expect
call (expectBase) and checks textual membership against sourceVars; fix by
inspecting the actual subject expression node: grab expectBase.arguments[0] as
an Expression (not only getText) and set referencesSource =
[...sourceVars].some(name => textContainsIdentifier(subjectText, name)) OR
isReadFileCall(subjectNode); then allow the assertion when either condition is
true. Apply the same change to the analogous check referenced around the other
occurrence (lines noted) so direct expect(readFileSync(...)) assertions are
detected.
- Around line 350-359: The isTestCall function currently only checks direct
Identifier and single PropertyAccessExpression names and misses wrapped/chained
forms like it.only, test.each(...)(...), or obj.test.only; update isTestCall
(and add a small helper e.g. getRootIdentifierName) to recursively unwrap
CallExpression and PropertyAccessExpression nodes: if node.expression is a
CallExpression, recurse into its expression; if PropertyAccessExpression,
recurse into its expression but also check the final name; finally when you
reach an Identifier check for "it" or "test". Ensure isTestCall accepts
ts.CallExpression and uses that helper so wrapped forms are detected.
In `@test/exec-approvals-path-regression.test.ts`:
- Around line 7-13: The test case it("unified layout and mutable-default
permissions are covered by behavior suites", ...) currently contains a no-op
assertion expect(true).toBe(true); replace this placeholder with a real
regression check that exercises the exec-approvals path (for example, invoke the
function/fixture that triggers exec-approvals behavior and assert expected side
effects or outputs), or if you cannot implement such an assertion here, remove
the entire test block to avoid a misleading passing test; ensure the replacement
references the actual helper/fixture names used to exercise exec-approvals so
the test genuinely guards the regression.
In `@test/nemoclaw-start.test.ts`:
- Around line 760-768: The harness currently in runSlackGuardHarness injects the
guard script but doesn't enable the sandbox gate so SLACK_GUARD_EOF returns
early; modify the generated inline script so it sets
process.env.OPENSHELL_SANDBOX = "1" (or otherwise enables the sandbox) before
inserting the extractGuardScript() content so the Slack guard actually installs
and the subsequent assertions exercise the interception path.
In `@test/runner.test.ts`:
- Around line 703-707: Replace the current existence-only assertions in the
"installers expose explicit, reviewable scripts instead of piping curl to shell"
test with a runtime harness: locate the test (the it(...) block) and execute
both ../install.sh and ../scripts/install.sh via child_process.spawn/exec with a
controlled env that injects a fake "curl" and "sh" into PATH (or uses PATH
overrides) to detect any attempt to pipe to shell; assert the fake curl is never
executed (or that the script exits without invoking external network fetches)
and/or that the script explicitly reads local files rather than invoking
subprocesses that match the "curl | sh" pattern—mirror the installer harness
pattern used elsewhere in the test suite to stub external commands and assert
calls (use the test name and the path.join(import.meta.dirname, "..",
"install.sh") and path.join(import.meta.dirname, "..", "scripts", "install.sh")
targets to locate the scripts).
- Around line 638-661: The test is nondeterministic because the stub's PATH
includes "/usr/bin" which may contain a real gh binary and it doesn't assert
process exit status; update the stub to set PATH to only the controlled tmpBin
(remove "/usr/bin:/bin") so the test cannot accidentally hit the gh-present
branch, and add an explicit assertion on spawnSync's result.status (use the
local variable result and assert result.status === 0 or the expected exit code)
after spawning to ensure the script executed successfully; locate and modify the
variables/script snippet named scriptPath, tmpBin, stub and check result.status
from spawnSync in the same test block.
In `@test/sandbox-provisioning.test.ts`:
- Around line 206-209: The spawnSync call that executes the sourced script (the
variable sourceScript) can hang because it lacks a timeout; update the options
object passed to spawnSync (the call where env is set to { ...process.env,
NEMOCLAW_CODEX_ACP_HOME: tmp }) to include a timeout (use the same 5000 ms used
elsewhere in this test file) so that the test fails fast instead of blocking
indefinitely.
In `@test/wsl2-probe-timeout.test.ts`:
- Around line 66-105: The helper runProbeWithCurlStatuses should freeze
platform.isWsl so retry-arg calculations stay hermetic: inside the function,
require the platform module used by
getValidationProbeCurlArgs/onboard-inference-probes, save the original
platform.isWsl, set platform.isWsl = () => false (or a boolean) before requiring
probeOpenAiLikeEndpoint, then restore the original isWsl in the finally block
(alongside restoring httpProbe.runCurlProbe and clearing require.cache). This
ensures probeOpenAiLikeEndpoint and getValidationProbeCurlArgs compute the
expected 20/30 retry args regardless of the real host.
---
Nitpick comments:
In `@scripts/find-source-shape-tests.ts`:
- Around line 77-85: The loop in walkFiles uses readdirSync(dir) plus
statSync(abs) per entry; replace it by calling readdirSync(dir, { withFileTypes:
true }) and iterate Dirent objects so you can avoid statSync: compute abs with
join(dir, dirent.name), continue when isSkippedPath(abs), then use
dirent.isDirectory() to recurse with yield* walkFiles(abs) or dirent.isFile() to
yield abs. Update references in scripts/find-source-shape-tests.ts (walkFiles)
accordingly and remove the now-unnecessary statSync call.
In `@test/credential-exposure.test.ts`:
- Around line 145-167: The current test only checks buildCliSubprocessEnv for
leaking NVIDIA_API_KEY; update the test to also call buildPluginSubprocessEnv()
and assert the same expectations (expect(env.NVIDIA_API_KEY).toBeUndefined() and
expect(env.PATH).toContain("/tmp/nemoclaw-fake-bin")) so both subprocess env
builders are covered; locate the assertions around buildCliSubprocessEnv in the
"subprocess env builder does not spread full process.env into subprocesses" test
and duplicate the checks for buildPluginSubprocessEnv.
In `@test/fetch-guard-patch-regression.test.ts`:
- Around line 87-92: The generated test script is missing strict bash options,
so update the script content written by fs.writeFileSync (the script created at
scriptPath) to include a strict mode header (e.g., add "set -euo pipefail"
immediately after the shebang) so any intermediate failures in multi-command
blocks cause the script to fail; keep the shebang (#!/usr/bin/env bash) and then
insert the strict flags before writing the command string that is later executed
via spawnSync.
- Around line 83-87: The test constructs the Docker command into the variable
command using dockerRunCommandBetween and then calls .replace to rewrite the
OpenClaw dist path, but .replace only replaces the first occurrence; change that
call to use a global replace (e.g., replaceAll) so every instance of
"/usr/local/lib/node_modules/openclaw/dist" is rewritten to the test-local dist
value; update the expression that assigns command (the code calling .replace) to
use replaceAll with the same search string and the dist variable so no stray
absolute paths remain.
In `@test/sandbox-provisioning.test.ts`:
- Around line 166-170: The test's fragile post-processing uses a specific string
replace after calling dockerRunCommandBetween, which will break if
spacing/line-continuation in the Dockerfile changes; instead update
dockerRunCommandBetween to normalize whitespace and line-continuations (collapse
consecutive spaces, trim leading/trailing spaces, and join
backslash-continuation lines) so returned command is stable, or at minimum
replace the ad-hoc .replace call with a broader normalization (e.g., collapse
multiple spaces around python3 -c and remove stray space after the opening
quote) and add a short comment in the test referencing dockerRunCommandBetween
to explain why normalization is required.
In `@test/service-env.test.ts`:
- Around line 202-238: The test currently relies on fragile string anchors
(start/end, src.slice) and writes a temporary shell snippet (tmpFile) to assert
SSL_CERT_FILE handling; instead, change the test to drive the real entrypoint or
extracted helper directly by executing the actual nemoclaw-start.sh (or refactor
the shell helper into a committed, invokable script) with controlled environment
and temporary paths, replacing the indexOf/src.slice logic and temporary
concatenation (writeFileSync/execFileSync) with a direct execFileSync call
against the real script (passing a fake CA bundle path via env and asserting
output), or load the shell helper file and invoke it as a standalone script so
the test verifies behavior not source layout.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f5d8537f-19c8-4962-8e5e-b361100e14f0
📒 Files selected for processing (30)
.pre-commit-config.yamlci/source-shape-test-budget.jsonnemoclaw/src/blueprint/runner.test.tspackage.jsonscripts/find-source-shape-tests.tssrc/lib/deploy.test.tstest/cli.test.tstest/credential-exposure.test.tstest/credentials.test.tstest/dns-proxy.test.tstest/exec-approvals-path-regression.test.tstest/fetch-guard-patch-regression.test.tstest/http-proxy-fix-sync.test.tstest/install-preflight.test.tstest/local-inference-setup.test.tstest/nemoclaw-start.test.tstest/nemotron-inference-fix.test.tstest/onboard.test.tstest/policies.test.tstest/runner.test.tstest/sandbox-init.test.tstest/sandbox-provisioning.test.tstest/seccomp-guard.test.tstest/secret-redaction.test.tstest/security-c2-dockerfile-injection.test.tstest/service-env.test.tstest/shellquote-sandbox.test.tstest/shields.test.tstest/slack-token-rewriter-sync.test.tstest/wsl2-probe-timeout.test.ts
💤 Files with no reviewable changes (1)
- test/security-c2-dockerfile-injection.test.ts
Add a source-shape detector and budget gate, then replace production-source-text assertions across the CLI/plugin test suite with behavior coverage. Behavior replacements use subprocess fakes, temp fixtures, shell snippet execution, emitted files, stdout/stderr/exit-code assertions, and direct helper behavior instead of asserting on implementation strings. Experiments: kept runs #1-#88 Metric: source_shape_cases 317 → 0 (-100%); source_shape_assertions 819 → 0; source_shape_files 23 → 0
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/shellquote-sandbox.test.ts`:
- Around line 9-13: The test currently uses CommonJS loading via createRequire
and require; remove the createRequire(import.meta.url) call and any require(...)
usages and switch to ESM imports instead: replace static CommonJS imports with
top-level import statements (e.g., import shellquote or other modules directly)
and where dynamic loading was used, use await import('module') inside the test
(e.g., in your describe/it blocks) to load modules asynchronously; update any
references to the old require variable and ensure the file uses only ESM syntax
(no createRequire or require).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 901c1b54-24e7-4fb7-9e6d-828ef084bc40
📒 Files selected for processing (6)
test/cli.test.tstest/install-preflight.test.tstest/onboard.test.tstest/policies.test.tstest/sandbox-provisioning.test.tstest/shellquote-sandbox.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- test/policies.test.ts
- test/sandbox-provisioning.test.ts
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/shellquote-sandbox.test.ts (1)
163-169: 💤 Low valueConsider simplifying type narrowing with non-null assertion.
The
as unknown asdouble cast is a workaround for TypeScript not narrowing after Vitest'sexpect().not.toBeNull(). Since the assertion on line 163 already validates non-null, you can use the non-null assertion operator directly for cleaner code.♻️ Proposed simplification
expect(captured).not.toBeNull(); - const capturedRun = captured as unknown as { - command: string[]; - opts: { ignoreError?: boolean }; - }; - expect(capturedRun.command).toEqual(["/tmp/custom-openshell", "--version"]); - expect(capturedRun.opts.ignoreError).toBe(true); + expect(captured!.command).toEqual(["/tmp/custom-openshell", "--version"]); + expect(captured!.opts.ignoreError).toBe(true);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/shellquote-sandbox.test.ts` around lines 163 - 169, The test uses a double cast "as unknown as" to coerce `captured` into the `capturedRun` shape after `expect(captured).not.toBeNull()`; replace that double cast by using the non-null assertion operator to simplify narrowing (i.e., assign `capturedRun` from `captured!` and keep the shape `{ command: string[]; opts: { ignoreError?: boolean } }`), so subsequent assertions on `capturedRun.command` and `capturedRun.opts.ignoreError` remain valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/shellquote-sandbox.test.ts`:
- Around line 163-169: The test uses a double cast "as unknown as" to coerce
`captured` into the `capturedRun` shape after `expect(captured).not.toBeNull()`;
replace that double cast by using the non-null assertion operator to simplify
narrowing (i.e., assign `capturedRun` from `captured!` and keep the shape `{
command: string[]; opts: { ignoreError?: boolean } }`), so subsequent assertions
on `capturedRun.command` and `capturedRun.opts.ignoreError` remain valid.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bca5e815-7957-42e7-9ea2-7572bb9bfc77
📒 Files selected for processing (2)
test/runner.test.tstest/shellquote-sandbox.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/shellquote-sandbox.test.ts`:
- Around line 20-179: The test still performs a source-shape assertion by
reading src/lib/onboard.ts and regex-scanning for run()/runCapture() template
literals; remove that last static source inspection from
test/shellquote-sandbox.test.ts and replace it with behavior-based checks (use
the existing runtime mocks around createSandbox, run(), runCapture(), and
runCaptureOpenshell to assert correct behavior) or move the policy enforcement
into scripts/find-source-shape-tests.ts; specifically, eliminate any code that
reads src/lib/onboard.ts and any regex assertions about run()/runCapture() and
instead assert the same contract by verifying calls to runner.run,
runner.runFile, runner.runCapture, and the exported runCaptureOpenshell
function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b9fd20f1-65e4-45c3-b282-22826575099d
📒 Files selected for processing (1)
test/shellquote-sandbox.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
test/wsl2-probe-timeout.test.ts (1)
79-80: ⚡ Quick winFail fast when scripted probe responses are exhausted.
Line 79 and Line 155 currently default to success/last-result when extra calls happen. That can mask unintended retries. Throwing when the scripted queue is exceeded makes these tests much more regression-sensitive.
Proposed patch
- const status = statuses[index++] ?? 0; + const status = statuses[index++]; + if (status === undefined) { + throw new Error(`Unexpected extra probe invocation at call ${index}`); + } @@ - return results[index++] ?? results[results.length - 1]; + const next = results[index++]; + if (!next) { + throw new Error(`Unexpected extra probe invocation at call ${index}`); + } + return next;Also applies to: 155-156
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/wsl2-probe-timeout.test.ts` around lines 79 - 80, The test currently masks extra probe calls by defaulting to a success/last-result when the scripted arrays are exhausted; update the logic around the scripted response consumption (the statuses array access using index and the results/lastResult access later) to throw an error when index is out of bounds instead of using the nullish/default fallback. Concretely, replace the expressions that use "statuses[index++] ?? 0" and the similar "results/lastResult" fallback with an explicit bounds check that throws a clear error like "No more scripted probe responses" when index >= statuses.length (and similarly for results.length) before advancing the index and returning the scripted value.test/nemoclaw-start.test.ts (1)
576-582: 💤 Low valueConsider consolidating duplicate helper functions.
extractShellFunction(lines 576-582, 678-684) andextractFunction(lines 1221-1227) duplicate the top-levelextractShellFunctionFromSource(lines 44-50). You could reuse the existing helper:- function extractShellFunction(name: string): string { - const match = src.match(new RegExp(`${name}\\(\\) \\{([\\s\\S]*?)^\\}`, "m")); - if (!match) { - throw new Error(`Expected ${name} in scripts/nemoclaw-start.sh`); - } - return `${name}() {${match[1]}\n}`; - } + const extractShellFunction = (name: string) => extractShellFunctionFromSource(src, name);Regarding the static analysis ReDoS warnings: these are false positives since
namealways comes from hardcoded string literals in the test code (e.g.,"apply_model_override","_read_gateway_token").🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/nemoclaw-start.test.ts` around lines 576 - 582, There are duplicate helpers—extractShellFunction and extractFunction—that replicate extractShellFunctionFromSource; refactor tests to call the existing extractShellFunctionFromSource instead of redefining those functions (search for extractShellFunction and extractFunction and replace their bodies/calls with calls to extractShellFunctionFromSource), and leave a comment near the call sites noting that the ReDoS static-analysis warnings are false positives because name is a hardcoded literal (e.g., "apply_model_override", "_read_gateway_token").
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/wsl2-probe-timeout.test.ts`:
- Around line 77-89: The mock for httpProbe.runCurlProbe must return objects
that exactly match the ProbeResult shape; update both branches in the mocked
function (the successful branch returning { ok: true, ... } and the failure
branch returning { ok: false, ... }) to include all ProbeResult fields (notably
stderr and ensure body is present) so tests mirror production contract; also
update the other fixtures referenced (the ones around the statuses array and
fixtures at the other mentioned spots) to include the same full set of fields so
all test doubles consistently return complete ProbeResult objects.
---
Nitpick comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 576-582: There are duplicate helpers—extractShellFunction and
extractFunction—that replicate extractShellFunctionFromSource; refactor tests to
call the existing extractShellFunctionFromSource instead of redefining those
functions (search for extractShellFunction and extractFunction and replace their
bodies/calls with calls to extractShellFunctionFromSource), and leave a comment
near the call sites noting that the ReDoS static-analysis warnings are false
positives because name is a hardcoded literal (e.g., "apply_model_override",
"_read_gateway_token").
In `@test/wsl2-probe-timeout.test.ts`:
- Around line 79-80: The test currently masks extra probe calls by defaulting to
a success/last-result when the scripted arrays are exhausted; update the logic
around the scripted response consumption (the statuses array access using index
and the results/lastResult access later) to throw an error when index is out of
bounds instead of using the nullish/default fallback. Concretely, replace the
expressions that use "statuses[index++] ?? 0" and the similar
"results/lastResult" fallback with an explicit bounds check that throws a clear
error like "No more scripted probe responses" when index >= statuses.length (and
similarly for results.length) before advancing the index and returning the
scripted value.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 21b64f1b-a626-41c0-83e0-0bdb54883795
📒 Files selected for processing (7)
scripts/find-source-shape-tests.tstest/exec-approvals-path-regression.test.tstest/nemoclaw-start.test.tstest/runner.test.tstest/sandbox-provisioning.test.tstest/shellquote-sandbox.test.tstest/wsl2-probe-timeout.test.ts
💤 Files with no reviewable changes (1)
- test/exec-approvals-path-regression.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- test/shellquote-sandbox.test.ts
- test/sandbox-provisioning.test.ts
…inate-source-shape-tests
…hape-zero-final/01-eliminate-source-shape-tests # Conflicts: # test/sandbox-provisioning.test.ts
…hape-zero-final/01-eliminate-source-shape-tests # Conflicts: # test/nemoclaw-start.test.ts
Summary
Replace production-source-text assertions across the NemoClaw test suite with behavior coverage and add a budgeted source-shape detector to prevent regressions. The final detector budget is zero source-shape cases, assertions, and files.
Changes
scripts/find-source-shape-tests.ts,source-shape:scan,source-shape:check, and a pre-commit budget hook.ci/source-shape-test-budget.jsonto zero source-shape cases.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)AI Disclosure
Signed-off-by: Carlos Villela cvillela@nvidia.com
Summary by CodeRabbit
New Features
Tests
Chores