fix(test): stop swallowing error output in snapshot E2E by ericksoa · Pull Request #2350 · NVIDIA/NemoClaw

ericksoa · 2026-04-23T12:12:43Z

Summary

The snapshot-commands-e2e nightly job has been failing consistently at Phase 3 with zero diagnostic output — the actual error from nemoclaw snapshot create is captured but never printed because set -e kills the script first
This PR adds a run_capture helper that captures output and exit code separately, so the error message is always visible in CI logs
Adds Phase 2b pre-snapshot diagnostics (registry state, sandbox list, docker containers, stale lock detection)
Enriches the fail() function with additional context (registry contents, lock state, docker ps, node version)

Root cause analysis

The test uses this pattern:

set -euo pipefail
SNAPSHOT_OUTPUT=$(nemoclaw "${SANDBOX_NAME}" snapshot create 2>&1)
echo "$SNAPSHOT_OUTPUT"  # never reached if the command fails

When nemoclaw exits non-zero, set -e terminates the script at the assignment line. The error output is captured in $SNAPSHOT_OUTPUT but the echo on the next line is never executed. The fail() handler is also never reached.

This has been masking the failure since at least 2026-04-22. The test will still fail at the same point, but now we'll see exactly why.

Test plan

Merge and trigger nightly E2E — the snapshot test will still fail but now with visible error output
shellcheck test/e2e/test-snapshot-commands.sh — clean

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Enhanced end-to-end snapshot test diagnostics with comprehensive pre-test environment logging including version information, sandbox configuration, registry status, and container state.
- Improved error reporting and debugging for snapshot commands by capturing and displaying detailed output and exit codes in failure messages.
- Strengthened test infrastructure with improved command execution and failure handling.

The snapshot-commands E2E test uses `$(nemoclaw ... 2>&1)` inside `set -euo pipefail`, which causes the shell to exit immediately when the command fails — before the captured output is ever printed. This has been masking the actual error message for every Phase 3+ failure. Changes: - Add `run_capture` helper that captures both output and exit code without triggering set -e on non-zero exit - Replace all bare `$(nemoclaw ... 2>&1)` calls with run_capture - Add Phase 2b pre-snapshot diagnostics: registry state, sandbox list, docker containers, stale lock check - Enrich fail() diagnostics with registry contents, lock state, docker ps, node version The test will still fail at the same point, but now the actual error message from nemoclaw will be visible in CI logs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-23T12:12:54Z

📝 Walkthrough

Walkthrough

Adds run_capture() utility function to capture command output and exit codes without premature script termination. Implements pre-snapshot diagnostics phase logging environment and local state. Updates snapshot lifecycle steps to use the new utility with explicit failure handling, and enhances failure diagnostics with additional registry and Docker information.

Changes

Cohort / File(s)	Summary
Test Snapshot Enhancements `test/e2e/test-snapshot-commands.sh`	Added `run_capture()` utility to safely capture command stdout/stderr and exit codes. Introduced pre-snapshot diagnostics phase logging NemoClaw version, openshell sandbox, registry files, config directory, and Docker container status. Updated snapshot lifecycle steps (create/list/restore/help) to use `run_capture()` with explicit failure handling including captured output in error messages. Enhanced `fail()` diagnostics with additional registry and Docker/node/nemoclaw path information.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A script that captures with care,
Diagnostics logged everywhere,
No early exit, no surprise,
Just better snapshots, test-wise!
—Yours truly, the code-testing hare 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(test): stop swallowing error output in snapshot E2E' directly and clearly summarizes the main objective of the changeset: preventing error output from being lost in the snapshot E2E tests.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/snapshot-e2e-diagnostics

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/test-snapshot-commands.sh`:
- Around line 256-263: run_capture's exit code (_CAPTURE_RC) must be checked
before asserting HELP_OUTPUT contents: after calling run_capture (and before
grepping HELP_OUTPUT), verify that _CAPTURE_RC is 0 and call fail with a
descriptive message if not; only if _CAPTURE_RC is 0 proceed to test HELP_OUTPUT
for "snapshot create", "snapshot list", and "snapshot restore" and call pass or
fail accordingly. Use the existing symbols run_capture, _CAPTURE_RC,
HELP_OUTPUT, pass, and fail to implement the early exit on non-zero capture
return.
- Around line 58-64: The run_capture function uses eval to assign captured
output to a dynamic variable name, which can execute shell code in the output;
replace the unsafe eval "${_var_name}=\${_output}" with a safe dynamic
assignment using printf -v (e.g. printf -v "$_var_name" '%s' "$_output") so the
output is stored literally without evaluation, keeping the rest of run_capture
(variables _CAPTURE_RC, _output, and invocation) unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f6276c4d-2ca5-442a-975d-9c3c2c54cb18

📥 Commits

Reviewing files that changed from the base of the PR and between b1afc50 and 9baf085.

📒 Files selected for processing (1)

test/e2e/test-snapshot-commands.sh

coderabbitai · 2026-04-23T12:17:08Z

+run_capture() {
+  local _var_name="$1"; shift
+  _CAPTURE_RC=0
+  local _output
+  _output=$("$@" 2>&1) || _CAPTURE_RC=$?
+  eval "${_var_name}=\${_output}"
+}


⚠️ Potential issue | 🟠 Major

Avoid eval in run_capture; it can execute captured output.

On Line 63, eval is applied to command output. If output contains shell syntax like $()/backticks, it can be executed during assignment.

Suggested fix

run_capture() { local _var_name="$1"; shift _CAPTURE_RC=0 local _output _output=$("$@" 2>&1) || _CAPTURE_RC=$? - eval "${_var_name}=\${_output}" + printf -v "$_var_name" '%s' "$_output" }

#!/bin/bash set -euo pipefail # Verify current usage in the file rg -n -C2 'run_capture|eval' test/e2e/test-snapshot-commands.sh # Demonstrate why eval on captured output is unsafe bash -c 'set -euo pipefail; _output='\''$(echo "EVAL_EXECUTED" >&2)'\''; eval "x=\${_output}"; printf "x=%s\n" "$x"'

Expected: second command prints EVAL_EXECUTED, proving evaluation occurred.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-snapshot-commands.sh` around lines 58 - 64, The run_capture function uses eval to assign captured output to a dynamic variable name, which can execute shell code in the output; replace the unsafe eval "${_var_name}=\${_output}" with a safe dynamic assignment using printf -v (e.g. printf -v "$_var_name" '%s' "$_output") so the output is stored literally without evaluation, keeping the rest of run_capture (variables _CAPTURE_RC, _output, and invocation) unchanged.

coderabbitai · 2026-04-23T12:17:08Z

+run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot
 if echo "$HELP_OUTPUT" | grep -q "snapshot create" \
  && echo "$HELP_OUTPUT" | grep -q "snapshot list" \
  && echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then
  pass "snapshot help shows create/list/restore"
 else
-  fail "snapshot help incomplete: ${HELP_OUTPUT}"
+  fail "snapshot help incomplete (code $_CAPTURE_RC): ${HELP_OUTPUT}"
 fi


⚠️ Potential issue | 🟠 Major

Check _CAPTURE_RC before asserting help content.

At Line 256 onward, run_capture is used but non-zero exit is not handled first. A failing command could still satisfy string checks and incorrectly pass Phase 9.

Suggested fix

run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot +if [ "$_CAPTURE_RC" -ne 0 ]; then + fail "snapshot help exited with code $_CAPTURE_RC: ${HELP_OUTPUT}" +fi if echo "$HELP_OUTPUT" | grep -q "snapshot create" \ && echo "$HELP_OUTPUT" | grep -q "snapshot list" \ && echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then pass "snapshot help shows create/list/restore" else - fail "snapshot help incomplete (code $_CAPTURE_RC): ${HELP_OUTPUT}" + fail "snapshot help incomplete: ${HELP_OUTPUT}" fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot

if echo "$HELP_OUTPUT" | grep -q "snapshot create" \

&& echo "$HELP_OUTPUT" | grep -q "snapshot list" \

&& echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then

pass "snapshot help shows create/list/restore"

else

fail "snapshot help incomplete: ${HELP_OUTPUT}"

fail "snapshot help incomplete (code $_CAPTURE_RC): ${HELP_OUTPUT}"

fi

run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot

if [ "$_CAPTURE_RC" -ne 0 ]; then

fail "snapshot help exited with code $_CAPTURE_RC: ${HELP_OUTPUT}"

fi

if echo "$HELP_OUTPUT" | grep -q "snapshot create" \

&& echo "$HELP_OUTPUT" | grep -q "snapshot list" \

&& echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then

pass "snapshot help shows create/list/restore"

else

fail "snapshot help incomplete: ${HELP_OUTPUT}"

fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-snapshot-commands.sh` around lines 256 - 263, run_capture's exit code (_CAPTURE_RC) must be checked before asserting HELP_OUTPUT contents: after calling run_capture (and before grepping HELP_OUTPUT), verify that _CAPTURE_RC is 0 and call fail with a descriptive message if not; only if _CAPTURE_RC is 0 proceed to test HELP_OUTPUT for "snapshot create", "snapshot list", and "snapshot restore" and call pass or fail accordingly. Use the existing symbols run_capture, _CAPTURE_RC, HELP_OUTPUT, pass, and fail to implement the early exit on non-zero capture return.

ericksoa · 2026-04-23T13:21:01Z

Superseded by the new PR from fix/nightly-e2e-repairs which includes these diagnostics plus fixes for the remaining three E2E failures.

## Summary Fixes four nightly E2E failures that have been masking each other since April 21–22. Supersedes #2350 (which added the diagnostics that helped identify these). ### snapshot-commands-e2e - **Root cause:** PR #2184 changed the CLI success output from `Snapshot created` to `Snapshot v1 created` (version string inserted), but the test's `grep -q "Snapshot created"` assertion was never updated - **Previous masking:** This was hidden behind the `safeTarExtract` symlink regression (#2268, fixed by #2308) — the snapshot never got far enough to print the success message. Now that #2308 landed, the snapshot succeeds but the grep fails - **Fix:** Change grep to `"Snapshot.*created"` to match the versioned output format - Also includes the `run_capture` helper and Phase 2b diagnostics from #2350 so future failures are visible ### deployment-services-e2e - **Root cause:** `SANDBOX_NAME` and `LOG_FILE` are never defined — the script uses `set -euo pipefail` (`-u` = nounset) and crashes immediately at line 70 - **Fix:** Add variable definitions matching the pattern used by other E2E tests; pass `NEMOCLAW_SANDBOX_NAME` and `NEMOCLAW_RECREATE_SANDBOX` in the workflow ### network-policy-e2e - **Root cause:** `setup_sandbox()` destroys the sandbox, then calls `nemoclaw onboard` which detects the stale registry entry and fails with "Sandbox already exists but is not ready" - **Fix:** Add `NEMOCLAW_RECREATE_SANDBOX=1` to the onboard env so it overwrites the stale entry ### cloud-experimental-e2e (not addressed here) Two unrelated failures remain: Landlock regression on `/sandbox` writability and CLI/docs command reference drift. These need separate investigation. ## Test plan - [ ] Trigger nightly E2E on this branch — snapshot-commands, deployment-services, and network-policy jobs should pass - [x] `shellcheck` clean on all three modified test scripts 🤖 Generated with [Claude Code](https://claude.com/claude-code)  ## Summary by CodeRabbit * **Tests** * More reliable E2E runs via consistent sandbox naming and automatic sandbox recreation. * Timestamped logs, a capture wrapper that preserves command output and exit codes. * Expanded pre-snapshot diagnostics and richer failure logs for clearer snapshot troubleshooting. * **Chores** * CI updated to disable an experimental E2E job and adjust failure notifications to avoid false triggers.  --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

ericksoa closed this Apr 23, 2026

ericksoa mentioned this pull request Apr 23, 2026

fix(test): repair four nightly E2E test failures #2351

Merged

2 tasks

wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(test): stop swallowing error output in snapshot E2E#2350

fix(test): stop swallowing error output in snapshot E2E#2350
ericksoa wants to merge 1 commit into
mainfrom
fix/snapshot-e2e-diagnostics

ericksoa commented Apr 23, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 23, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Uh oh!

coderabbitai Bot Apr 23, 2026

Uh oh!

ericksoa commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ericksoa commented Apr 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause analysis

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

ericksoa commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ericksoa commented Apr 23, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 23, 2026 •

edited

Loading