fix(test): stop swallowing error output in snapshot E2E#2350
Conversation
The snapshot-commands E2E test uses `$(nemoclaw ... 2>&1)` inside `set -euo pipefail`, which causes the shell to exit immediately when the command fails — before the captured output is ever printed. This has been masking the actual error message for every Phase 3+ failure. Changes: - Add `run_capture` helper that captures both output and exit code without triggering set -e on non-zero exit - Replace all bare `$(nemoclaw ... 2>&1)` calls with run_capture - Add Phase 2b pre-snapshot diagnostics: registry state, sandbox list, docker containers, stale lock check - Enrich fail() diagnostics with registry contents, lock state, docker ps, node version The test will still fail at the same point, but now the actual error message from nemoclaw will be visible in CI logs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/test-snapshot-commands.sh`:
- Around line 256-263: run_capture's exit code (_CAPTURE_RC) must be checked
before asserting HELP_OUTPUT contents: after calling run_capture (and before
grepping HELP_OUTPUT), verify that _CAPTURE_RC is 0 and call fail with a
descriptive message if not; only if _CAPTURE_RC is 0 proceed to test HELP_OUTPUT
for "snapshot create", "snapshot list", and "snapshot restore" and call pass or
fail accordingly. Use the existing symbols run_capture, _CAPTURE_RC,
HELP_OUTPUT, pass, and fail to implement the early exit on non-zero capture
return.
- Around line 58-64: The run_capture function uses eval to assign captured
output to a dynamic variable name, which can execute shell code in the output;
replace the unsafe eval "${_var_name}=\${_output}" with a safe dynamic
assignment using printf -v (e.g. printf -v "$_var_name" '%s' "$_output") so the
output is stored literally without evaluation, keeping the rest of run_capture
(variables _CAPTURE_RC, _output, and invocation) unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: f6276c4d-2ca5-442a-975d-9c3c2c54cb18
📒 Files selected for processing (1)
test/e2e/test-snapshot-commands.sh
| run_capture() { | ||
| local _var_name="$1"; shift | ||
| _CAPTURE_RC=0 | ||
| local _output | ||
| _output=$("$@" 2>&1) || _CAPTURE_RC=$? | ||
| eval "${_var_name}=\${_output}" | ||
| } |
There was a problem hiding this comment.
Avoid eval in run_capture; it can execute captured output.
On Line 63, eval is applied to command output. If output contains shell syntax like $()/backticks, it can be executed during assignment.
Suggested fix
run_capture() {
local _var_name="$1"; shift
_CAPTURE_RC=0
local _output
_output=$("$@" 2>&1) || _CAPTURE_RC=$?
- eval "${_var_name}=\${_output}"
+ printf -v "$_var_name" '%s' "$_output"
}#!/bin/bash
set -euo pipefail
# Verify current usage in the file
rg -n -C2 'run_capture|eval' test/e2e/test-snapshot-commands.sh
# Demonstrate why eval on captured output is unsafe
bash -c 'set -euo pipefail; _output='\''$(echo "EVAL_EXECUTED" >&2)'\''; eval "x=\${_output}"; printf "x=%s\n" "$x"'Expected: second command prints EVAL_EXECUTED, proving evaluation occurred.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-snapshot-commands.sh` around lines 58 - 64, The run_capture
function uses eval to assign captured output to a dynamic variable name, which
can execute shell code in the output; replace the unsafe eval
"${_var_name}=\${_output}" with a safe dynamic assignment using printf -v (e.g.
printf -v "$_var_name" '%s' "$_output") so the output is stored literally
without evaluation, keeping the rest of run_capture (variables _CAPTURE_RC,
_output, and invocation) unchanged.
| run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot | ||
| if echo "$HELP_OUTPUT" | grep -q "snapshot create" \ | ||
| && echo "$HELP_OUTPUT" | grep -q "snapshot list" \ | ||
| && echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then | ||
| pass "snapshot help shows create/list/restore" | ||
| else | ||
| fail "snapshot help incomplete: ${HELP_OUTPUT}" | ||
| fail "snapshot help incomplete (code $_CAPTURE_RC): ${HELP_OUTPUT}" | ||
| fi |
There was a problem hiding this comment.
Check _CAPTURE_RC before asserting help content.
At Line 256 onward, run_capture is used but non-zero exit is not handled first. A failing command could still satisfy string checks and incorrectly pass Phase 9.
Suggested fix
run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot
+if [ "$_CAPTURE_RC" -ne 0 ]; then
+ fail "snapshot help exited with code $_CAPTURE_RC: ${HELP_OUTPUT}"
+fi
if echo "$HELP_OUTPUT" | grep -q "snapshot create" \
&& echo "$HELP_OUTPUT" | grep -q "snapshot list" \
&& echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then
pass "snapshot help shows create/list/restore"
else
- fail "snapshot help incomplete (code $_CAPTURE_RC): ${HELP_OUTPUT}"
+ fail "snapshot help incomplete: ${HELP_OUTPUT}"
fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot | |
| if echo "$HELP_OUTPUT" | grep -q "snapshot create" \ | |
| && echo "$HELP_OUTPUT" | grep -q "snapshot list" \ | |
| && echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then | |
| pass "snapshot help shows create/list/restore" | |
| else | |
| fail "snapshot help incomplete: ${HELP_OUTPUT}" | |
| fail "snapshot help incomplete (code $_CAPTURE_RC): ${HELP_OUTPUT}" | |
| fi | |
| run_capture HELP_OUTPUT nemoclaw "${SANDBOX_NAME}" snapshot | |
| if [ "$_CAPTURE_RC" -ne 0 ]; then | |
| fail "snapshot help exited with code $_CAPTURE_RC: ${HELP_OUTPUT}" | |
| fi | |
| if echo "$HELP_OUTPUT" | grep -q "snapshot create" \ | |
| && echo "$HELP_OUTPUT" | grep -q "snapshot list" \ | |
| && echo "$HELP_OUTPUT" | grep -q "snapshot restore"; then | |
| pass "snapshot help shows create/list/restore" | |
| else | |
| fail "snapshot help incomplete: ${HELP_OUTPUT}" | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-snapshot-commands.sh` around lines 256 - 263, run_capture's
exit code (_CAPTURE_RC) must be checked before asserting HELP_OUTPUT contents:
after calling run_capture (and before grepping HELP_OUTPUT), verify that
_CAPTURE_RC is 0 and call fail with a descriptive message if not; only if
_CAPTURE_RC is 0 proceed to test HELP_OUTPUT for "snapshot create", "snapshot
list", and "snapshot restore" and call pass or fail accordingly. Use the
existing symbols run_capture, _CAPTURE_RC, HELP_OUTPUT, pass, and fail to
implement the early exit on non-zero capture return.
|
Superseded by the new PR from fix/nightly-e2e-repairs which includes these diagnostics plus fixes for the remaining three E2E failures. |
## Summary Fixes four nightly E2E failures that have been masking each other since April 21–22. Supersedes #2350 (which added the diagnostics that helped identify these). ### snapshot-commands-e2e - **Root cause:** PR #2184 changed the CLI success output from `Snapshot created` to `Snapshot v1 created` (version string inserted), but the test's `grep -q "Snapshot created"` assertion was never updated - **Previous masking:** This was hidden behind the `safeTarExtract` symlink regression (#2268, fixed by #2308) — the snapshot never got far enough to print the success message. Now that #2308 landed, the snapshot succeeds but the grep fails - **Fix:** Change grep to `"Snapshot.*created"` to match the versioned output format - Also includes the `run_capture` helper and Phase 2b diagnostics from #2350 so future failures are visible ### deployment-services-e2e - **Root cause:** `SANDBOX_NAME` and `LOG_FILE` are never defined — the script uses `set -euo pipefail` (`-u` = nounset) and crashes immediately at line 70 - **Fix:** Add variable definitions matching the pattern used by other E2E tests; pass `NEMOCLAW_SANDBOX_NAME` and `NEMOCLAW_RECREATE_SANDBOX` in the workflow ### network-policy-e2e - **Root cause:** `setup_sandbox()` destroys the sandbox, then calls `nemoclaw onboard` which detects the stale registry entry and fails with "Sandbox already exists but is not ready" - **Fix:** Add `NEMOCLAW_RECREATE_SANDBOX=1` to the onboard env so it overwrites the stale entry ### cloud-experimental-e2e (not addressed here) Two unrelated failures remain: Landlock regression on `/sandbox` writability and CLI/docs command reference drift. These need separate investigation. ## Test plan - [ ] Trigger nightly E2E on this branch — snapshot-commands, deployment-services, and network-policy jobs should pass - [x] `shellcheck` clean on all three modified test scripts 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Tests** * More reliable E2E runs via consistent sandbox naming and automatic sandbox recreation. * Timestamped logs, a capture wrapper that preserves command output and exit codes. * Expanded pre-snapshot diagnostics and richer failure logs for clearer snapshot troubleshooting. * **Chores** * CI updated to disable an experimental E2E job and adjust failure notifications to avoid false triggers. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
snapshot-commands-e2enightly job has been failing consistently at Phase 3 with zero diagnostic output — the actual error fromnemoclaw snapshot createis captured but never printed becauseset -ekills the script firstrun_capturehelper that captures output and exit code separately, so the error message is always visible in CI logsfail()function with additional context (registry contents, lock state, docker ps, node version)Root cause analysis
The test uses this pattern:
When
nemoclawexits non-zero,set -eterminates the script at the assignment line. The error output is captured in$SNAPSHOT_OUTPUTbut theechoon the next line is never executed. Thefail()handler is also never reached.This has been masking the failure since at least 2026-04-22. The test will still fail at the same point, but now we'll see exactly why.
Test plan
shellcheck test/e2e/test-snapshot-commands.sh— clean🤖 Generated with Claude Code
Summary by CodeRabbit