Skip to content

fix(onboard): exclude .venv and caches from sandbox build context#776

Closed
temrjan wants to merge 2 commits into
NVIDIA:mainfrom
temrjan:fix/exclude-venv-from-build-context
Closed

fix(onboard): exclude .venv and caches from sandbox build context#776
temrjan wants to merge 2 commits into
NVIDIA:mainfrom
temrjan:fix/exclude-venv-from-build-context

Conversation

@temrjan

@temrjan temrjan commented Mar 24, 2026

Copy link
Copy Markdown

Summary

Fixes #774

  • Add cleanup of .venv, __pycache__, .pytest_cache, and .env* files from the staged build context after cp -r, following the existing node_modules cleanup pattern (line 544)
  • Add .venv, .env, .env.* to .dockerignore as defense-in-depth for direct docker build invocations

Details

The staging step in createSandbox() copies the entire nemoclaw-blueprint/ tree into a temp build context via cp -r. If a developer has a .venv directory (from running uv sync, pytest, etc.), it gets baked into the sandbox image, causing:

  1. Build failures — packaging can choke on .venv contents
  2. Security risk.venv and .env files may contain secrets that should not be in image layers

Test plan

  • Verified npm test (vitest) — no new test failures introduced (1 pre-existing Windows PATH separator failure on main)
  • Manual: create .venv/ and .env in nemoclaw-blueprint/, run nemoclaw onboard, verify they are excluded from build context

🤖 Generated with Claude Code

Summary by CodeRabbit

Changes

  • Chores
    • Excluded virtual environment and environment files from Docker build context to reduce build time and artifact size.
    • Cleaned sandbox staging by removing Python caches and top-level env files to produce cleaner, more consistent build sandboxes and smaller deliverables.

The staging step copies the entire nemoclaw-blueprint/ tree into a temp
build context via cp -r. If a developer has a .venv directory (from
running uv sync, pytest, etc.), it gets baked into the sandbox image,
causing build failures and risking secret leakage.

Add cleanup of .venv, __pycache__, .pytest_cache, and .env files from
the staged build context, following the existing node_modules cleanup
pattern. Also add .venv and .env to .dockerignore as defense-in-depth
for direct docker build invocations.

Fixes NVIDIA#774

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 15dac791-62f8-4cf9-bc96-e5c9fa83afb9

📥 Commits

Reviewing files that changed from the base of the PR and between 9450dd4 and bf5620c.

📒 Files selected for processing (1)
  • bin/lib/onboard.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • bin/lib/onboard.js

📝 Walkthrough

Walkthrough

Exclude local Python artifacts and top-level env files from Docker build contexts: .dockerignore now lists .venv, .env, and .env.*, and the sandbox staging step removes .venv, __pycache__, .pytest_cache, and .env* from the copied blueprint before image build.

Changes

Cohort / File(s) Summary
Docker Build Exclusions
\.dockerignore
Added ignore patterns for .venv, .env, and .env.* to prevent local virtualenvs and env files from being sent to the Docker daemon during builds.
Sandbox Staging Cleanup
bin/lib/onboard.js
createSandbox(gpu) now deletes nemoclaw-blueprint's .venv, __pycache__, .pytest_cache, and top-level .env / .env.* files from the temporary build context (best-effort removals) before building the sandbox image.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 I tidy the build with a twitch and a cheer,
.venv and .env hop far from here.
Layers stay lean, secrets kept tight,
Sandbox bakes clean through day and night. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: excluding .venv and caches from the sandbox build context, which is the primary objective of the PR.
Linked Issues check ✅ Passed The PR successfully addresses issue #774 by excluding local artifacts (.venv, pycache, .pytest_cache, .env*) from the Docker build context through both .dockerignore updates and cleanup in createSandbox().
Out of Scope Changes check ✅ Passed All changes are directly scoped to addressing issue #774: updates to .dockerignore for Docker build exclusions and cleanup logic in onboard.js for sandbox staging.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

546-549: Harden .env* deletion pipeline to handle directories and filenames with whitespace.

At Line 549, find ... | xargs rm -f fails to remove directories and mishandles filenames with whitespace or newlines. Use null-delimited output with xargs -0 and rm -rf instead:

Proposed diff
-  run(`find "${buildCtx}/nemoclaw-blueprint" -maxdepth 1 -name ".env" -o -name ".env.*" | xargs rm -f 2>/dev/null`, { ignoreError: true });
+  run(
+    `find "${buildCtx}/nemoclaw-blueprint" -maxdepth 1 \\( -name ".env" -o -name ".env.*" \\) -print0 | xargs -0 rm -rf -- 2>/dev/null`,
+    { ignoreError: true }
+  );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 546 - 549, The deletion pipeline in the loop
uses run(...) and the subsequent find + xargs pipeline that calls rm -f fails on
directories and filenames with whitespace/newlines; update the command invoked
by run for removing .env* entries in the ktermbuild context (the existing call
that references buildCtx and "nemoclaw-blueprint") to use null-delimited output
and recursive removal (e.g., use find ... -print0 piped to xargs -0 rm -rf) so
directories are removed and names with spaces/newlines are handled safely; keep
the run(..., { ignoreError: true }) invocation and only change the shell
pipeline invoked.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 546-549: The deletion pipeline in the loop uses run(...) and the
subsequent find + xargs pipeline that calls rm -f fails on directories and
filenames with whitespace/newlines; update the command invoked by run for
removing .env* entries in the ktermbuild context (the existing call that
references buildCtx and "nemoclaw-blueprint") to use null-delimited output and
recursive removal (e.g., use find ... -print0 piped to xargs -0 rm -rf) so
directories are removed and names with spaces/newlines are handled safely; keep
the run(..., { ignoreError: true }) invocation and only change the shell
pipeline invoked.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 06600e82-4950-4e6e-94d4-402c5e549083

📥 Commits

Reviewing files that changed from the base of the PR and between 6e1208c and 9450dd4.

📒 Files selected for processing (2)
  • .dockerignore
  • bin/lib/onboard.js

Address review feedback: use -print0 and xargs -0 to correctly
handle filenames with whitespace, and group -name predicates with
parentheses for correct logical evaluation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this PR with a detailed summary, it addresses an issue with the sandbox build context and proposes a fix to improve the testing and Docker experience, which could enhance the overall quality of NemoClaw.

@temrjan

temrjan commented Mar 31, 2026

Copy link
Copy Markdown
Author

Closing this — the core issue (#774) has been addressed upstream via shouldIncludeBuildContextPath() which filters .venv, __pycache__, .pytest_cache and others during copy rather than removing after. A cleaner approach than my post-copy cleanup.

The .dockerignore was also updated upstream with the relevant patterns.

Thanks for the review feedback!

@temrjan temrjan closed this Mar 31, 2026
@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure area: packaging Packages, images, registries, installers, or distribution bug-fix PR fixes a bug or regression feature PR adds or expands user-visible functionality platform: container Affects Docker, containerd, Podman, or images and removed area: packaging Packages, images, registries, installers, or distribution Docker feature PR adds or expands user-visible functionality labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure bug-fix PR fixes a bug or regression platform: container Affects Docker, containerd, Podman, or images

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sandbox image build includes nemoclaw-blueprint/.venv from local tree (breaks build / risks leaking secrets)

2 participants