perf(onboard): honor .dockerignore for custom --from build contexts #4679

@ahunnargikar-nvidia

Description

Problem

nemoclaw onboard --from <Dockerfile> stages the Dockerfile's parent directory as the Docker build context, but the custom-context code path currently filters that directory with NemoClaw's built-in ignore list rather than the user's .dockerignore.

That protects common secret-heavy paths, but it does not behave like Docker's own context selection. Users who already maintain a .dockerignore can still pay avoidable copy and build-context transfer cost when large local directories, cache files, generated artifacts, or model assets are not covered by NemoClaw's hardcoded denylist.

This is a performance and UX issue for custom sandbox images. A broad custom build directory can make onboard look slow or stuck before Docker even starts useful image work, and the warning only reports aggregate size after NemoClaw has walked the tree.

Scope

Honor .dockerignore semantics when staging custom --from build contexts, while preserving NemoClaw's secret-safety denylist.

Candidate areas:

  • Parse .dockerignore from the custom Dockerfile's parent directory.
  • Apply .dockerignore rules during context size calculation and fs.cpSync staging.
  • Keep NemoClaw's existing secret/path denylist as an additional safety filter.
  • Keep the current hard failure when the Dockerfile itself is inside an ignored/denied path.
  • Improve large-context diagnostics so users can see why staging is expensive.
  • Add tests for included files, ignored files, negation rules, secret-denylisted paths, missing .dockerignore, and large-context warnings.
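
The parsing step in the list above could be sketched roughly as follows. This is a simplified illustration, not Docker's exact matcher and not NemoClaw code: it assumes only comments, blank lines, "!" negation, "*", "?", and "**" need handling, and that the last matching rule wins. The names patternToRegex, parseDockerignore, and isIgnored are hypothetical; a real implementation would more likely reuse an existing .dockerignore-compatible library.

```typescript
// Simplified .dockerignore matcher: an illustrative sketch under stated assumptions.

interface IgnoreRule {
  negated: boolean; // true for lines that start with "!"
  regex: RegExp;    // compiled form of the pattern
}

// Translate one pattern into a RegExp. Supports "*", "?", and "**" only.
function patternToRegex(pattern: string): RegExp {
  let out = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "*") {
      if (pattern.charAt(i + 1) === "*") {
        i++;
        if (pattern.charAt(i + 1) === "/") {
          i++;
          out += "(?:.*/)?"; // "**/" matches zero or more leading directories
        } else {
          out += ".*";       // bare "**" matches across path separators
        }
      } else {
        out += "[^/]*";      // "*" never crosses a path separator
      }
    } else if (ch === "?") {
      out += "[^/]";
    } else {
      out += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  // A rule that matches a directory also covers everything below it.
  return new RegExp(`^${out}(?:/.*)?$`);
}

function parseDockerignore(text: string): IgnoreRule[] {
  const rules: IgnoreRule[] = [];
  for (const raw of text.split(/\r?\n/)) {
    let line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const negated = line.startsWith("!");
    if (negated) line = line.slice(1).trim();
    // Patterns are relative to the context root: drop a leading "./" or "/" and a trailing "/".
    line = line.replace(/^\.?\/+/, "").replace(/\/+$/, "");
    if (line === "") continue;
    rules.push({ negated, regex: patternToRegex(line) });
  }
  return rules;
}

// relPath is relative to the context root and uses "/" separators.
function isIgnored(relPath: string, rules: IgnoreRule[]): boolean {
  let ignored = false;
  for (const rule of rules) {
    if (rule.regex.test(relPath)) ignored = !rule.negated; // last matching rule wins
  }
  return ignored;
}
```

In this sketch a pattern such as *.log only matches at the context root, which mirrors how Docker anchors patterns; matching at any depth needs **/*.log.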

Expected Behavior

Custom --from onboarding should stage the same intentional build context a user expects Docker to send, plus NemoClaw's additional secret exclusions.

If the effective staged context is large, the warning should be actionable: it should identify the effective size and enough high-cost paths to help the user fix the build directory or .dockerignore.
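
One possible shape for such a warning, as a sketch only: aggregate the effective (post-exclusion) file sizes by top-level path and report the largest few. The types FileEntry and ContextReport, the functions summarizeContext and formatWarning, the threshold value, and the warning wording are all hypothetical and not taken from NemoClaw.

```typescript
// Sketch: summarize an already-filtered file list into an actionable warning.

interface FileEntry {
  relPath: string; // path relative to the context root, "/"-separated
  bytes: number;   // file size in bytes
}

interface ContextReport {
  totalBytes: number;
  exceedsThreshold: boolean;
  topPaths: { path: string; bytes: number }[];
}

function summarizeContext(
  entries: FileEntry[],
  thresholdBytes: number,
  topN = 3,
): ContextReport {
  const byTopLevel = new Map<string, number>();
  let totalBytes = 0;
  for (const { relPath, bytes } of entries) {
    totalBytes += bytes;
    // Group by the first path segment so the report stays short.
    const slash = relPath.indexOf("/");
    const top = slash === -1 ? relPath : relPath.slice(0, slash);
    byTopLevel.set(top, (byTopLevel.get(top) ?? 0) + bytes);
  }
  const topPaths = Array.from(byTopLevel.entries())
    .map(([path, bytes]) => ({ path, bytes }))
    .sort((a, b) => b.bytes - a.bytes) // largest first
    .slice(0, topN);
  return { totalBytes, exceedsThreshold: totalBytes > thresholdBytes, topPaths };
}

// Hypothetical wording; the real message format is up to the implementation.
function formatWarning(report: ContextReport): string {
  const lines = [
    `Build context is ${report.totalBytes} bytes after exclusions. Largest paths:`,
  ];
  for (const p of report.topPaths) {
    lines.push(`  ${p.path}: ${p.bytes} bytes`);
  }
  return lines.join("\n");
}
```

Grouping by top-level path is a design choice for brevity; deeper grouping would give more precise hints at the cost of a longer message.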

Secret-like paths should remain excluded even if .dockerignore would include them.

Related Work

This issue complements #3775 by reducing avoidable custom Docker build-context cost before sandbox creation waits begin. It should use #3769 trace artifacts where useful to verify whether custom context staging is a meaningful part of slow --from runs.

Acceptance Criteria

  • Custom --from context staging honors .dockerignore rules from the Dockerfile parent directory.
  • NemoClaw's built-in secret denylist still wins over .dockerignore negation rules.
  • Context size calculation and copy/staging use the same effective include/exclude decision.
  • If the effective context exceeds the warning threshold, output includes the effective size and actionable large-path diagnostics.
  • Missing .dockerignore preserves current behavior except for any intentional diagnostic improvements.
  • Tests cover .dockerignore include/exclude behavior, negation, denied secret paths, missing .dockerignore, large context warnings, and cleanup on staging failure.
  • Documentation for onboard --from explains that .dockerignore is honored and that NemoClaw applies additional secret exclusions.
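
Several of these criteria can be illustrated together with a rough sketch: one include predicate in which the denylist is checked first (so a "!" rule cannot re-include a denied path), shared by both the size walk and the fs.cpSync filter, with cleanup if staging fails. The names IncludeFn, makeIncludeFn, contextBytes, and stageContext are hypothetical, and the two predicate arguments stand in for NemoClaw's actual denylist and a real .dockerignore matcher.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

// relPath is relative to the context root and uses "/" separators.
type IncludeFn = (relPath: string) => boolean;

// Denylist first: a path denied for secret safety stays excluded no matter
// what the .dockerignore rules say.
function makeIncludeFn(isDenied: IncludeFn, isDockerIgnored: IncludeFn): IncludeFn {
  return (relPath) => !isDenied(relPath) && !isDockerIgnored(relPath);
}

// Sum the sizes of regular files that the predicate includes.
function contextBytes(root: string, include: IncludeFn, rel = ""): number {
  let total = 0;
  const entries = fs.readdirSync(path.join(root, rel), { withFileTypes: true });
  for (const entry of entries) {
    const childRel = rel === "" ? entry.name : `${rel}/${entry.name}`;
    if (!include(childRel)) continue; // excluded directories are pruned entirely
    if (entry.isDirectory()) {
      total += contextBytes(root, include, childRel);
    } else if (entry.isFile()) {
      total += fs.statSync(path.join(root, childRel)).size;
    }
  }
  return total;
}

// Copy the context using the same predicate; remove partial output on failure.
function stageContext(root: string, dest: string, include: IncludeFn): void {
  try {
    fs.cpSync(root, dest, {
      recursive: true,
      filter: (src) => {
        const rel = path.relative(root, src).split(path.sep).join("/");
        return rel === "" || include(rel); // always accept the root itself
      },
    });
  } catch (err) {
    fs.rmSync(dest, { recursive: true, force: true });
    throw err;
  }
}

// Hypothetical helper for creating a scratch staging directory.
function makeStagingDir(prefix: string): string {
  return fs.mkdtempSync(path.join(os.tmpdir(), prefix));
}
```

Both walkers prune an excluded directory without descending into it, so this sketch cannot re-include a file beneath an excluded directory via a "!" rule. Whether that matches Docker's behavior closely enough is a decision left to the real implementation.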

Non-goals

  • Replacing Docker's build engine or changing openshell sandbox create --from.
  • Removing NemoClaw's built-in secret/path denylist.
  • Optimizing Dockerfile layer cache behavior.
  • Caching or prefetching sandbox base image resolution.
  • Parallelizing onboard orchestration.
  • Defining CI performance budgets.

Metadata

Labels

  • area: cli (Command line interface, flags, terminal UX, or output)
  • area: packaging (Packages, images, registries, installers, or distribution)
  • area: performance (Latency, throughput, resource use, benchmarks, or scaling)
  • platform: container (Affects Docker, containerd, Podman, or images)
  • v0.0.60 (Release target)