[DGX Spark][Install] Onboard blocked by missing CDI device spec on fresh Spark with Skip OTA #3252

@zNeill

Description

On a fresh DGX Spark (FastOS 1.135.29, customer build), nemoclaw onboard fails at preflight with a "CDI device specs missing" error. The factory image ships nvidia-container-toolkit without the systemd unit that auto-generates /etc/cdi/nvidia.yaml, so no nvidia.com/gpu CDI spec is present on the host. The COMPUTEX POR mandates "Skip OTA in OOBE", so every new Spark user will hit this blocker before reaching the express setup flow.

Workaround: the user must manually run: sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
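
The workaround can be wrapped in a small check so the spec is only generated when one is missing. This is a minimal sketch, assuming the generated spec declares `kind: nvidia.com/gpu` (the device class named in the preflight message) and that `/etc/cdi` and `/var/run/cdi` are the directories Docker scans. The function names `has_nvidia_cdi_spec` and `ensure_nvidia_cdi_spec` are hypothetical helpers, not part of NemoClaw or nvidia-ctk; only the `nvidia-ctk cdi generate` command comes from this report.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the nvidia-ctk command itself comes from the report.

# Succeeds if any spec file in the given directories declares the nvidia.com/gpu kind.
has_nvidia_cdi_spec() {
  local dir
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    # -q: stop at first match, -s: stay silent about unmatched globs
    if grep -qsE '"?kind"?:[[:space:]]*"?nvidia\.com/gpu' \
         "$dir"/*.yaml "$dir"/*.yml "$dir"/*.json; then
      return 0
    fi
  done
  return 1
}

# Generates the spec only when none is found (assumed default locations).
ensure_nvidia_cdi_spec() {
  local out="${1:-/etc/cdi/nvidia.yaml}"
  if has_nvidia_cdi_spec "$(dirname "$out")" /var/run/cdi; then
    echo "NVIDIA CDI spec already present; nothing to do"
    return 0
  fi
  sudo mkdir -p "$(dirname "$out")"
  sudo nvidia-ctk cdi generate --output="$out"
}
```

After calling `ensure_nvidia_cdi_spec`, running `nvidia-ctk cdi list` should show the `nvidia.com/gpu` devices if generation succeeded.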

Related: NVBug 5118160 (toolkit-side fix in 1.18.0+), GitHub #3152/#3159 (NemoClaw preflight warning only, no auto-fix).
Environment
Device:        DGX Spark (spark-dadc, 10.173.104.110)
OS:            DGX Spark FastOS 1.135.29 (2026-04-13, customer build)
Architecture:  aarch64
Node.js:       v22.22.2
npm:           10.9.7
Docker:        Docker 29.2.1
OpenShell CLI: openshell 0.0.36
NemoClaw:      v0.0.37
OpenClaw:      N/A (onboard not completed)
nvidia-ctk:    1.19.0
Ollama:        0.23.2
Steps to Reproduce
1. Fresh DGX Spark with FastOS 1.135.29 (no OTA applied — matches COMPUTEX POR "Skip OTA")
2. Run: curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
3. Install completes (Node.js, NemoClaw CLI, OpenShell all installed)
4. Onboard starts automatically at step [3/3]
5. Observe preflight error
Expected Result
Onboard proceeds without manual CDI spec generation. Either:
(a) Installer auto-runs nvidia-ctk cdi generate, or
(b) Factory image ships with /etc/cdi/nvidia.yaml pre-generated, or
(c) Express setup handles this prerequisite automatically
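
Option (a) could be gated so the installer only generates a spec when it is both needed and possible. The following is a sketch under stated assumptions, not NemoClaw's actual installer logic: `plan_cdi_fix` and `run_cdi_fix` are hypothetical functions, and the three yes/no inputs stand in for checks the installer would perform (Docker configured for CDI, a spec already present, nvidia-ctk available on PATH).

```shell
#!/usr/bin/env bash
# Hypothetical decision helper for option (a); not NemoClaw's actual installer logic.
# Args: cdi_enabled spec_present ctk_available (each "yes" or "no").
# Prints one of: none | generate | manual
plan_cdi_fix() {
  local cdi_enabled="$1" spec_present="$2" ctk_available="$3"
  if [ "$cdi_enabled" != yes ] || [ "$spec_present" = yes ]; then
    echo none        # nothing blocks onboarding
  elif [ "$ctk_available" = yes ]; then
    echo generate    # the installer can run the generate command itself
  else
    echo manual      # fall back to the current warning
  fi
}

# Example wiring (assumed): act on the plan.
run_cdi_fix() {
  case "$(plan_cdi_fix "$@")" in
    generate) sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml ;;
    manual)   echo "nvidia-ctk not found; generate the CDI spec manually" >&2; return 1 ;;
    none)     : ;;
  esac
}
```

Keeping the decision separate from the privileged command makes the branch logic easy to test without a GPU or sudo access.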
Actual Result
[WARN] Host preflight found issues that will prevent onboarding right now.
  - Generate NVIDIA CDI device specs: Docker is configured for CDI device injection
    (CDISpecDirs is set) but no nvidia.com/gpu CDI spec is present on the host.
    OpenShell's gateway start --gpu will fail with
    "unresolvable CDI devices nvidia.com/gpu=all" until a spec is generated.
    sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
[WARN] Skipping onboarding until the host prerequisites above are fixed.
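
The Docker side of this warning can be inspected by hand. This is a minimal read-only sketch, assuming `docker info` exposes the configured directories under the `CDISpecDirs` field named in the message and that `/etc/cdi` and `/var/run/cdi` are the directories in use. `cdi_dirs_configured` and `report_cdi_state` are hypothetical helper names, not NemoClaw's preflight code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: interprets the JSON printed by
#   docker info --format '{{json .CDISpecDirs}}'
# Succeeds when at least one CDI spec directory is configured.
cdi_dirs_configured() {
  case "$(printf '%s' "$1" | tr -d '[:space:]')" in
    ''|null|'[]') return 1 ;;
    *) return 0 ;;
  esac
}

# Read-only report: does not modify the host.
report_cdi_state() {
  local dirs
  dirs="$(docker info --format '{{json .CDISpecDirs}}' 2>/dev/null)"
  if cdi_dirs_configured "$dirs"; then
    echo "Docker CDI spec dirs: $dirs"
    # Assumed default locations; an empty listing means no spec files exist yet.
    ls -l /etc/cdi /var/run/cdi 2>/dev/null
  else
    echo "Docker is not configured for CDI device injection"
  fi
}
```

If `report_cdi_state` prints the spec directories but lists no files, the host is in the state this warning describes.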
Logs
Not captured. The error appears in the installer stdout shown under Actual Result.

Bug Details

Priority:      Unprioritized
Action:        Dev - Open - To fix
Disposition:   Open issue
Module:        Machine Learning - NemoClaw
Keyword:       DGX_Spark_OTA_Computex, NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Install

[NVB#6157911]

Metadata

Labels

NV QA (Bugs found by the NVIDIA QA Team)
UAT (Issues flagged for User Acceptance Testing)
VDR (Linked to VDR finding)
area: cli (Command line interface, flags, terminal UX, or output)
area: install (Install, setup, prerequisites, or uninstall flow)
area: onboarding (Onboarding FSM, provider setup, sandbox launch, or first-run flow)
platform: dgx-spark (Affects DGX Spark hardware or workflows)
