Environment
- Product: NemoClaw
- Version: v0.0.58
- Host OS: Ubuntu 22.04 / Ubuntu 24.04
- Container runtime: Docker
- Inference: NVIDIA Endpoints
- GPU: NVIDIA GB10
Summary
When host outbound DNS is blocked at the OUTPUT chain (tcp/udp dport 53), host-level DNS and Docker pull DNS fails, but nemoclaw onboard preflight still reports "Container DNS resolution works" and proceeds. The DNS failure is only detected later during NVIDIA Endpoints validation, which prints curl: (6) Could not resolve host: integrate.api.nvidia.com.
This makes preflight misleading for operators: it suggests container DNS is healthy even when host-level DNS for providers is broken.
Preconditions
- NemoClaw installed and working on a host with Docker and NVIDIA GPU.
- NVIDIA Endpoints selected as inference provider during
nemoclaw onboard.
- No special Docker DNS configuration beyond defaults (so Docker containers rely on host resolver / systemd-resolved).
Steps to Reproduce
- On the host, block outbound DNS in the OUTPUT chain:
sudo iptables -I OUTPUT -p udp --dport 53 -j DROP
sudo iptables -I OUTPUT -p tcp --dport 53 -j DROP
- Confirm host DNS is broken:
nslookup registry.npmjs.org # or: curl https://integrate.api.nvidia.com
Both should fail with "communications error" / "Could not resolve host".
- Run NemoClaw onboarding:
nemoclaw onboard --non-interactive --yes \
--yes-i-accept-third-party-software \
--name host-dns-test
- Observe preflight output (step
[1/8]) and subsequent steps.
Expected Result
Preflight should either:
- Detect that provider-relevant DNS / host API reachability is broken, and fail early with a clear error, for example:
"Host DNS / provider hostname resolution failed during preflight. Could not resolve integrate.api.nvidia.com. Check host DNS, VPN, or outbound network ACLs."
OR
- Clearly distinguish between "container DNS OK" and "host/provider DNS unreachable", so operators are not misled by a blanket "Container DNS resolution works" when provider DNS failures are guaranteed later.
nemoclaw onboard should not continue past preflight as if DNS is healthy when host-level DNS for cloud providers is known-broken.
Actual Result
nemoclaw onboard preflight prints:
[1/8] Preflight checks
...
✓ Container DNS resolution works
...
Onboard proceeds to inference configuration and NVIDIA Endpoints selection.
During provider validation, the first visible failure appears:
Chat Completions API validation timed out; retrying in 5s...
...
NVIDIA Endpoints endpoint validation failed.
Chat Completions API: curl failed (exit 6): curl: (6) Could not resolve host: integrate.api.nvidia.com
Validation could not resolve the provider hostname. Check DNS, VPN, or the endpoint URL.
From an operator perspective:
- Preflight claims container DNS is fine.
- The actual DNS failure is only surfaced in step
[3/8] when contacting NVIDIA Endpoints, after the user has already provided credentials and picked a model.
Why this is a separate issue
The container-DNS preflight check behaves correctly when Docker's container path is blocked via DOCKER-USER rules (preflight fails with "Container DNS probe did not complete" and prints Docker/systemd-resolved remediation).
This issue is specifically about host-level DNS / provider reachability being caught only in provider validation, not in preflight, while preflight still prints "Container DNS resolution works" in scenarios where host DNS is clearly broken.
Proposed Improvement
Either:
-
Add a host DNS / provider reachability preflight that probes key provider hostnames (e.g., integrate.api.nvidia.com) and fails early with targeted remediation when they cannot be resolved, or
-
Adjust the preflight output to explicitly separate:
- "Container DNS resolution works (for internal/registry lookups)"
- "Provider DNS / cloud endpoints not checked until inference validation"
so operators aren't misled into thinking the entire DNS path is validated.
Environment
Summary
When host outbound DNS is blocked at the OUTPUT chain (tcp/udp dport 53), host-level DNS and Docker pull DNS fails, but
nemoclaw onboardpreflight still reports "Container DNS resolution works" and proceeds. The DNS failure is only detected later during NVIDIA Endpoints validation, which printscurl: (6) Could not resolve host: integrate.api.nvidia.com.This makes preflight misleading for operators: it suggests container DNS is healthy even when host-level DNS for providers is broken.
Preconditions
nemoclaw onboard.Steps to Reproduce
nslookup registry.npmjs.org # or: curl https://integrate.api.nvidia.com[1/8]) and subsequent steps.Expected Result
Preflight should either:
OR
nemoclaw onboardshould not continue past preflight as if DNS is healthy when host-level DNS for cloud providers is known-broken.Actual Result
nemoclaw onboardpreflight prints:Onboard proceeds to inference configuration and NVIDIA Endpoints selection.
During provider validation, the first visible failure appears:
From an operator perspective:
[3/8]when contacting NVIDIA Endpoints, after the user has already provided credentials and picked a model.Why this is a separate issue
The container-DNS preflight check behaves correctly when Docker's container path is blocked via
DOCKER-USERrules (preflight fails with "Container DNS probe did not complete" and prints Docker/systemd-resolved remediation).This issue is specifically about host-level DNS / provider reachability being caught only in provider validation, not in preflight, while preflight still prints "Container DNS resolution works" in scenarios where host DNS is clearly broken.
Proposed Improvement
Either:
Add a host DNS / provider reachability preflight that probes key provider hostnames (e.g.,
integrate.api.nvidia.com) and fails early with targeted remediation when they cannot be resolved, orAdjust the preflight output to explicitly separate:
so operators aren't misled into thinking the entire DNS path is validated.