Description
On NemoClaw v0.0.38, with a sandbox that uses the ollama-local provider, stopping and restarting the host-side Ollama systemd unit breaks the sandbox's `inference.local` route. Reconnecting to the sandbox triggers the wizard's DNS-proxy auto-repair flow, which reports "DNS verification: 4 passed, 0 failed" yet immediately prints "Warning: inference.local is still unavailable after DNS proxy repair." The subsequent `openclaw agent --agent main -m "hello"` retry hangs indefinitely on "Waiting for agent reply..." with no response.
Per the T560821 expected result:
"After provider transient failure recovers, inference resumes WITHOUT
requiring re-onboard or sandbox destroy; nemoclaw status transitions
from degraded -> healthy within the documented window."
The current behavior fails this expectation: even after Ollama is fully back up on the host (`curl http://localhost:11434/api/tags` returns models and `systemctl is-active ollama` reports active), the sandbox cannot reach `inference.local` without manual intervention.
Environment
Device: 2u1g-x570-1795 (10.63.136.90)
OS: Ubuntu 24.04.4 LTS
Architecture: x86_64
GPU: NVIDIA RTX 6000 Ada Generation, 46068 MiB
Node.js: v22.22.2
npm: 10.9.7
Docker: 29.4.3
OpenShell CLI: openshell 0.0.36
NemoClaw: v0.0.38
OpenClaw: 2026.4.24
Sandbox: ollama-base (Ollama / qwen2.5:7b)
Ollama: 0.23.1 (host systemd service)
Steps to Reproduce
1. Fresh NemoClaw v0.0.38 install
2. nemoclaw onboard --fresh --name ollama-base; pick Local Ollama, default model
3. nemoclaw ollama-base connect; verify openclaw agent works (e.g.
openclaw agent --agent main -m "hello" --session-id baseline returns OK)
4. exit sandbox
5. sudo systemctl stop ollama
6. nemoclaw ollama-base connect
Observe: wizard prints "inference.local is unavailable inside 'ollama-base'.
Repairing sandbox DNS proxy..." then "DNS verification: 4 passed, 0 failed"
then "Warning: inference.local is still unavailable after DNS proxy repair."
7. Inside sandbox: openclaw agent --agent main -m "hello" --session-id fail-test
Observe: clear error mentioning the inference backend is unavailable. PASS.
8. exit sandbox
9. sudo systemctl start ollama; verify systemctl is-active=active and
curl -s http://localhost:11434/api/tags returns qwen2.5:7b
10. nemoclaw ollama-base connect (AGAIN)
Observe: SAME warning sequence — DNS verification 4 passed,
but "Warning: inference.local is still unavailable after DNS proxy repair."
11. Inside sandbox: openclaw agent --agent main -m "hello" --session-id retry-test
Observe: hangs on "Waiting for agent reply..." indefinitely (>30s, did not
respond within typical 7B Ollama timeout).
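The host-side check in step 9 can be scripted so that "Ollama is back up" is established the same way on every run. The following is a minimal Python sketch and not part of NemoClaw. The helper names `model_names` and `host_ollama_has_model` are hypothetical, and the sketch assumes the `/api/tags` response lists models under a `models` array with a `name` field.

```python
import json
import urllib.request


def model_names(tags_json: str) -> list[str]:
    """Return the model names listed in an Ollama /api/tags response body."""
    payload = json.loads(tags_json)
    # Assumption: models are listed as {"models": [{"name": "..."}, ...]}.
    return [m["name"] for m in payload.get("models", [])]


def host_ollama_has_model(
    model: str,
    url: str = "http://localhost:11434/api/tags",
    timeout: float = 5.0,
) -> bool:
    """Fetch /api/tags from the host and report whether `model` is listed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return model in model_names(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: treat as not ready.
        return False
```

Calling `host_ollama_has_model("qwen2.5:7b")` on the host before step 10 would confirm the precondition without relying on eyeballing the curl output.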
Expected Result
After step 9 (Ollama restored on host), step 10's reconnect should either:
(a) succeed cleanly without a "DNS proxy repair" hop (because Ollama on host
is the upstream, and the sandbox's inference.local -> Ollama proxy
should re-establish without filesystem mutation), OR
(b) if DNS-proxy auto-repair is needed, the repair must ACTUALLY restore
inference.local — not declare 4/4 verification PASS while simultaneously
warning the route is still unavailable.
After step 10/11, openclaw agent retry must return a normal reply within
~30-60s (typical qwen2.5:7b warm response latency). The transition from
degraded -> healthy must happen automatically without `nemoclaw destroy`
or `nemoclaw onboard --fresh`.
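The expected degraded -> healthy transition can be expressed as a bounded poll. The sketch below is illustrative only: `wait_until_healthy` is a hypothetical helper, the `probe` callable stands in for whatever health check is used (for example, a request to inference.local from inside the sandbox), and the 60-second default is an assumption taken from the ~30-60s figure above, not from a documented recovery window.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,   # assumed window; the documented value is not stated here
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    Returns True if the probe succeeded within the window, False otherwise.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A False return after host-side Ollama has been confirmed up would correspond to the failure described in this report.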
Actual Result
Step 6 wizard output:
Dashboard port forward to 'ollama-base' is missing or dead.
Re-establishing...
→ Found forward on sandbox 'ollama-base'
✓ Stopped forward of port 18789 for sandbox ollama-base
✓ Forwarding port 18789 to sandbox ollama-base in the background
Access at: http://127.0.0.1:18789/
Stop with: openshell forward stop 18789 ollama-base
Failed to re-establish the dashboard port forward.
Run `openshell forward start --background ollama-base` manually.
inference.local is unavailable inside 'ollama-base'. Repairing sandbox DNS proxy...
Setting up DNS proxy in pod 'ollama-base' (10.200.0.1:53 -> 10.43.0.10)...
[PASS] DNS forwarder running (pid=637): dns-proxy: 10.200.0.1:53 -> 10.43.0.10:53 pid=637
[PASS] resolv.conf -> nameserver 10.200.0.1
[PASS] iptables: UDP 10.200.0.1:53 ACCEPT rule present
[PASS] getent hosts github.com -> 140.82.116.4 github.com
DNS verification: 4 passed, 0 failed
Warning: inference.local is still unavailable after DNS proxy repair.
✓ Connecting to sandbox 'ollama-base'
Step 11 inside sandbox (after Ollama restarted on host):
$ openclaw agent --agent main -m "hello" --session-id retry-test
OpenClaw 2026.4.24 (cbcfdf6) — Less clicking, more shipping...
◓ Waiting for agent reply…
◒ Waiting for agent reply….
(no further output observed within 30s polling window)
Note: the same "Warning: inference.local is still unavailable" message appears in BOTH
the Ollama-down state (expected) AND the Ollama-up state (NOT expected).
This indicates that, once the inference.local route inside the sandbox has been
disrupted by a host-side restart, it stays broken regardless of whether
host-side Ollama is running.
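The contradiction in the wizard output (all DNS checks pass, yet the route is reported unavailable) can be detected mechanically in a regression test. The following Python sketch is an assumption-laden illustration: `classify_repair` and its return labels are hypothetical, and the matched strings are copied from the wizard output captured above, so they may need updating if the wording changes.

```python
import re

# Strings taken from the captured wizard output in this report.
DNS_SUMMARY = re.compile(r"DNS verification: (\d+) passed, (\d+) failed")
STILL_DOWN = "inference.local is still unavailable after DNS proxy repair"


def classify_repair(wizard_output: str) -> str:
    """Classify the outcome of the DNS-proxy auto-repair from wizard output."""
    match = DNS_SUMMARY.search(wizard_output)
    if match is None:
        # No verification summary: the repair flow did not run.
        return "no-repair-attempted"
    if int(match.group(2)) > 0:
        return "dns-repair-failed"
    if STILL_DOWN in wizard_output:
        # DNS checks all passed but the route is still reported unavailable:
        # this is the inconsistent state described in this report.
        return "dns-ok-route-broken"
    return "repaired"
```

Feeding the step 10 output to `classify_repair` yields `"dns-ok-route-broken"`, which a test could assert against once Ollama is confirmed up on the host.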
Logs
Full T560821 per-step capture:
/home/lab/day0-automation/20260511/report-T560821.txt
Related bugs (different scenario / fixed):
6128212 — curl to inference.local/v1/models returns null exit code, ERR_IPC_CHANNEL_CLOSED (Open, different code path: HTTP 503 rather than DNS auto-repair)
6115797 — TUI fails with "Connection error" — sandbox security guard blocks inference.local route (Fixed/Verified — onboard-time bug, this is a runtime recovery bug)
6116337 — Ollama-local returns HTTP 401 (Fixed/Verified — credentials path)
Bug Details
| Field | Value |
| --- | --- |
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Inference, NemoClaw_Policy&Network |
[NVB#6166935]