Description
When the NVIDIA inference endpoint is unreachable (e.g. blocked by a firewall), openclaw tui shows an indefinite spinner with "connected" status and never surfaces an error message. The user has nothing to act on: no HTTP status, no error cause, no recovery hint. The spinner ran for over 3 minutes 42 seconds with no feedback before being manually cancelled.
Related fixed bug #6226597 covers stale health in nemoclaw status; this bug is specifically about the TUI agent chat pane failing silently on inference errors.
Environment
Device: DGX Spark (NVIDIA_DGX_Spark), hostname spark-8158
OS: Ubuntu 24.04.4 LTS (aarch64)
Architecture: aarch64
Node.js: v22.22.3
npm: 10.9.8
Docker: Docker version 29.2.1, build a5c7197
OpenShell CLI: openshell 0.0.44
NemoClaw: v0.0.53
OpenClaw: 2026.5.22 (a374c3a)
Steps to Reproduce
- Run nemoclaw onboard with the NVIDIA Endpoints provider (nvidia-prod), model nvidia/nemotron-3-super-120b-a12b. Verify inference is healthy:
  nemoclaw my-assistant status
  Inference: healthy
- Block NVIDIA endpoint IPs from Docker containers:
  sudo iptables -I DOCKER-USER -d 75.2.113.119 -j DROP
  sudo iptables -I DOCKER-USER -d 99.83.136.103 -j DROP
- Connect and start the TUI:
  nemoclaw my-assistant connect
  openclaw tui
- Type any prompt (e.g. "hello") and press Enter
- Observe the TUI status bar and main pane for up to 4 minutes
Expected Result
TUI surfaces a structured error within the gateway timeout (180s) including:
- HTTP status or cause (e.g. "HTTP 503 from upstream" or "connection refused")
- Which layer reported it (gateway proxy / upstream API)
- One-line recovery hint (e.g. "check egress policy" / "check API key")
The status bar should show "error" (not "connected") when inference fails.
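To make the expected error shape concrete, here is a minimal Python sketch of a structured error carrying the three fields listed above (cause, reporting layer, recovery hint). It is not taken from the OpenClaw codebase: the names InferenceError and classify are hypothetical, the mapping from failure to layer is an assumption, and the hints "check upstream service status" and "check gateway logs" are illustrative placeholders that do not appear in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InferenceError:
    """Hypothetical structured error for the TUI main pane."""
    layer: str                         # "gateway proxy" or "upstream API"
    cause: str                         # e.g. "HTTP 503 from upstream"
    hint: str                          # one-line recovery hint
    http_status: Optional[int] = None  # set only when an HTTP response arrived

    def render(self) -> str:
        # Single line suitable for the chat pane.
        return f"error [{self.layer}]: {self.cause} ({self.hint})"


def classify(exc: Optional[BaseException] = None,
             http_status: Optional[int] = None) -> InferenceError:
    """Map a failure to a structured error (assumed mapping, for illustration)."""
    if http_status is not None:
        # An HTTP status means the upstream API answered.
        hint = "check API key" if http_status in (401, 403) else "check upstream service status"
        return InferenceError("upstream API", f"HTTP {http_status} from upstream",
                              hint, http_status)
    if isinstance(exc, ConnectionRefusedError):
        return InferenceError("gateway proxy", "connection refused", "check egress policy")
    if isinstance(exc, TimeoutError):
        return InferenceError("gateway proxy", "connection timed out", "check egress policy")
    return InferenceError("gateway proxy", f"unexpected failure: {exc!r}", "check gateway logs")
```

For the scenario in this report (packets silently dropped), classify(exc=TimeoutError()).render() would produce a single line naming the layer, the cause, and the egress-policy hint.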
Actual Result
TUI shows an indefinite spinner with playful loading text:
⠦ flibbertigibbeting… • 3m 42s | connected
Status bar reads "connected" -- NOT "error" or "timeout".
Main pane shows NOTHING -- no error text, no HTTP status, no recovery hint.
The spinner continues indefinitely past the 180s gateway timeout.
User has no way to determine what failed or how to fix it.
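One way the spinner could be bounded is a client-side deadline around the pending request, so the status leaves "connected" once the 180s gateway timeout has passed. The Python sketch below only illustrates that idea under stated assumptions: run_with_deadline and stalled_request are hypothetical names, and the actual TUI may structure its request handling differently.

```python
import asyncio

GATEWAY_TIMEOUT_S = 180.0  # gateway timeout cited in this report


async def stalled_request():
    """Stand-in for an inference call whose packets are silently dropped."""
    await asyncio.sleep(3600)


async def run_with_deadline(request, timeout_s=GATEWAY_TIMEOUT_S):
    """Await `request`; return (status, message) for the status bar and main pane."""
    try:
        reply = await asyncio.wait_for(request, timeout=timeout_s)
    except asyncio.TimeoutError:
        # Deadline hit: stop the spinner and report instead of waiting forever.
        return ("error",
                f"no response from inference within {timeout_s:g}s; check egress policy")
    except OSError as exc:
        # Covers connection refused, network unreachable, and similar failures.
        return ("error", f"inference request failed: {exc}")
    return ("connected", reply)
```

Calling asyncio.run(run_with_deadline(stalled_request(), timeout_s=0.1)) returns an "error" status after roughly 0.1 seconds rather than hanging.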
Logs
TUI capture (tmux capture-pane after 3m42s):
⠦ flibbertigibbeting… • 3m 42s | connected
agent main | session main | inference/nvidia/nemotron-3-super-120b-a12b | tokens 6.5k/131k (5%)
No error text appeared in the main pane at any point during the 3m42s wait.
The only change was the spinner animation and elapsed time counter.
Verified that inference was actually blocked:
curl -s --connect-timeout 5 https://integrate.api.nvidia.com/v1/models
→ exit code 28 (timeout)
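The curl check above can also be expressed as a small TCP probe that distinguishes a timeout (what a DROP rule produces, matching curl exit code 28) from an active refusal. This is a sketch for triage only; probe_tcp is a hypothetical helper, not part of NemoClaw or OpenClaw, and it tests the TCP connection only, not TLS or the HTTP response.

```python
import socket


def probe_tcp(host, port, timeout_s=5.0):
    """Return 'reachable', 'timeout', 'refused', or 'unreachable: <reason>'."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "reachable"
    except socket.timeout:
        # No answer at all within the limit: typical of silently dropped packets.
        return "timeout"
    except ConnectionRefusedError:
        # The peer (or a REJECT rule) actively refused the connection.
        return "refused"
    except OSError as exc:
        # DNS failure, no route to host, and similar errors.
        return f"unreachable: {exc}"
```

For example, probe_tcp("integrate.api.nvidia.com", 443) mirrors the 5-second connect timeout used in the curl command.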
NVB#6236510