[Jetson Orin][aarch64] OpenClaw TUI HTTP 503 "inference service unavailable"

## Description

On Jetson Orin, the OpenClaw TUI fails with repeated `HTTP 503: inference service unavailable` errors followed by `LLM request timed out` when using NVIDIA cloud inference (nemotron-3-super-120b-a12b via integrate.api.nvidia.com).
The gateway proxy routes the request correctly (NET:OPEN ALLOWED), but the inference call fails with NET:FAIL on inference.local:443 approximately every 4 seconds, which suggests the request times out before the remote LLM can respond.

[Environment]
 Device: NVIDIA IGX Orin Development Kit
 OS: Ubuntu 22.04.5 LTS (Jetson L4T R36.4.6)
 Architecture: aarch64
 Node.js: v22.22.2
 npm: 10.9.7
 Docker: Docker version 29.1.2
 OpenShell CLI: 0.0.26
 NemoClaw: v0.0.16
 OpenClaw: TUI shows openclaw-tui (version not retrievable due to 503 error)

[Steps to Reproduce]

 1. On Jetson Orin, install NemoClaw and complete onboarding with the NVIDIA Endpoints provider.
 2. Verify the sandbox is created:
    `nemoclaw list`
    Output: `test2 — model: nvidia/nemotron-3-super-120b-a12b, provider: nvidia-prod`
 3. Connect to the sandbox and launch the TUI:
    `nemoclaw test2 connect`
    `openclaw tui`
 4. Send any prompt (e.g. "what's your name").
 5. Observe repeated 503 errors and an eventual timeout.


[Expected Result]
The TUI should connect to integrate.api.nvidia.com via the gateway inference proxy and return a response from nemotron-3-super-120b-a12b within the configured timeout.



[Actual Result]
The TUI shows repeated errors:

<pre><code>HTTP 503: inference service unavailable
HTTP 503: inference service unavailable
HTTP 503: inference service unavailable
HTTP 503: inference service unavailable
run error: LLM request timed out.
connected | error</code></pre>

Gateway logs show a repeating cycle (~4s interval):
<pre><code>[sandbox] [OCSF] NET:FAIL [LOW] inference.local:443
[sandbox] [OCSF] NET:OPEN [INFO] ALLOWED inference.local:443
[sandbox] [INFO] [openshell_router] routing proxy inference request (streaming)
  endpoint=https://integrate.api.nvidia.com/v1
  path=/v1/chat/completions
  protocols=openai_chat_completions,openai_completions,openai_responses,model_discovery</code></pre>
 The request is routed and allowed by policy, but fails repeatedly before the LLM can complete inference.

[Root Cause]
 The remote NVIDIA cloud inference provider (nvidia-prod) does not have a timeout configured, unlike local providers (vllm, ollama) which set timeout_secs: 180.

 1. Missing timeout in blueprint profile:
 - File: nemoclaw-blueprint/blueprint.yaml
 - The default nvidia inference profile does NOT define timeout_secs
 - Local profiles (nim-local, vllm) correctly set timeout_secs: 180

 2. Missing --timeout flag for remote providers in onboard:
 - File: src/lib/onboard.ts (lines 3585-3614)
 - When running `openshell inference set` for nvidia-prod provider, no --timeout flag is passed
 - For local providers (vllm-local, ollama-local), the code correctly adds:
   `"--timeout", String(LOCAL_INFERENCE_TIMEOUT_SECS)` // 180 seconds
 - For remote providers (nvidia-prod, openai-api, anthropic-prod), NO timeout is added

 3. Blueprint runner respects timeout when defined but default profile omits it:
 - File: nemoclaw/src/blueprint/runner.ts (lines 311-313)
 - Code: `if (inferenceCfg.timeout_secs !== undefined) { inferenceArgs.push("--timeout", ...) }`
 - Since default profile has no timeout_secs, this branch is never taken
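
One possible direction suggested by the three points above is to emit `--timeout` for every provider, not only the local ones. The sketch below is illustrative and is not the code in `src/lib/onboard.ts` or `runner.ts`: the helper `timeoutArgs`, the `InferenceConfig` shape, and the constant `REMOTE_INFERENCE_TIMEOUT_SECS` (and its value) are assumptions. Only the `--timeout` flag, the `timeout_secs` field, and `LOCAL_INFERENCE_TIMEOUT_SECS` (180) come from this report.

```typescript
// Hypothetical sketch: always produce a --timeout argument, local or remote.
const LOCAL_INFERENCE_TIMEOUT_SECS = 180; // value quoted in this report
const REMOTE_INFERENCE_TIMEOUT_SECS = 180; // assumption: reuse the local value

// Assumed config shape; only timeout_secs is named in the report.
interface InferenceConfig {
  provider: string; // e.g. "nvidia-prod", "vllm-local"
  timeout_secs?: number; // optional per-profile override
}

// Local provider names as listed in the report.
const LOCAL_PROVIDERS = new Set<string>(["vllm-local", "ollama-local"]);

function timeoutArgs(cfg: InferenceConfig): string[] {
  // Fall back to a provider-class default when the profile omits timeout_secs.
  const fallback = LOCAL_PROVIDERS.has(cfg.provider)
    ? LOCAL_INFERENCE_TIMEOUT_SECS
    : REMOTE_INFERENCE_TIMEOUT_SECS;
  const secs = cfg.timeout_secs ?? fallback;
  return ["--timeout", String(secs)];
}
```

With this shape, `timeoutArgs({ provider: "nvidia-prod" })` yields `["--timeout", "180"]`, and a profile that sets `timeout_secs` takes precedence over the fallback. Whether 180 seconds is an appropriate default for remote providers is a separate decision for the maintainers.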

 Without an explicit timeout, OpenShell's gateway uses a very short default HTTP idle timeout, which is consistent with the ~4s failure cycle in the gateway logs. Remote cloud LLM inference can take 10-30+ seconds for large models, so this default is insufficient and the gateway returns 503 before the LLM response arrives.

 This is likely exacerbated on Jetson Orin due to:
 - ARM64 platform may have different network stack behavior
 - Possible additional latency from NVIDIA network path on Jetson devices
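
The failure mode described in the root-cause analysis can be reproduced in miniature. The toy model below is not OpenShell code: `slowUpstream`, `withDeadline`, and the millisecond values are illustrative assumptions, scaled down so the example runs quickly. It shows how a proxy-side deadline shorter than the upstream latency turns a slow but healthy response into a 503.

```typescript
// Toy model: a proxy answers 503 if the upstream has not replied by the deadline.
type ProxyResult =
  | { status: 200; body: string }
  | { status: 503; body: string };

// Simulated upstream that replies after latencyMs milliseconds.
function slowUpstream(latencyMs: number): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => resolve("completion"), latencyMs);
  });
}

function withDeadline(
  upstream: Promise<string>,
  deadlineMs: number,
): Promise<ProxyResult> {
  return new Promise<ProxyResult>((resolve) => {
    // Deadline fires first: report the service as unavailable.
    const timer = setTimeout(
      () => resolve({ status: 503, body: "inference service unavailable" }),
      deadlineMs,
    );
    upstream.then(
      (body) => {
        // Upstream replied in time: cancel the deadline and pass the body on.
        clearTimeout(timer);
        resolve({ status: 200, body });
      },
      () => {
        // Upstream failed outright: also surfaced as 503 in this model.
        clearTimeout(timer);
        resolve({ status: 503, body: "inference service unavailable" });
      },
    );
  });
}
```

Calling `withDeadline(slowUpstream(300), 20)` resolves to the 503 branch, while `withDeadline(slowUpstream(10), 200)` resolves to the 200 branch. In this model, raising the deadline above the upstream latency is the only change needed to get a successful response.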

## Bug Details

| Field | Value |
|-------|-------|
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw-SWQA-RelBlckr-Recommended |

---
[NVB# 6081485]
