[Jetson Orin][aarch64] OpenClaw TUI HTTP 503 "inference service unavailable" #1908

@zNeill

Description

Description:
On Jetson Orin (platform), OpenClaw TUI fails with repeated HTTP 503 inference service unavailable and LLM request timed out when using NVIDIA cloud inference (nemotron-3-super-120b-a12b via integrate.api.nvidia.com).
The gateway proxy routes the request correctly (NET:OPEN ALLOWED) but the inference call fails with NET:FAIL on inference.local:443 approximately every 4 seconds, suggesting the request times out before the remote LLM can respond.

[Environment]
Device: NVIDIA IGX Orin Development Kit
OS: Ubuntu 22.04.5 LTS (Jetson L4T R36.4.6)
Architecture: aarch64
Node.js: v22.22.2
npm: 10.9.7
Docker: Docker version 29.1.2
OpenShell CLI: 0.0.26
NemoClaw: v0.0.16
OpenClaw: TUI shows openclaw-tui (version not retrievable due to 503 error)

[Steps to Reproduce]

  1. On Jetson Orin, install NemoClaw and complete onboarding with NVIDIA Endpoints provider
  2. Verify sandbox is created:
    nemoclaw list
    Output: test2 — model: nvidia/nemotron-3-super-120b-a12b, provider: nvidia-prod
  3. Connect to sandbox and launch TUI:
    nemoclaw test2 connect
    openclaw tui
  4. Send any prompt (e.g. "what's your name")
  5. Observe repeated 503 errors and eventual timeout

[Expected Result]
The TUI should connect to integrate.api.nvidia.com via the gateway inference proxy and return a response from nemotron-3-super-120b-a12b within the configured timeout.

[Actual Result] TUI shows repeated errors:

    HTTP 503:  inference service unavailable 
    HTTP 503:  inference service unavailable 
    HTTP 503:  inference service unavailable 
    HTTP 503:  inference service unavailable 
    run error: LLM request timed out.
    connected | error

Gateway logs show a repeating cycle (~4s interval):

    [sandbox] [OCSF] NET:FAIL [LOW] inference.local:443
    [sandbox] [OCSF] NET:OPEN [INFO] ALLOWED inference.local:443
    [sandbox] [INFO] [openshell_router] routing proxy inference request (streaming)
      endpoint=https://integrate.api.nvidia.com/v1
      path=/v1/chat/completions
      protocols=openai_chat_completions,openai_completions,openai_responses,model_discovery

The request is routed and allowed by policy, but fails repeatedly before the LLM can complete inference.
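
For triage, the cycle above can be tallied from a captured log. The following is a minimal sketch, assuming the OCSF lines always have the shape seen in the three sample lines quoted here; `parseOcsfLine` and `countByActivity` are hypothetical helpers, not part of OpenShell or NemoClaw.

```typescript
// Hypothetical triage helper. The line format is inferred only from the
// log excerpt in this issue and may not cover every OCSF event.
interface OcsfEvent {
  activity: string; // e.g. "NET:FAIL" or "NET:OPEN"
  severity: string; // e.g. "LOW" or "INFO"
  detail: string;   // remainder of the line, e.g. "ALLOWED inference.local:443"
}

const OCSF_LINE = /^\[sandbox\] \[OCSF\] (NET:[A-Z]+) \[([A-Z]+)\] (.+)$/;

// Returns the parsed event, or null for lines that are not OCSF network events.
function parseOcsfLine(line: string): OcsfEvent | null {
  const m = OCSF_LINE.exec(line.trim());
  return m ? { activity: m[1], severity: m[2], detail: m[3] } : null;
}

// Counts how often each activity (NET:FAIL, NET:OPEN, ...) appears.
function countByActivity(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const event = parseOcsfLine(line);
    if (event !== null) {
      counts.set(event.activity, (counts.get(event.activity) ?? 0) + 1);
    }
  }
  return counts;
}
```

A NET:FAIL count that tracks the NET:OPEN count one-to-one would be consistent with every allowed request failing, as described above.
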
Root Cause:
The remote NVIDIA cloud inference provider (nvidia-prod) does not have a timeout configured, unlike local providers (vllm, ollama) which set timeout_secs: 180.

  1. Missing timeout in blueprint profile:

    • File: nemoclaw-blueprint/blueprint.yaml
    • The default nvidia inference profile does NOT define timeout_secs
    • Local profiles (nim-local, vllm) correctly set timeout_secs: 180
  2. Missing --timeout flag for remote providers in onboard:

    • File: src/lib/onboard.ts (lines 3585-3614)
    • When running openshell inference set for nvidia-prod provider, no --timeout flag is passed
    • For local providers (vllm-local, ollama-local), the code correctly adds:
      "--timeout", String(LOCAL_INFERENCE_TIMEOUT_SECS) // 180 seconds
    • For remote providers (nvidia-prod, openai-api, anthropic-prod), NO timeout is added
  3. Blueprint runner respects timeout when defined but default profile omits it:

    • File: nemoclaw/src/blueprint/runner.ts (lines 311-313)
    • Code: if (inferenceCfg.timeout_secs !== undefined) { inferenceArgs.push("--timeout", ...) }
    • Since default profile has no timeout_secs, this branch is never taken
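
One possible direction for a fix, sketched under assumptions rather than taken from onboard.ts or runner.ts: always resolve a timeout, falling back to a per-provider default when the profile omits timeout_secs. `REMOTE_INFERENCE_TIMEOUT_SECS`, `InferenceConfig`, `resolveTimeoutSecs`, and `withTimeoutArgs` are hypothetical names, and 180 seconds for remote providers is only a placeholder mirroring the local value.

```typescript
// Hypothetical sketch; only --timeout, timeout_secs, the provider names, and
// LOCAL_INFERENCE_TIMEOUT_SECS = 180 come from the issue text.
const LOCAL_INFERENCE_TIMEOUT_SECS = 180;
const REMOTE_INFERENCE_TIMEOUT_SECS = 180; // assumed placeholder value

interface InferenceConfig {
  provider: string;
  timeout_secs?: number;
}

const LOCAL_PROVIDERS: readonly string[] = ["vllm-local", "ollama-local"];

// An explicit timeout_secs wins; otherwise fall back to a provider-class default
// so that remote providers (nvidia-prod, openai-api, ...) never go without one.
function resolveTimeoutSecs(cfg: InferenceConfig): number {
  if (cfg.timeout_secs !== undefined) {
    return cfg.timeout_secs;
  }
  return LOCAL_PROVIDERS.includes(cfg.provider)
    ? LOCAL_INFERENCE_TIMEOUT_SECS
    : REMOTE_INFERENCE_TIMEOUT_SECS;
}

// Appends the --timeout flag unconditionally, unlike the branch quoted above
// that only fires when timeout_secs is defined.
function withTimeoutArgs(baseArgs: readonly string[], cfg: InferenceConfig): string[] {
  return [...baseArgs, "--timeout", String(resolveTimeoutSecs(cfg))];
}
```

With this shape, a profile that omits timeout_secs still produces a `--timeout` argument, so the choice of default becomes explicit rather than inherited from the gateway.
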

Without an explicit timeout, OpenShell's gateway appears to fall back to a very short default HTTP idle timeout. Remote cloud LLM inference can take 10-30+ seconds for large models, so this default is insufficient and the gateway returns 503 before the LLM response arrives.
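
The suspected mechanism can be modelled in a few lines. This is an illustrative sketch only: `withTimeout` and `fakeInference` are hypothetical, the gateway's actual default timeout is not known from this report, and the millisecond values are scaled down for demonstration.

```typescript
// Rejects if `work` has not settled within timeoutMs, mimicking a proxy that
// gives up on a slow upstream and reports it as unavailable.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error("HTTP 503: inference service unavailable")),
      timeoutMs,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Stand-in for a remote model that answers after latencyMs.
function fakeInference(latencyMs: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve("ok"), latencyMs));
}

// Usage: the same upstream succeeds or fails depending only on the timeout.
//   withTimeout(fakeInference(30), 180)  resolves with "ok"
//   withTimeout(fakeInference(30), 4)    rejects with the 503 error
```

The upstream is healthy in both cases; only the caller's patience differs, which matches the routed-and-allowed but failing pattern in the gateway logs.
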

This is likely exacerbated on Jetson Orin due to:

  • ARM64 platform may have different network stack behavior
  • Possible additional latency from NVIDIA network path on Jetson devices

Bug Details

Priority: Unprioritized
Action: Dev - Open - To fix
Disposition: Open issue
Module: Machine Learning - NemoClaw
Keyword: NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw-SWQA-RelBlckr-Recommended

[NVB# 6081485]

Labels

NV QA: Bugs found by the NVIDIA QA Team
area: inference: Inference routing, serving, model selection, or outputs
integration: openclaw: OpenClaw integration behavior
platform: jetson: Affects Jetson AGX Thor or Orin