[DGX Spark] Gateway crash loop on startup: @homebridge/ciao networkInterfaces() returns EPERM in OpenShell sandbox

Hit a clean reproducible crash on a fresh DGX Spark install and figured I'd write it up since the workaround is also clean. Setup details first, then the crash, then what I tried.

## Setup

Fresh install on:

- ASUS GX10 (NVIDIA DGX Spark, GB10 Grace Blackwell, 128GB unified memory)
- DGX OS 24.04
- NemoClaw 0.1.0
- OpenClaw v2026.4.2 inside the sandbox
- Node 22.22.1 inside the sandbox
- Balanced policy preset (default from the onboard wizard)
- Local Ollama, model `nemotron-3-super:120b`
- One Telegram channel

## What's happening

Onboard finishes successfully. `nemoclaw <name> status` initially shows everything green. But the gateway never actually serves anything. Messages to the Telegram bot get no reply.

Tail the gateway log and you can see why. Every time it boots, it crashes on the same line, then health-monitor restarts it, then it crashes again. Loops forever.

```
[gateway] listening on ws://127.0.0.1:18789, ws://[::1]:18789 (PID <n>)
[gateway] log file: /tmp/openclaw-998/openclaw-2026-04-25.log
[gateway] security warning: dangerous config flags enabled: ...
[openclaw] Unhandled promise rejection: SystemError: A system error occurred: uv_interface_addresses returned Unknown system error 1 (Unknown system error 1)
    at Object.networkInterfaces (node:os:218:16)
    at Function.assumeNetworkInterfaceNames (/usr/local/lib/node_modules/openclaw/node_modules/@homebridge/ciao/src/NetworkManager.ts:527:23)
    at NetworkManager.getCurrentNetworkInterfaces (/usr/local/lib/node_modules/openclaw/node_modules/@homebridge/ciao/src/NetworkManager.ts:370:32)
```

The status output hints at the symptom but the suggested fix doesn't actually work:

```
OpenClaw: not running

The sandbox is alive but the OpenClaw gateway process is not running.
This typically happens after a gateway restart (e.g., laptop close/open).

To recover, run:
  nemoclaw <name> connect  (auto-recovers on connect)
```

`connect` doesn't auto-recover, because the respawned gateway hits the same crash on the same line every time.

## What's actually going wrong

`@homebridge/ciao` (the mDNS/Bonjour library OpenClaw bundles for local network discovery) calls `os.networkInterfaces()` during init. Inside the OpenShell sandbox the underlying libuv call, `uv_interface_addresses`, fails with errno 1, which is EPERM on Linux (that's the "Unknown system error 1" in the trace), because seccomp is blocking the netlink socket family. Node turns that into a SystemError, ciao doesn't catch it, and the unhandled rejection takes the gateway down.

You can confirm the netlink restriction independently from inside the sandbox:

```
$ ss -tlnp 2>&1 | head -1
Cannot open netlink socket: Operation not permitted
```
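The failing call can also be checked from Node directly, without going through the gateway. A minimal probe, assuming it is run with the sandbox's own Node binary; the file name and the `probeNetworkInterfaces` helper are mine, not part of OpenClaw:

```javascript
// probe-interfaces.js: checks whether os.networkInterfaces() works in this environment.
const os = require("node:os");

// Returns a small report instead of throwing, so the result is easy to paste into an issue.
function probeNetworkInterfaces(osModule = os) {
  try {
    const interfaces = osModule.networkInterfaces();
    return { ok: true, names: Object.keys(interfaces) };
  } catch (err) {
    // Inside the sandbox this is expected to be the SystemError from the trace above.
    return { ok: false, code: err.code, message: err.message };
  }
}

console.log(probeNetworkInterfaces());
```

On an unrestricted host this should report `ok: true` with the interface names. Inside the sandbox I'd expect `ok: false` with the same message as in the gateway log.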

One thing worth flagging: ciao's mDNS isn't actually used by any of the supported channels. Telegram, Slack, and Discord all reach out over plain HTTPS. The crash happens just because the library gets loaded and tries to list interfaces eagerly on startup.

## Workaround that worked

Override `os.networkInterfaces` before ciao loads, via a NODE_OPTIONS preload:

```
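# create the patch directory first in case it doesn't exist yet
mkdir -p /sandbox/.openclaw-data/patch
echo 'require("os").networkInterfaces=()=>({});' > /sandbox/.openclaw-data/patch/preload.js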

NODE_OPTIONS="--require /sandbox/.openclaw-data/patch/preload.js" \
  openclaw gateway run
```

That's it. ciao gets `{}`, gives up on mDNS, gateway stays up. Telegram channel connects, chat with Nemotron 120B works fine.
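The one-liner above replaces `os.networkInterfaces` unconditionally. A slightly more careful variant of the preload, as a sketch: it keeps the real call where it works and only returns `{}` when the call throws. The `installSafeNetworkInterfaces` name is mine, and I'm assuming an empty result is as acceptable to ciao here as it was with the one-liner:

```javascript
// preload.js variant (sketch): fall back to {} only when the real call fails.
const os = require("node:os");

function installSafeNetworkInterfaces(osModule = os) {
  const original = osModule.networkInterfaces;
  osModule.networkInterfaces = function safeNetworkInterfaces() {
    try {
      return original.call(osModule);
    } catch (err) {
      // Assumption: callers can cope with "no interfaces", as ciao did in the workaround above.
      return {};
    }
  };
  return osModule;
}

// Patch the real os module as soon as the preload is required.
installSafeNetworkInterfaces();
```

It loads the same way, through `NODE_OPTIONS="--require ..."`. One caveat that applies to both versions: code that uses ESM named imports from `node:os` may not see the patched function unless `module.syncBuiltinESMExports()` is called after patching. I haven't checked how ciao is loaded inside OpenClaw beyond the fact that the one-liner works.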

## The real problem

This workaround can't be made persistent today. NemoClaw's config schema doesn't expose env vars or a preload path. I tried every variation I could think of:

```
$ nemoclaw <name> config set --key gateway.env.NODE_OPTIONS --value "..."
Cannot modify the gateway section directly.

$ nemoclaw <name> config set --key env.NODE_OPTIONS --value "..."
Key validation failed: "env.NODE_OPTIONS" is not a recognized openclaw config path.

(same error for sandbox.env.*, sandbox.envVars.*, etc.)
```

So in practice I'm running a relaunch script by hand every time the gateway dies (laptop close, container restart, anything that triggers a respawn). Not viable for an always-on assistant. The whole point is being able to ping it from Telegram any time.
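For reference, the manual relaunch amounts to something like the sketch below. The `buildGatewayEnv` and `superviseGateway` helpers are hypothetical, the restart delay is arbitrary, and this assumes nothing else (for example the health-monitor's own respawn) is already holding the gateway port:

```javascript
const { spawn } = require("node:child_process");

// Path taken from the workaround above.
const PRELOAD = "/sandbox/.openclaw-data/patch/preload.js";

// Append the --require flag to any NODE_OPTIONS already set, without duplicating it.
function buildGatewayEnv(baseEnv, preloadPath = PRELOAD) {
  const flag = `--require ${preloadPath}`;
  const existing = (baseEnv.NODE_OPTIONS || "").trim();
  if (existing.includes(flag)) return { ...baseEnv };
  return { ...baseEnv, NODE_OPTIONS: existing ? `${existing} ${flag}` : flag };
}

// Hypothetical supervisor: start the gateway and restart it a few seconds after it exits.
function superviseGateway(delayMs = 5000) {
  const child = spawn("openclaw", ["gateway", "run"], {
    env: buildGatewayEnv(process.env),
    stdio: "inherit",
  });
  child.on("error", (err) => console.error("failed to start gateway:", err.message));
  child.on("exit", (code) => {
    console.error(`gateway exited with code ${code}, restarting in ${delayMs} ms`);
    setTimeout(() => superviseGateway(delayMs), delayMs);
  });
}

// Call superviseGateway() to start the loop; it is deliberately not invoked here.
```

This only papers over the gap. It still has to be started by hand inside the sandbox, which is exactly the part that doesn't survive a container restart.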

## Possible fixes

A few options, roughly in order of surgical-ness:

1. Wrap the ciao NetworkManager calls in try/catch inside OpenClaw, and fall back to no mDNS if `os.networkInterfaces()` throws. Probably the smallest diff.

2. Add an `OPENCLAW_DISABLE_MDNS=1` env var (or a config flag) that skips loading ciao entirely. Most explicit user-facing fix.

3. Loosen the OpenShell sandbox seccomp profile to allow netlink sockets (`AF_NETLINK`). Probably not what you want for an isolation-focused product, but listing it for completeness.

4. As a stopgap until any of the above lands, expose `gateway.preload` or `gateway.env.*` in the NemoClaw config schema. That way users can persist the workaround through `nemoclaw config set` instead of running a script by hand.
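To make options 1 and 2 concrete, here is a rough sketch of what the guard could look like. I don't know how OpenClaw actually wires up ciao, so `createResponder` is a hypothetical stand-in for that setup code, `startMdns` is a made-up name, and `OPENCLAW_DISABLE_MDNS` is the proposed variable, which doesn't exist today:

```javascript
// Sketch of options 1 and 2 combined. `createResponder` stands in for whatever
// OpenClaw calls to set up ciao; it is a hypothetical hook, not a real API.
async function startMdns(createResponder, env = process.env) {
  // Option 2: explicit opt-out that skips the mDNS setup entirely.
  if (env.OPENCLAW_DISABLE_MDNS === "1") {
    return { responder: null, reason: "disabled" };
  }
  try {
    // Option 1: awaiting inside try/catch covers both sync throws and rejected promises.
    const responder = await createResponder();
    return { responder, reason: null };
  } catch (err) {
    console.warn(`mDNS unavailable, continuing without it: ${err.message}`);
    return { responder: null, reason: "error" };
  }
}
```

Since the log shows an unhandled promise rejection, the catch has to sit somewhere on the promise chain that reaches the ciao call. Errors raised later from timers or callbacks inside the library would need separate handling.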

## Repro

1. ASUS GX10 / DGX Spark with DGX OS 24.04, Docker preinstalled
2. `curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash`
3. `nemoclaw onboard`. Defaults, Local Ollama with `nemotron-3-super:120b`, Balanced policy preset, Telegram channel.
4. Onboard reports success.
5. Send a message to the bot. No reply.
6. `nemoclaw <name> connect`, then `tail /tmp/openclaw-998/openclaw-*.log`. Stack trace above repeats every ~4 minutes (health-monitor restart cadence).

Happy to send full logs or test fixes against my setup. Easy reproduction on a fresh install.

