Description
On Linux hosts whose glibc is older than the openshell-gateway binary requirement (e.g. Ubuntu 22.04 with glibc 2.35 vs. gateway requirement 2.39+), NemoClaw auto-selects the containerized-compat gateway launch mode (`docker run --rm --name nemoclaw-openshell-gateway --network host ubuntu:24.04 /opt/nemoclaw/openshell-gateway`). The first onboard succeeds and the gateway runs in a docker container with `/proc/<pid>/exe = /usr/bin/docker`.
When a SECOND `nemoclaw onboard` runs on the same host (multi-sandbox flow), the product's drift detection compares the running gateway's executable path against the *host-mode* binary path and incorrectly judges the live gateway as stale. It then triggers `recreate` — spawning a NEW gateway process before the OLD one has been stopped — and the new process cannot bind port 8080 because the old container still holds it. Onboard aborts non-zero.
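The misjudgment described above can be illustrated with a minimal sketch of a drift check that knows about the container-compat launch mode. This is not NemoClaw's actual code: `RunningGateway`, `is_container_compat`, and `is_gateway_stale` are hypothetical names, and the only values taken from this report are the container name and the two executable paths.

```python
from __future__ import annotations

from dataclasses import dataclass

# Container name taken from the `docker run` command quoted in this report.
COMPAT_CONTAINER_NAME = "nemoclaw-openshell-gateway"


@dataclass
class RunningGateway:
    """Hypothetical snapshot of the live gateway process."""
    exe_path: str    # resolved target of the process's /proc exe link
    argv: list[str]  # process command line


def is_container_compat(gw: RunningGateway) -> bool:
    """True when the process looks like `docker ... --name <compat name> ...`."""
    if not gw.exe_path.endswith("/docker"):
        return False
    for i, arg in enumerate(gw.argv):
        if arg == f"--name={COMPAT_CONTAINER_NAME}":
            return True
        if arg == "--name" and i + 1 < len(gw.argv):
            if gw.argv[i + 1] == COMPAT_CONTAINER_NAME:
                return True
    return False


def is_gateway_stale(gw: RunningGateway, expected_host_exe: str) -> bool:
    """Compare against the host-mode path only when the gateway runs in host mode."""
    if is_container_compat(gw):
        # Assumption: a container-compat gateway is judged by other signals
        # (health, version), not by its executable path.
        return False
    return gw.exe_path != expected_host_exe
```

With this shape, the gateway from the first onboard (`/usr/bin/docker` plus `--name nemoclaw-openshell-gateway`) is not reported as stale, while a host-mode process at an unexpected path still is.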
Environment
OS: Ubuntu 22.04.5 LTS
Architecture: x86_64
Node.js: v22.22.3
npm: 10.9.8
Docker: Docker Engine 29.4.1
OpenShell CLI: openshell 0.0.44
NemoClaw: v0.0.53
OpenClaw: 2026.5.22
glibc: 2.35 (Ubuntu)
Steps to Reproduce
Verified live on the cloud-service-qa nemoclaw-test ubuntu22 runner (10.6.11.114, glibc 2.35) on 2026-05-29.
1. Run `nemoclaw onboard --non-interactive --yes-i-accept-third-party-software` for the default sandbox (`my-assistant`). The first gateway starts in container-compat mode: `docker run … nemoclaw-openshell-gateway … ubuntu:24.04 /opt/nemoclaw/openshell-gateway`. Both gateway and sandbox become healthy.
2. Run a SECOND `nemoclaw onboard --non-interactive --yes-i-accept-third-party-software` for a different sandbox name (`my-assistant-beta-t5882265`).
3. Watch the [2/8] Starting OpenShell gateway step.
Expected Result
Second onboard reuses the existing healthy gateway. No recreate, no port 8080 collision. Both sandboxes reach Ready state and inference is reachable concurrently. (Per spec 5.3.4-sandbox-lifecycle.md: "second sandbox onboard must reuse the existing gateway"; recreating with new TLS certs would invalidate the first sandbox.)
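The expected ordering (reuse a healthy gateway; if a recreate is truly needed, free the port before starting the replacement) can be sketched as follows. This is an illustrative outline under stated assumptions, not the product's implementation: `port_is_free` and `plan_gateway_steps` are hypothetical helpers, and the step names are labels invented for this example.

```python
from __future__ import annotations

import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 when a listener accepted the connection.
        return s.connect_ex((host, port)) != 0


def plan_gateway_steps(healthy: bool, stale: bool, port_free: bool) -> list[str]:
    """Decide what the gateway step should do, in order."""
    if healthy and not stale:
        # A healthy, current gateway is shared by all sandboxes.
        return ["reuse"]
    steps: list[str] = []
    if not port_free:
        # Never start the replacement while the old listener holds the port.
        steps += ["stop_old", "wait_port_free"]
    steps.append("start_new")
    return steps
```

For the scenario in this report, a correct drift check would yield `plan_gateway_steps(True, False, port_is_free(8080))`, which returns `["reuse"]` regardless of the port state.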
Actual Result
Second onboard aborts non-zero. Full trace:
[non-interactive] Agent: OpenClaw
NemoClaw Onboarding (non-interactive mode)
===================
[1/8] Preflight checks
✓ Docker is running / DNS / runtime ok / openshell 0.0.44 / port 8080 owned by healthy NemoClaw runtime
[2/8] Starting OpenShell gateway
Existing OpenShell Docker-driver gateway is stale
(executable=/usr/bin/docker (expected /home/gitlab-runner/.local/bin/openshell-gateway));
it will be recreated.
⚠ Gateway will be recreated when sandbox creation starts — this will affect running sandboxes.
!! Port 8080 is not available.
OpenShell gateway needs this port.
Blocked by: openshell (PID 181769)
Detail: sudo lsof reports openshell (PID 181769) listening on port 8080
Teardown (`nemoclaw uninstall --yes`) sequence on the same runner:
Stopped host openshell-gateway processes 181005 ← first onboard's gateway, stopped cleanly
Failed to stop host openshell-gateway processes 181769 ← orphaned recreate-spawned process, cannot be killed
Test impact: T5882265 beforeAll fails ("second sandbox onboard must reuse the existing gateway"), 3 sub-tests get skipped via skipRestOfSuiteAfterFailure.
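For triage across runner logs, the failure signature can be detected by extracting the actual and expected executable paths from the "is stale" message. The sketch below is a hypothetical helper written against the exact message format shown in the trace above; the regular expression and the name `parse_stale_reason` are assumptions, not part of NemoClaw.

```python
from __future__ import annotations

import re

# Matches: executable=<actual> (expected <expected>)
STALE_RE = re.compile(
    r"executable=(?P<actual>\S+)\s+\(expected\s+(?P<expected>[^)\s]+)\)"
)


def parse_stale_reason(log_text: str) -> tuple[str, str] | None:
    """Return (actual_exe, expected_exe) from a stale-gateway log, or None."""
    m = STALE_RE.search(log_text)
    if m is None:
        return None
    return m.group("actual"), m.group("expected")


def is_compat_false_positive(log_text: str) -> bool:
    """True when the gateway was flagged stale only because its exe is docker."""
    reason = parse_stale_reason(log_text)
    return reason is not None and reason[0].endswith("/docker")
```

Running `is_compat_false_positive` over the trace in this report flags it, because the reported executable is `/usr/bin/docker` while the expected path is the host-mode binary.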
Bug Details
| Field | Value |
| --- | --- |
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Onboard |
[NVB#6240888]