Description
The "Deployment Topology" section of docs/reference/architecture.md
(rendered at https://docs.nvidia.com/nemoclaw/latest/reference/architecture.html
under the "Deployment Topology" heading) shows a mermaid diagram and a 5-row
"Layering" table that describes the following architecture:
Host CLI → Docker daemon → OpenShell gateway container →
Embedded k3s cluster → Sandbox pod (Landlock + seccomp + netns)
The intro text is unconditional ("NemoClaw uses a Docker daemon. The
OpenShell gateway runs as a container that embeds a k3s cluster. The sandbox
runs as a Kubernetes pod inside that embedded cluster.").
On v0.0.44 / OpenShell 0.0.39, the actual runtime is the `docker` driver and
the deployment looks like this:
Host CLI → Docker daemon
  │
  ├── openshell-gateway (host PROCESS, not container)
  │     pid 142292, /localhome/.../openshell-gateway,
  │     endpoint http://127.0.0.1:8080,
  │     driver = "docker"
  │
  └── openshell-- (Docker CONTAINER,
        image openshell/sandbox-from:, runs the agent
        directly; no k3s, no pod, no kubernetes scheduling)
i.e.:
• the OpenShell gateway is a HOST process, not a Docker container
• there is NO embedded k3s cluster anywhere
• the sandbox is a plain Docker container, not a Kubernetes pod
• Landlock / seccomp / netns are applied to the container, not to a pod
A reader using the docs to reason about the security boundary, the host
attack surface, or the recovery flow will form an incorrect mental model of
where each component lives.
Probable root cause: the diagram describes a non-default driver (likely the
"vm" driver, which historically did embed k3s inside a container), whereas a
default install on the OpenShell version pinned by
`nemoclaw-blueprint/blueprint.yaml` (`min_openshell_version: "0.0.39"`,
`max_openshell_version: "0.0.39"`) ships with the docker driver active.
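The pin described above can be checked mechanically. The following is a minimal sketch, assuming the blueprint keeps both keys on their own `key: "value"` lines as quoted in this report; the helper names `pin_value` and `single_pin` are hypothetical and not part of NemoClaw or OpenShell.

```shell
# Sketch only: helper names are hypothetical; key names come from this report.

# Print the value of a `key: "value"` line from a YAML-like file.
pin_value() {
  # $1 = key name, $2 = path to the blueprint file
  awk -v key="$1" '
    $1 == key ":" {
      v = $2
      gsub(/"/, "", v)   # strip the surrounding double quotes
      print v
      exit
    }' "$2"
}

# Report whether min and max pin exactly one OpenShell version.
single_pin() {
  # $1 = path to the blueprint file
  min=$(pin_value min_openshell_version "$1")
  max=$(pin_value max_openshell_version "$1")
  if [ -n "$min" ] && [ "$min" = "$max" ]; then
    echo "pinned to $min"
  else
    echo "range ${min:-?}..${max:-?}"
  fi
}
```

On the host in this report, `single_pin ~/.nemoclaw/source/nemoclaw-blueprint/blueprint.yaml` would be expected to print a single pinned version, since both keys carry the same value.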
Environment
Device: ipp2-1558 (10.176.178.100), x86_64 server, 32 vCPU / 125 GB RAM, NVIDIA A100 80GB PCIe
OS: Ubuntu 24.04.4 LTS (Linux 6.17.0-23-generic)
Architecture: x86_64
Node.js: v22.x (installed via nvm by NemoClaw installer)
npm: bundled
Docker: 29.5.0
OpenShell CLI: 0.0.39
NemoClaw: v0.0.44
OpenClaw: N/A (docs-only bug)
Steps to Reproduce
1. Open https://docs.nvidia.com/nemoclaw/latest/reference/architecture.html
   and scroll to "Deployment Topology". Read both the mermaid diagram and
   the 5-row layering table immediately below.
2. Confirm the actual runtime on a v0.0.44 host:
     cat ~/.local/state/nemoclaw/openshell-docker-gateway/runtime.json
     ps -p $(cat ~/.local/state/nemoclaw/openshell-docker-gateway/openshell-gateway.pid) -o pid,cmd
     docker ps --format "{{.Names}}\t{{.Image}}"
3. Look for any embedded k3s cluster:
     docker ps | grep -iE "k3s|gateway"   # nothing matches
     pgrep -af k3s                        # nothing
4. Confirm the OpenShell version pin:
     grep -E "min_openshell|max_openshell" \
       ~/.nemoclaw/source/nemoclaw-blueprint/blueprint.yaml
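For repeated checks, the driver lookup in step 2 can be wrapped in a small helper. This is a rough sketch, assuming `runtime.json` stays a small flat JSON file with string fields such as `driver`, as shown in the Logs section; it is a line-oriented match rather than a real JSON parser, and the function names `json_string_field` and `describe_topology` are hypothetical. The mapping from `docker` to "host-process gateway" reflects only what was observed on this host.

```shell
# Sketch only: function names are hypothetical, not part of any NemoClaw tool.

# Extract a string-valued field from a small, flat JSON file.
json_string_field() {
  # $1 = field name, $2 = path to the JSON file
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$2" | head -n 1
}

# Summarise the topology implied by the driver recorded in runtime.json.
describe_topology() {
  # $1 = path to runtime.json
  driver=$(json_string_field driver "$1")
  case "$driver" in
    docker) echo "gateway: host process; sandbox: Docker container; no k3s" ;;
    "")     echo "unknown: no driver field found" ;;
    *)      echo "driver '$driver': not covered by this report" ;;
  esac
}
```

A possible invocation is `describe_topology ~/.local/state/nemoclaw/openshell-docker-gateway/runtime.json`. Drivers other than `docker` are deliberately reported as uncovered, because this report has no observations for them.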
Expected Result
The "Deployment Topology" diagram and the layering table accurately
describe what runs where on a v0.0.44 host with the default driver. Either
the doc presents the `docker` driver as the default and labels the k3s
variant as an alternative, or it shows both side-by-side and explains when
each is used.
Actual Result
Step 2 on this host:
  runtime.json:
    "driver": "docker"
    "endpoint": "http://127.0.0.1:8080"
    "openshellVersion": "0.0.39"
    "gatewayBin": "/localhome/local-glennz/.local/bin/openshell-gateway"
    "pid": 142292
  ps:
    PID    CMD
    142292 /localhome/local-glennz/.local/bin/openshell-gateway
  docker ps:
    openshell-hermes-bug- openshell/sandbox-from:1779088500 Up 2 hours
  → gateway is a host process (pid 142292), not a Docker container.
  → the only Docker container is the sandbox itself.
  → there is no separate gateway container, no embedded k3s, no pod
    scheduler.
Step 3:
  → no k3s anywhere on the host.
Step 4:
  min_openshell_version: "0.0.39"
  max_openshell_version: "0.0.39"
  → the pinned-and-only-supported OpenShell version is 0.0.39, which is
    the version that uses the docker driver. The doc topology cannot be
    realised on the supported OpenShell version.
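The step 3 result (no gateway or k3s container) can also be expressed as a count rather than an empty grep. The sketch below assumes it is fed the output of `docker ps --format "{{.Names}}\t{{.Image}}"` on stdin; the function names are hypothetical, and the `k3s|gateway` pattern is the same heuristic used in the reproduction steps, not an authoritative way to identify OpenShell components.

```shell
# Sketch only: function names are hypothetical.

# Count listed containers whose name or image mentions k3s or gateway.
count_gateway_like() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -ciE 'k3s|gateway' || true
}

# Turn the count into a one-line verdict.
containers_verdict() {
  n=$(count_gateway_like)
  if [ "$n" -eq 0 ]; then
    echo "no gateway or k3s container"
  else
    echo "$n gateway/k3s-like container(s)"
  fi
}
```

For example, `docker ps --format "{{.Names}}\t{{.Image}}" | containers_verdict` should report no gateway or k3s container on a host that matches the listing above.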
Specific incorrect claims in the layering table:
• "Docker daemon ... Runs the OpenShell gateway container."
Wrong — Docker runs the sandbox container; the gateway is a host process.
• "Gateway container ... Docker container ... Hosts the credential store,
the L7 proxy, and the embedded k3s control plane."
Wrong — there is no gateway container and no embedded k3s.
• "k3s ... Process tree inside the gateway container ... Kubernetes
control plane that schedules the sandbox pod."
Wrong — k3s does not run on the docker driver.
• "Sandbox pod ... Pod in the embedded k3s cluster ..."
Wrong — sandbox is a Docker container, not a Kubernetes pod.
Logs
$ cat ~/.local/state/nemoclaw/openshell-docker-gateway/runtime.json
{
  "version": 1,
  "pid": 142292,
  "driver": "docker",
  "platform": "linux",
  "arch": "x64",
  "endpoint": "http://127.0.0.1:8080",
  "desiredEnvHash": "eaa3f1da31b0055ba16cb068e20b2ff87cceaf470f4325361ffc3fe1a06bab35",
  "gatewayBin": "/localhome/local-glennz/.local/bin/openshell-gateway",
  "openshellVersion": "0.0.39",
  "dockerHost": "unix:///run/docker.sock",
  "createdAt": "2026-05-18T07:14:30.387Z"
}
$ docker ps --format "{{.Names}}\t{{.Image}}\t{{.Status}}"
openshell-hermes-bug-ccd4651e-ff25-430c-8665-413bec910f84 openshell/sandbox-from:1779088500 Up 2 hours
Suggested Fix
Two options:
(1) Rewrite the "Deployment Topology" section to describe the docker driver
(the default and the only driver supported by the pinned OpenShell
0.0.39). The corrected layering table would look like:
    Host CLI            Host process (nemoclaw on Node.js)
    Docker daemon       Host service
    OpenShell gateway   Host process (openshell-gateway), hosts the
                        credential store + L7 proxy. Endpoint
                        http://127.0.0.1:8080.
    Sandbox container   Docker container (openshell/sandbox-from:),
                        runs the OpenClaw agent and NemoClaw plugin
                        under Landlock + seccomp + container netns.
The mermaid diagram should drop the GWCON / K3S / POD nesting and
instead show Docker daemon on the same level as the gateway process,
with one or more sandbox containers as Docker peers of the gateway.
(2) If both drivers (`docker` and the legacy k3s-pod-in-container variant)
are still meant to be documented, present them side-by-side as two
deployment modes and call out which one the default OpenShell 0.0.39
install uses. Make it clear that the diagram with embedded k3s does
NOT apply to a default v0.0.44 install.
Either way the unconditional claim in the section intro ("NemoClaw uses a
Docker daemon. The OpenShell gateway runs as a container that embeds a k3s
cluster.") must be removed or made conditional, because it is false for
every v0.0.44 install on the pinned OpenShell 0.0.39.
Bug Details
| Field | Value |
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NemoClaw_Docs, NEMOCLAW_GH_SYNC_APPROVAL |
[NVB#6186777]