[Ubuntu 22.04][Onboard] nemohermes re-onboard re-asks all 5 messaging per-channel prompts; credentials.json never written; "Messaging: none" on Run 2 review

## Description

Description
<pre>The wizard accepts all five per-channel values on the first run and prints "✓ saved" after each.
However, a second `nemohermes onboard --recreate-sandbox` re-asks every one of them,
the second-run review screen shows "Messaging: none",
and ~/.nemoclaw/credentials.json is never written. The spec expects "already set" / skip messages and an 8-key credentials.json.
</pre>Environment

<pre>Device: Brev shell `nemoclaw-0514` (shadeform-managed; host brev-w4rqzli3u)
OS: Ubuntu 22.04.5 LTS, kernel 6.8.0-90-generic
Architecture: x86_64
GPU: NVIDIA H100 PCIe (81559 MiB)
Node.js: v22.22.3
npm: 10.9.8
Docker: 29.1.3 (server)
OpenShell CLI: openshell 0.0.39
NemoClaw: v0.0.43
NemoHermes: v0.0.43
OpenClaw: Not reached (sandbox build failed at GPU patch step — see related bug section)
Docker nvidia runtime: registered (Default Runtime: nvidia)
nvidia-container-toolkit: 1.19.0-1
</pre>Steps to Reproduce

<pre>Pre-conditions:
 - NemoClaw + NemoHermes v0.0.43 installed (curl|bash; license accepted)
 - ~/.nemoclaw/credentials.json absent (confirm with ls)
 - Export three bot tokens (placeholder format OK; test verifies prompt persistence
 not message delivery):
 export TELEGRAM_BOT_TOKEN="123456789:AAAAAAAAAAAAAAAAAAAAAAAAAAAA****"
 export DISCORD_BOT_TOKEN="MTAxMjM0NTY3ODkwMTIzNDU2Nzg5MA.Gabcde.AAAAAAAA..."
 export SLACK_BOT_TOKEN="xoxb-1234567890123-1234567890123-AbCdEfGhIjKlMnOp..."

Run 1:
 1. nemohermes onboard --recreate-sandbox
 2. Walk through wizard: provider 1 (NVIDIA Endpoints) → API key → default model
 (Nemotron 3 Super 120B) → sandbox name "hermes" → Apply.
 3. At [5/8] Messaging channels, all three channels are auto-toggled ON
 (env tokens detected). Press Enter to advance to per-channel prompts.
 4. Fill the 5 spec-listed prompts in wizard order:
    a) Telegram "Reply only when @mentioned? [Y/n]:" → y
    b) Telegram "User ID (for DM access)": → 12345,67890
    c) Discord "Server ID": → 11111
    d) Discord "Reply only when @mentioned? [Y/n]:" → n
    e) Discord "User ID (optional guild allowlist)": → 22222
    (Slack also asks for App Token + Member IDs; these are not in the spec, but the wizard requires them.)
 5. Wizard prints "✓ saved" after each input.
 6. Inspect host state:
 ls -la ~/.nemoclaw/credentials.json
 python3 -c "import json; print(json.load(open('/home/.../onboard-session.json')).get('messagingChannels'))"

Run 2:
 7. Clean up failed sandbox: openshell sandbox delete hermes
 8. nemohermes onboard --recreate-sandbox (same env)
 9. Repeat the same wizard sequence and observe the messaging step.
</pre>Expected Result

<pre>Per T6002672 spec:
 1) Run 1 prompts for all five values; onboard completes.
 2) ~/.nemoclaw/credentials.json contains:
 TELEGRAM_BOT_TOKEN, TELEGRAM_ALLOWED_IDS, TELEGRAM_REQUIRE_MENTION,
 DISCORD_BOT_TOKEN, DISCORD_SERVER_ID, DISCORD_USER_ID, DISCORD_REQUIRE_MENTION,
 SLACK_BOT_TOKEN
 3) Run 2 reports "already set" (or equivalent skip) on each of the five
 per-channel prompts; the same values are NOT re-asked.
 FAIL signal per spec: any prompt re-asks a value entered in Run 1.
</pre>Actual Result

<pre>Run 1 — credentials.json NEVER written:
 $ ls -la ~/.nemoclaw/credentials.json
 ls: cannot access ...: No such file or directory

 $ python3 -c "import json, os; d=json.load(open(os.path.expanduser('~/.nemoclaw/onboard-session.json'))); print(d.get('messagingChannels'), d.get('messagingConfig'))"
 ['telegram','discord','slack'] None

 The five per-channel values are NOT on the host filesystem. Instead they are
 baked into the sandbox IMAGE build args (visible in the Dockerfile build log):
 ARG NEMOCLAW_MESSAGING_CHANNELS_B64=WyJkaXNjb3JkIiwic2x******
 ARG NEMOCLAW_MESSAGING_ALLOWED_IDS_B64=eyJ0ZWxlZ3JhbSI6******wIl0...
 ARG NEMOCLAW_DISCORD_GUILDS_B64=eyIxMTExMSI6eyJyZXF1aXJl****2UsIn...
 ARG NEMOCLAW_TELEGRAM_CONFIG_B64=eyJyZXF1aXJl*****J1ZX0=
 This is a different persistence model than the spec assumes.

Run 2 — Review configuration shows "Messaging: none":
 Provider: nvidia-prod
 Model: nvidia/nemotron-3-super-120b-a12b
 API key: NVIDIA_API_KEY (staged for OpenShell gateway registration)
 Web search: disabled
 Messaging: none ← KEY EVIDENCE: Run 1 config not remembered
 Sandbox name: hermes

Run 2 — All 5 spec-listed per-channel prompts re-asked verbatim:

 Prompt                               | Run 1 input    | Run 2 wizard behavior
 -------------------------------------+----------------+----------------------
 telegram Reply only when @mentioned? | y              | RE-ASKED
 telegram User ID (allowlist)         | 12345,67890    | RE-ASKED
 discord Server ID                    | 11111          | RE-ASKED
 discord Reply only when @mentioned?  | n              | RE-ASKED
 discord User ID                      | 22222          | RE-ASKED
 (extra) Slack App Token              | xapp-1-...     | RE-ASKED
 (extra) Slack Member IDs             | U01ABC..,U04.. | RE-ASKED

The "✓ telegram — already configured" header on Run 2 refers ONLY to the
bot TOKEN env var being present, NOT to the channel's full configuration
having survived from Run 1.
</pre>Logs

<pre>Run 1 wizard transcripts: /tmp/hermes-run1.log
Run 2 wizard transcripts: /tmp/hermes-run2b.log
Failure diagnostics: ~/.nemoclaw/onboard-failures/2026-05-15T09-41-20-223Z-hermes-docker-gpu-patch/

Two interpretations — needs PM/Eng triage:

(a) Product bug: the wizard's persistence layer for messaging config does not
 survive a re-onboard. The "✓ saved" messages are misleading because the
 values only land in the sandbox image build args (and even those are lost
 once `--recreate-sandbox` rebuilds the image from scratch on Run 2).
 To meet the spec, the wizard must persist the 5 per-channel values to a
 location that survives between onboards (credentials.json on host, OR
 a gateway-side store the wizard re-reads on launch).

(b) Spec is stale: the persistence model has intentionally shifted from
 host-side credentials.json to image-time build args. In that case, the
 T6002672 verification approach needs updating — e.g. assert that the
 NEMOCLAW_*_B64 ARGs inside the latest sandbox image match the entered
 values, AND make the wizard report "already set" on Run 2 by reading
 those ARG values from the existing image before recreating it.

Either way, the on-screen behavior meets the spec's FAIL criterion
("any prompt re-asks for a value the user already entered in run 1"): all
5 prompts re-ask.

Notes on related findings (separate issues, mentioned for triage context, not duplicates):
 - Docker GPU patch failed in Run 1 with "OpenShell supervisor did not
 reconnect to the GPU-enabled container." Different from the AMD CDI
 spec bug (6126101 / 6110214) — `--gpus all` mode select succeeded but
 supervisor reconnect timed out. Container stuck in Restarting loop.
 Should be filed separately if not already tracked.
 - Wizard preflight UX improvement (positive): when sandbox→gateway is
 blocked by UFW on the 172.18.0.0/16 bridge, the wizard now prints the
 exact `sudo ufw allow ...` remediation command. A big improvement over
 the earlier cryptic "auth proxy unreachable" error.
</pre>
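
For triage under interpretation (a), the spec's credentials.json expectation can be checked mechanically. The following is a minimal sketch, not the project's own verification code: the function name `missing_credential_keys` and the constant `EXPECTED_KEYS` are hypothetical, and it assumes credentials.json is a flat JSON object whose top-level keys are the eight names listed under Expected Result.

```python
import json
from pathlib import Path

# The eight keys listed under Expected Result in this report.
EXPECTED_KEYS = {
    "TELEGRAM_BOT_TOKEN", "TELEGRAM_ALLOWED_IDS", "TELEGRAM_REQUIRE_MENTION",
    "DISCORD_BOT_TOKEN", "DISCORD_SERVER_ID", "DISCORD_USER_ID",
    "DISCORD_REQUIRE_MENTION", "SLACK_BOT_TOKEN",
}


def missing_credential_keys(path):
    """Return the expected keys absent from the file (all of them if the file is missing).

    Assumption: the file is a flat JSON object keyed by the names above.
    """
    path = Path(path).expanduser()  # open() alone does not expand "~"
    if not path.is_file():
        return sorted(EXPECTED_KEYS)
    data = json.loads(path.read_text())
    return sorted(EXPECTED_KEYS - set(data))


if __name__ == "__main__":
    missing = missing_credential_keys("~/.nemoclaw/credentials.json")
    print("PASS" if not missing else f"FAIL, missing: {missing}")
```

On the host described in this report, the file does not exist after Run 1, so the helper would report all eight keys as missing.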

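For interpretation (b), a verification step could decode the `NEMOCLAW_*_B64` build args from a saved build log and compare them with the values entered in the wizard. This is a sketch under assumptions: the helper `decode_b64_args` and the pattern `ARG_LINE` are hypothetical names, the log is assumed to contain plain `ARG NAME=VALUE` lines, and the values are assumed to be base64-encoded JSON (the unmasked prefixes in the log above suggest this, but it is not confirmed). Masked or truncated values, like those quoted in this report, decode to `None`.

```python
import base64
import json
import re

# Matches lines such as "ARG NEMOCLAW_MESSAGING_CHANNELS_B64=<value>".
ARG_LINE = re.compile(r"^\s*ARG\s+(NEMOCLAW_\w+_B64)=(\S+)\s*$")


def decode_b64_args(log_text):
    """Map each NEMOCLAW_*_B64 ARG name to its decoded JSON value, or None if undecodable."""
    decoded = {}
    for line in log_text.splitlines():
        match = ARG_LINE.match(line)
        if not match:
            continue
        name, value = match.groups()
        try:
            # validate=True rejects non-base64 characters such as "*" or "."
            decoded[name] = json.loads(base64.b64decode(value, validate=True))
        except ValueError:  # covers binascii.Error, JSON and Unicode decode errors
            decoded[name] = None
    return decoded


if __name__ == "__main__":
    # Illustrative value generated here, not taken from a real build log.
    sample = base64.b64encode(json.dumps(["telegram", "discord"]).encode()).decode()
    log = (
        f"ARG NEMOCLAW_MESSAGING_CHANNELS_B64={sample}\n"
        "ARG NEMOCLAW_TELEGRAM_CONFIG_B64=eyJyZXF1aXJl*****J1ZX0=\n"
    )
    print(decode_b64_args(log))
```

Returning `None` for undecodable values keeps redacted log excerpts from being mistaken for real configuration.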
## Bug Details

| Field | Value |
|-------|-------|
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NemoClaw_CLI&UX, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Onboard |

---
[NVB#6180486]

[Ubuntu 22.04][Onboard] nemohermes re-onboard re-asks all 5 messaging per-channel prompts; credentials.json never written; "Messaging: none" on Run 2 review #3581
