Summary
Introduce a Dashboard Delivery Contract — a single source of truth for dashboard reachability configuration, a chain-level health verifier, and layered recovery. This replaces the current scattered approach where CORS origins, port forwarding, health probes, and access URLs are derived independently in 4+ files with no shared model.
Motivation
15 closed bugs, same files, same root cause
Over the past 5 weeks, 15 dashboard-related bugs have been filed, fixed as point patches, and closed — almost all touching onboard.ts, Dockerfile, nemoclaw-start.sh, and dashboard.ts:
| # | Date | Bug | Point fix |
|---|---|---|---|
| #311 | Mar 20 | Gateway not running after reboot | Manual restart docs |
| #328 | Mar 20 | Can't connect Mission Control to gateway | CORS + device auth patch |
| #820 | Mar 24 | Can't connect to localhost | Port forward docs |
| #20 | Apr 3 | Remote dashboard binds 127.0.0.1 only | Bake CHAT_UI_URL into Docker |
| #957 | Mar 31 | Forward binds wrong address, not restored after recreate | Fix forward target |
| #684 | Apr 13 | Default ports conflict with common services | Add configurable ports |
| #739 | Apr 13 | Restrictive allowedOrigins, can't update | Widen CORS in Dockerfile |
| #1376 | Apr 6 | Connect fails after gateway restart | Fix TLS cert handling |
| #1425 | Apr 8 | Gateway restart regenerates TLS, breaks connections | Certificate persistence |
| #716 | Apr 13 | WSL2 restart breaks everything | Fix reconnect logic |
| #1690 | Apr 10 | Destroy kills forward even when other sandbox uses it | Guard shared forward |
| #1925 | Apr 16 | Custom port not forwarded correctly | Inject NEMOCLAW_DASHBOARD_PORT |
| #1950 | Apr 16 | Re-onboard doesn't clean orphaned forward | Auto-cleanup in onboard |
| #2020 | Apr 17 | "Healthy" gateway is actually dead | Probe container, not stale metadata |
| #2167 | Apr 23 | Status doesn't show dashboard URL | Add URL to status output |
12 of these 15 would have been prevented by a contract that derives all dashboard config from a single source and verifies the full delivery chain.
6 more open bugs in the same area
| # | Age | Assignee | Title |
|---|---|---|---|
| #2042 | 6 days | unassigned | Pod restart leaves gateway + forward dead; recovery buried in connect |
| #2342 | today | jyaunches | Dashboard "Version n/a" / "Health Offline" on Brev Launchable |
| #1178 | 23 days | senthilr-nv, cjagwani | Openclaw UI link intermittently unhealthy on Brev |
| #2258 | 1 day | unassigned | Brev onboard UI: 7 failures (CORS, probe, forward, SSE) |
| #2174 | 2 days | cjagwani | Second onboard crashes on port 18789 conflict |
| #2100 | 3 days | cjagwani | No E2E test for dashboard reachability |
The Problem
The dashboard delivery chain has 4 links:
Link 1: Gateway Process → running inside sandbox on :18789
Link 2: Port Forward → SSH tunnel: host:18789 → sandbox:18789
Link 3: CORS / Auth Config → allowedOrigins includes the browser's origin
Link 4: External Routing → [Brev] nginx + cloudflared → host:18789
Today, each link is configured in a different file, checked in a different way (or not at all), and has no recovery mechanism:
- CHAT_UI_URL is derived independently in onboard.ts, Dockerfile, nemoclaw-start.sh, and dashboard.ts
- CORS origins are baked at Dockerfile build time with no runtime detection of the actual access URL
- Health probes check / (which returns 401 with auth enabled) instead of /health
- Port forwarding is fire-and-forget with no health monitoring
- No function exists that answers "is the dashboard reachable end-to-end and if not, which link is broken?"
- Recovery only happens as a hidden side-effect of nemoclaw connect
Proposed Architecture
1. dashboard-contract.ts — Single Source of Truth (~80 lines)
interface DashboardDeliveryChain {
  accessUrl: string;      // auto-detected: Brev public URL, WSL host IP, or loopback
  corsOrigins: string[];  // always includes accessUrl origin + loopback
  forwardTarget: string;  // loopback → port-only; non-loopback → 0.0.0.0:port
  healthEndpoint: string; // /health, which accepts 200 or 401 as "alive"
  port: number;           // from NEMOCLAW_DASHBOARD_PORT or default
}

function buildChain(options?: { chatUiUrl?: string; platform?: string }): DashboardDeliveryChain;
All consumers (Dockerfile, onboard.ts, nemoclaw-start.sh, status, connect) read from this instead of deriving independently.
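To make the derivation rules concrete, here is a minimal sketch of how buildChain() could compute every field from one input. It is an illustration, not the proposed implementation: the injected env parameter, the DEFAULT_PORT and LOOPBACK_HOSTS constants, and the omission of platform detection (Brev, WSL) are all assumptions made to keep the example pure and testable.

```typescript
// Hypothetical sketch of buildChain(); field names follow the proposed interface.
interface DashboardDeliveryChain {
  accessUrl: string;
  corsOrigins: string[];
  forwardTarget: string;
  healthEndpoint: string;
  port: number;
}

const DEFAULT_PORT = 18789; // the port used throughout this proposal
const LOOPBACK_HOSTS = ["localhost", "127.0.0.1", "[::1]"]; // assumed loopback set

// `env` is injected (an assumption) so the function has no hidden inputs.
function buildChain(
  options: { chatUiUrl?: string; env?: Record<string, string | undefined> } = {}
): DashboardDeliveryChain {
  const env = options.env ?? {};
  const rawPort = parseInt(env["NEMOCLAW_DASHBOARD_PORT"] ?? "", 10);
  // NaN and out-of-range values fall back to the default port.
  const port = rawPort > 0 && rawPort < 65536 ? rawPort : DEFAULT_PORT;

  const loopbackOrigin = `http://127.0.0.1:${port}`;
  const accessUrl = options.chatUiUrl ?? loopbackOrigin;
  const parsed = new URL(accessUrl);
  const isLoopback = LOOPBACK_HOSTS.indexOf(parsed.hostname) !== -1;

  // De-duplicate while keeping a stable order: access origin first, then loopback.
  const corsOrigins: string[] = [];
  for (const origin of [parsed.origin, loopbackOrigin, `http://localhost:${port}`]) {
    if (corsOrigins.indexOf(origin) === -1) corsOrigins.push(origin);
  }

  return {
    accessUrl,
    corsOrigins,
    // Loopback access forwards the port only; anything else binds all interfaces.
    forwardTarget: isLoopback ? String(port) : `0.0.0.0:${port}`,
    healthEndpoint: "/health",
    port,
  };
}
```

Because CORS origins and the forward target are both computed from the same accessUrl, they cannot drift apart the way independently derived values can.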
2. dashboard-health.ts — Chain Verification (~120 lines)
interface ChainStatus {
  healthy: boolean;
  links: {
    gateway: { ok: boolean; detail: string };
    forward: { ok: boolean; detail: string };
    cors: { ok: boolean; detail: string };
    external: { ok: boolean; detail: string };
  };
  diagnosis: string; // human-readable: "CORS allowedOrigins missing https://..."
}

function verifyDashboardChain(sandboxName: string): ChainStatus;
Used by nemoclaw status, onboard.ts (verify before printing success), and the dashboard UI.
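The sketch below shows one way the per-link results could be folded into a ChainStatus. It covers only the pure aggregation step, not the probes themselves. The helper names isGatewayAlive and summarizeChain, the CHAIN_ORDER constant, and the choice to report the first broken link in chain order are assumptions for illustration.

```typescript
// Hypothetical aggregation step for verifyDashboardChain(); probing is out of scope here.
interface LinkStatus {
  ok: boolean;
  detail: string;
}

type LinkName = "gateway" | "forward" | "cors" | "external";

interface ChainStatus {
  healthy: boolean;
  links: Record<LinkName, LinkStatus>;
  diagnosis: string;
}

// Links are listed in delivery order (Link 1 through Link 4).
const CHAIN_ORDER: LinkName[] = ["gateway", "forward", "cors", "external"];

// The contract treats 200 and 401 from /health as "alive":
// a 401 means the gateway answered but auth is enabled.
function isGatewayAlive(httpStatus: number): boolean {
  return httpStatus === 200 || httpStatus === 401;
}

// Report the first broken link in chain order, since a failure
// upstream usually explains failures further down the chain.
function summarizeChain(links: Record<LinkName, LinkStatus>): ChainStatus {
  for (const name of CHAIN_ORDER) {
    if (!links[name].ok) {
      return { healthy: false, links, diagnosis: `${name}: ${links[name].detail}` };
    }
  }
  return { healthy: true, links, diagnosis: "all links healthy" };
}
```

Keeping aggregation separate from probing means the diagnosis logic can be unit-tested without a running sandbox.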
3. dashboard-recover.ts — Layered Recovery (~100 lines)
function recoverDashboardChain(sandboxName: string): RecoverResult;
Idempotent, link-aware. Only fixes broken links:
- Link 1 down → restart gateway inside sandbox
- Link 2 down → re-establish openshell forward
- Link 3 wrong → patch CORS with detected accessUrl
- Link 4 down → diagnose and report (outside our control)
Used by nemoclaw recover (PR #2050), nemoclaw connect, and nemoclaw-start.sh on boot.
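The link-aware dispatch described above can be sketched as follows. The proposal does not define RecoverResult, so the shape used here (repaired and reported lists) is a guess. The RecoveryActions callbacks and the recoverLinks name are also hypothetical, standing in for whatever actually restarts the gateway, re-runs the forward, or patches CORS.

```typescript
// Hypothetical dispatch step for recoverDashboardChain(); the real actions are injected.
type LinkName = "gateway" | "forward" | "cors" | "external";

// Assumed result shape: the proposal names RecoverResult but does not define it.
interface RecoverResult {
  repaired: LinkName[]; // links a repair action was run for
  reported: LinkName[]; // links that can only be diagnosed, not fixed
}

interface RecoveryActions {
  restartGateway(): void;
  reestablishForward(): void;
  patchCors(): void;
}

// Runs a repair only for links that are not ok, in chain order.
// A fully healthy chain triggers no actions, so repeated calls are harmless.
function recoverLinks(ok: Record<LinkName, boolean>, actions: RecoveryActions): RecoverResult {
  const result: RecoverResult = { repaired: [], reported: [] };
  if (!ok.gateway) {
    actions.restartGateway();
    result.repaired.push("gateway");
  }
  if (!ok.forward) {
    actions.reestablishForward();
    result.repaired.push("forward");
  }
  if (!ok.cors) {
    actions.patchCors();
    result.repaired.push("cors");
  }
  if (!ok.external) {
    // External routing is outside our control: report it, do not touch it.
    result.reported.push("external");
  }
  return result;
}
```

In a full implementation the ok map would presumably come from verifyDashboardChain(), and the chain would be re-verified after the repairs run.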
Files Changed
| File | Change | Lines |
|---|---|---|
| src/lib/dashboard-contract.ts | New | ~80 |
| src/lib/dashboard-health.ts | New | ~120 |
| src/lib/dashboard-recover.ts | New | ~100 |
| src/lib/dashboard.ts | Refactor to use contract | net reduction |
| src/lib/onboard.ts | Replace scattered CHAT_UI_URL derivation with buildChain() | net reduction |
| scripts/nemoclaw-start.sh | Call verify/recover on boot | ~10 lines added |
3 new files (~300 lines), 3 modified files (net code reduction in onboard.ts).
Open Issues This Would Address
Once landed, the following open issues can be resolved or significantly simplified:
Non-Goals
Sequencing
This refactor touches onboard.ts and nemoclaw-start.sh, which currently have 9 open PRs against them. However, none of those PRs touch the dashboard delivery chain code (CORS, ensureDashboardForward, health probes, CHAT_UI_URL derivation). The contract can land independently or after the current wave of PRs clears.
Recommended approach: land #2342 (Brev Launchable point fix) first as a quick win, then follow with this refactor so the pattern doesn't repeat.