Skip to content

refactor(cli): typed command registry to fix docs-drift and re-enable cloud-experimental-e2e #2388

@jyaunches

Description

@jyaunches

Description

Problem Statement

The cloud-experimental-e2e nightly job has been disabled since PR #2367 (via vars.CLOUD_EXPERIMENTAL_E2E_ENABLED) and has never passed in the nightly pipeline. It was previously blocked by two issues:

  1. Landlock /sandbox writability regressionnow resolved. OpenShell fix: remove gcc and netcat from sandbox image (#807, #808) #810 (two-phase Landlock fix) merged April 13, and NemoClaw issue [All Platform][Security]OpenShell 0.0.26 does not enforce Landlock filesystem policy — /sandbox writable on all platforms #1739 was closed April 21. The blueprint already pins min_openshell_version: "0.0.32" and install-openshell.sh pins MIN_VERSION="0.0.32", both above the fix.

  2. CLI/docs command reference drift (Phase 5f)still broken. check-docs.sh compares nemoclaw --help output against ### \nemoclaw …`headings indocs/reference/commands.md. The check finds 36 commands in --help` but only 31 headings in the docs. The mismatch comes from two sources:

    • Parser brittleness: the perl regex that normalizes --help lines fails to strip inline description text when commands use single-space separators (e.g., nemoclaw <name> channels remove <channel> Clear credentials and rebuild leaks the description)
    • Real drift: commands like nemoclaw onboard --from, nemoclaw setup, nemoclaw start, nemoclaw stop were added to help() without corresponding ### headings in commands.md

There is an existing issue (#2333) and open PR (#2332) that patches both the parser and the docs. But patching is not the right long-term fix — the same problem will recur every time someone adds or modifies a CLI command.

Root Cause

Commands are defined in three independent, untyped places with no shared source of truth:

  1. help() function in src/nemoclaw.ts (~line 2929) — hand-written console.log() strings with ANSI formatting
  2. Dispatch switch in src/nemoclaw.ts (~line 3026) — the actual command routing logic
  3. docs/reference/commands.md — hand-written Markdown headings

Every time someone adds a command, they must update all three by hand. check-docs.sh tries to catch drift after the fact by parsing the --help text output, but the parser itself is fragile and contributes false mismatches.

Proposed Design

Create a typed command registry as the single source of truth, then derive all three outputs from it.

Phase 1: Typed command registry

Create src/lib/command-registry.ts:

interface CommandDef {
  usage: string;           // "nemoclaw <name> snapshot create"
  description: string;     // "Create a snapshot of sandbox state"
  flags?: string;          // "[--name <label>]"
  group: CommandGroup;     // "Sandbox Management"
  deprecated?: boolean;
}

type CommandGroup =
  | "Getting Started"
  | "Sandbox Management"
  | "Skills"
  | "Policy Presets"
  | "Messaging Channels"
  | "Compatibility Commands"
  | "Services"
  | "Troubleshooting"
  | "Credentials"
  | "Backup"
  | "Upgrade"
  | "Cleanup";

export const COMMANDS: readonly CommandDef[] = [
  { usage: "nemoclaw onboard", description: "Configure inference endpoint and credentials", group: "Getting Started" },
  // ...all commands
] as const;

TypeScript enforces that every entry has the required fields and valid group. Adding a command means adding one entry to this array.

Phase 2: Generate help() from registry

Rewrite the help() function to iterate COMMANDS grouped by group, applying ANSI formatting. No more hand-written command lines in the help output.

Phase 3: Validate docs from registry

Either:

  • Option A: Generate docs/reference/commands.md headings from the registry (like the existing docs-to-skills.py pattern), or
  • Option B: Simplify check-docs.sh to import the registry directly and compare against Markdown headings — no perl parsing of --help output

Option A is stronger (eliminates drift entirely), but Option B is simpler and may be more practical given the Sphinx/MyST doc pipeline.

Phase 4: Re-enable cloud-experimental-e2e

Once Phase 5f passes reliably, flip vars.CLOUD_EXPERIMENTAL_E2E_ENABLED to true. The Landlock blocker is already resolved.

Alternatives Considered

  • Patch check-docs.sh and commands.md (PR fix(docs): align CLI help text with commands reference for check-docs parity #2332): Fixes the immediate mismatch but doesn't prevent recurrence. Every future command addition risks the same drift.
  • Remove cloud-experimental-e2e entirely: Rejected — it's the only test that exercises the public installer flow, interactive onboard, skill injection + agent verification, OpenClaw TUI, Landlock enforcement, and doc validation. No other E2E test covers these.

What cloud-experimental-e2e uniquely covers

This is the most comprehensive single E2E test in the suite (2,450 lines, 12 scripts). Capabilities not tested elsewhere:

Phase Unique Coverage
Phase 3 Full public installer (curl nvidia.com/nemoclaw.sh | bash) with interactive expect-driven onboard
Phase 5 checks Landlock read-only enforcement (04-landlock-readonly.sh)
Phase 5b Live cloud chat via inference.local → gateway → NVIDIA Cloud
Phase 5c/5d Agent skills validation + skill injection + agent turn verification
Phase 5e OpenClaw TUI smoke (expect-driven connect → tui → message → Ctrl+C quit)
Phase 5f Documentation validation — the focus of this issue

Related Issues / PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: architectureArchitecture, design debt, major refactors, or maintainabilityarea: e2eEnd-to-end tests, nightly failures, or validation infrastructure

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions