Skip to content

refactor(cli): migrate to oclif + TypeScript #909

@prekshivyas

Description

@prekshivyas

Summary

Migrate the NemoClaw CLI from the current hand-rolled switch/case dispatcher (bin/nemoclaw.js, 515 lines) to oclif (Open CLI Framework) with TypeScript. This is a foundational cleanup that addresses systemic CLI issues — inconsistent help, missing validations, no completions, docs drift, and poor error handling — in one structural change rather than piecemeal fixes.

This wouldn't take precedence over security work, but it should be high priority.@cv

Motivation

The current CLI has accumulated enough problems that fixing them individually is less efficient than fixing the architecture:

oclif solves most of these structurally: auto-generated help, built-in flag validation, shell completions via plugin, one-file-per-command architecture, and TypeScript support.

Note: oclif's @oclif/plugin-autocomplete supports bash, zsh, and PowerShell — not fish. Fish completion will need a lightweight custom generator or a community plugin on top of oclif.

Issues addressed (24)

Completions

Help / docs consistency

Flag / parameter validation

CLI structure / architecture

Error handling

PRs that would be superseded or simplified (17)

PR Title How migration addresses it
#895 Shell completion (bash/zsh/fish) Replaced by oclif autocomplete plugin (bash/zsh/PowerShell; fish needs custom work)
#160 Shell completion (earlier attempt) Same
#472 Sandbox export + shell completion Completion half solved by oclif
#901 Sync command reference with CLI Automated by oclif readme
#848 Fix policy CLI examples in docs Auto-generated docs stay in sync
#182 Add --json to list command oclif's enableJsonFlag + this.logJson()
#350 Fix logs subcommand routing oclif command router eliminates this class of bug
#604 Validate deploy instance names oclif args with parse
#477 Fix --type flag for brev create TypeScript catches flag name mismatches at compile time
#631 Fix defaultValue in promptOrDefault interactive mode (Fixes #360) TypeScript strict null checks flag this
#853 Improve model entry validation TypeScript interfaces enforce structure
#644 Self-update command Replaced by @oclif/plugin-update
#826 CLI metrics collection oclif hooks provide cleaner pattern
#672 Redact secrets from output oclif base class catch() for global redaction
#794 Redact secrets (alternate approach) Same
#638 Error handling for JSON.parse TypeScript + oclif error classes
#398 Error handling for gateway startup oclif catch()/finally() lifecycle

Migration plan

Phase 1 — Scaffold + simple commands (1-2 days)

  • Initialize oclif project structure with TypeScript
  • Migrate trivial commands: list, status, start, stop, setup, setup-spark, debug, uninstall
  • Verify oclif entry point works

Phase 2 — Sandbox commands + deploy (1-2 days)

  • Migrate sandbox-scoped commands under sandbox/ topic
  • Add command_not_found hook so nemoclaw <sandbox-name> <action> still works (zero UX breakage)
  • Migrate deploy
  • Replace process.exit() calls in lib modules with thrown errors (~15-20 call sites)

Phase 3 — Onboard wizard (2-3 days)

  • Migrate the ~900-line onboard wizard (most complex command)
  • Replace custom readline/promptSecret with oclif-compatible prompts
  • Verify --non-interactive mode still works

Phase 4 — Cleanup, tests, hardening (1-2 days)

  • Delete old dispatcher (bin/nemoclaw.js)
  • Rewrite CLI integration tests for oclif
  • Verify auto-generated help, completions, and --json output
  • Update README and contributing guide

Estimated total: 5-9 days of focused work, ~30-35 files touched.

Proposed structure

src/
  commands/
    onboard.ts
    deploy.ts
    start.ts / stop.ts / status.ts / list.ts
    debug.ts / uninstall.ts
    setup.ts / setup-spark.ts
    sandbox/
      connect.ts          # nemoclaw sandbox connect <name>
      status.ts           # nemoclaw sandbox status <name>
      logs.ts             # nemoclaw sandbox logs <name>
      destroy.ts          # nemoclaw sandbox destroy <name>
      policy/
        add.ts            # nemoclaw sandbox policy add <name>
        list.ts           # nemoclaw sandbox policy list <name>
  lib/                    # existing modules, converted to TS
    runner.ts
    credentials.ts
    registry.ts
    nim.ts
    policies.ts
    onboard-wizard.ts
    platform.ts
    preflight.ts
    inference-config.ts
    local-inference.ts
    resolve-openshell.ts
  hooks/
    command-not-found.ts  # backward compat for "nemoclaw <sandbox> <action>"

UX compatibility

The nemoclaw <sandbox-name> <action> syntax is preserved via a command_not_found hook — no breaking changes to existing workflows or scripts. The new nemoclaw sandbox <action> <name> syntax is also available.

Risks

  • Onboard wizard complexity — At ~900 lines with deeply nested conditional flows, interactive prompts, process spawning, and polling loops, this is where bugs are most likely during migration. Plan: keep the wizard logic in its own module and have the oclif command class be a thin wrapper.
  • Interactive prompt fidelity — The custom promptSecret in credentials.js does raw-mode terminal input with careful escape sequence handling. oclif's prompts or inquirer may behave differently. Needs careful testing on macOS and Linux.
  • Atomic cutoverpackage.json bin field can only point to one dispatcher. There's no way to run both the old and new entry points simultaneously for the same binary name. The switch happens all at once.
  • stdio: 'inherit' with oclif — Several commands (debug, uninstall, start, stop, setup-spark) delegate to bash scripts via spawnSync(..., { stdio: 'inherit' }). oclif's output handling could interfere. Test early in Phase 1.
  • Fish completion gap — oclif's autocomplete plugin does not support fish. Since feat: add shell completion for nemoclaw CLI #155 and feat(cli): add shell completion for bash, zsh, and fish #895 both explicitly request fish, this needs a custom fish completion generator or community plugin as a follow-up.

Build & tooling changes

  • TypeScript compilation — Adding a tsc build step. The project already uses tsc for type-checking via jsconfig.json, but now it becomes a required build step for the CLI to run.
  • package.json bin field — Must change from ./bin/nemoclaw.js to oclif's entry point. This is the atomic cutover point.
  • Test runner — The project uses Vitest with ESM imports. oclif scaffolds mocha/jest by default, but we will keep Vitest to avoid rewriting all existing lib module tests. Only CLI integration tests (cli.test.js) need rewriting for oclif's test patterns.
  • ESM vs CommonJS — The project is currently CommonJS, but tests use ESM via Vitest. Newer oclif versions support ESM. Decision on module system should be made in Phase 1 scaffolding.

PR disposition

Definition of done

This issue is complete when:

  • All commands are migrated to oclif command classes (Phases 1-3)
  • All lib modules are converted to TypeScript
  • Old dispatcher (bin/nemoclaw.js) is deleted
  • nemoclaw <sandbox-name> <action> backward compat works via hook
  • Auto-generated --help works for all commands
  • Shell completions work for bash and zsh via @oclif/plugin-autocomplete
  • All existing tests pass (Vitest)
  • CI pipeline updated for TypeScript build step

Fish completion and further hardening (per-command examples, stricter validations, etc.) are follow-up PRs.

Out of scope

  • Security fixes — those continue on their own track with higher priority
  • New features — this is purely structural migration + TypeScript conversion
  • The onboard wizard's business logic — it moves to an oclif command class but the flow stays the same

Impact

24 issues + 17 PRs = 41 items fully or partially addressed by this migration.

/cc @cv @rubenhagege

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: cliCommand line interface, flags, terminal UX, or output
    No fields configured for Enhancement.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions