Skip to content

refactor(cli): add onboard FSM transition types#3848

Merged
jyaunches merged 4 commits into
mainfrom
refactor/3802-1/types
May 20, 2026
Merged

refactor(cli): add onboard FSM transition types#3848
jyaunches merged 4 commits into
mainfrom
refactor/3802-1/types

Conversation

@cv

@cv cv commented May 20, 2026

Copy link
Copy Markdown
Collaborator

Summary

Add the first observable onboarding FSM shell primitives: coarse machine state names, event vocabulary, transition metadata, and validation helpers. This is the no-behavior-change foundation for the #3802 onboarding FSM workstream.

Related Issue

Refs #3802

Changes

  • Added src/lib/onboard/machine/types.ts with onboarding machine states, event types, transition/context types, and terminal-state vocabulary.
  • Added src/lib/onboard/machine/transitions.ts with the canonical transition graph, failure transitions from non-terminal states, lookup helpers, and invalid-transition errors.
  • Added src/lib/onboard/machine/transitions.test.ts covering vocabulary, transition ordering, retry/branch/failure classification, terminal states, and duplicate edge protection.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • New Features

    • Improved onboarding flow with explicit transition rules, deterministic next-state ordering, and runtime validation to prevent invalid moves.
  • Bug Fixes

    • Terminal states are enforced and failure transitions are consistently handled to avoid unexpected progression.
  • Tests

    • Added comprehensive tests validating onboarding transition invariants and error handling.
  • Chores

    • Added type-safe definitions and utilities to support onboarding state management.

Review Change Stack

@cv cv self-assigned this May 20, 2026
@coderabbitai

coderabbitai Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9bcaacf2-136b-4459-906d-717b6fd382f7

📥 Commits

Reviewing files that changed from the base of the PR and between 9ee6cfa and a3c9eb2.

📒 Files selected for processing (2)
  • src/lib/onboard/machine/transitions.ts
  • src/lib/onboard/machine/types.ts

📝 Walkthrough

Walkthrough

Adds an onboarding finite-state machine: typed state/event vocabularies, explicit and failure transitions combined into a transition graph, navigation and validation helpers (including a custom invalid-transition error), and tests validating graph invariants.

Changes

Onboarding FSM Implementation

Layer / File(s) Summary
FSM type vocabulary
src/lib/onboard/machine/types.ts
Defines state and event constant arrays, derives union types for terminal/non-terminal partitions, and specifies transition shape (from, to, kind) and a redacted OnboardMachineContext.
Transition sets and combined graph
src/lib/onboard/machine/transitions.ts (lines 1–35)
Introduces explicit direct transitions, auto-generates failure transitions from every non-terminal state to failed, and combines them into a readonly ONBOARD_MACHINE_TRANSITIONS list.
Next-states map
src/lib/onboard/machine/transitions.ts (lines 36–46)
Computes ONBOARD_MACHINE_NEXT_STATES mapping each OnboardMachineState to its allowed successors derived from the combined transition set.
Invalid transition error
src/lib/onboard/machine/transitions.ts (lines 48–58)
Adds InvalidOnboardMachineTransitionError storing from and to and setting a formatted error message.
State guards
src/lib/onboard/machine/transitions.ts (lines 60–68)
Adds runtime type guards: isOnboardMachineState and isTerminalOnboardMachineState.
Navigation helpers
src/lib/onboard/machine/transitions.ts (lines 70–81)
Implements getNextOnboardMachineStates and canTransitionOnboardMachineState for successor lookup and membership checks.
Transition lookup & validation
src/lib/onboard/machine/transitions.ts (lines 83–103)
Implements getOnboardMachineTransition (nullable lookup) and assertValidOnboardMachineTransition which throws InvalidOnboardMachineTransitionError for invalid {from,to} pairs.
FSM transition tests
src/lib/onboard/machine/transitions.test.ts
Vitest suite asserting state/event vocabularies, membership guards, failure/terminal semantics, deterministic next-state ordering, transition kind classification, invalid-transition errors, consistency between transition lists and maps, and no duplicate edges.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

NemoClaw CLI, refactor

Suggested reviewers

  • cjagwani
  • jyaunches
  • prekshivyas
  • ericksoa

Poem

🐰 I hopped through states with tidy hops and bounds,
Mapped every path where failure might be found,
Guards and errors snug in code so neat,
Tests that check each step and every repeat,
A rabbit's stamp of approval—onboard and sound.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding foundational FSM transition types for the onboarding state machine, which is the primary objective of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/3802-1/types

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: cloud-onboard-e2e, onboard-resume-e2e
Optional E2E: onboard-negative-paths-e2e, onboard-repair-e2e, double-onboard-e2e

Dispatch hint: cloud-onboard-e2e,onboard-resume-e2e

Auto-dispatched E2E: cloud-onboard-e2e, onboard-resume-e2e via nightly-e2e.yaml at 70c19d8f3372a75a21a138909ea137524f0a451enightly run

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required E2E

  • cloud-onboard-e2e (high): Best existing happy-path coverage for end-to-end onboarding through gateway, inference.local, sandbox/OpenClaw setup, policy presets, credential leak checks, and post-onboard verification. The changed FSM vocabulary includes these major onboarding boundaries.
  • onboard-resume-e2e (high): Exercises interrupted onboarding, failed session state, resume from a partial run, skipped cached phases, and final completion. This directly matches the added failure, terminal, retry/resume-related vocabulary and transition semantics.

Optional E2E

  • onboard-negative-paths-e2e (high): Useful adjacent confidence for invalid/failure paths because the new FSM allows every non-terminal state to transition to failed, but this is broader than the immediate happy/resume coverage and can be optional unless the FSM is already wired into runtime control flow.
  • onboard-repair-e2e (high): Optional coverage for repair-oriented onboarding behavior referenced by the added event vocabulary (state.repair.*). Recommend if reviewers expect this FSM to become the source of truth for repair transitions.
  • double-onboard-e2e (high): Optional lifecycle/re-onboard confidence for terminal/completed state handling and repeat onboarding behavior, but less directly targeted than resume.

New E2E recommendations

  • onboarding-state-machine-events (medium): Existing E2E validates legacy session step status, but there does not appear to be a dedicated E2E asserting emitted FSM event ordering (state.entered, state.completed, onboard.failed, onboard.completed) against a real onboard run once this machine is wired into runtime execution.
    • Suggested test: Add a lightweight scenario or shell E2E that runs a forced-failure onboard and a successful resume, captures onboarding machine events, and asserts legal transition ordering with no raw secrets in event/context payloads.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,onboard-resume-e2e

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26137466357
Target ref: 450716bd6dccbb07cc908fd4a61ec48de3f365b4
Workflow ref: main
Requested jobs: onboard-resume-e2e,onboard-negative-paths-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
onboard-negative-paths-e2e ✅ success
onboard-resume-e2e ✅ success

@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Recommendation: info only
Confidence: low
Analyzed HEAD: 70c19d8f3372a75a21a138909ea137524f0a451e
Findings: 0 blocker(s), 1 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Advisor execution failed: Could not configure advisor model openai/openai/gpt-5.5

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 70c19d8f3372a75a21a138909ea137524f0a451e
Recommendation: info only
Confidence: low

PR review advisor failed: Could not configure advisor model openai/openai/gpt-5.5

Gate status

  • CI: pending — 6 status context(s) appear pending.
  • Mergeability: fail — mergeStateStatus=BLOCKED
  • Review threads: unknown — No review thread state was available.
  • Risky code tested: pass — No risky code areas detected by path heuristics.

🔴 Blockers

  • None.

🟡 Warnings

  • PR review advisor unavailable: The automated advisor could not complete: Could not configure advisor model openai/openai/gpt-5.5
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not configure advisor model openai/openai/gpt-5.5

🔵 Suggestions

  • None.

Acceptance coverage

  • No linked acceptance clauses were analyzed.

Security review

  • warning — Secrets and Credentials: Advisor unavailable; human review required.
  • warning — Input Validation and Data Sanitization: Advisor unavailable; human review required.
  • warning — Authentication and Authorization: Advisor unavailable; human review required.
  • warning — Dependencies and Third-Party Libraries: Advisor unavailable; human review required.
  • warning — Error Handling and Logging: Advisor unavailable; human review required.
  • warning — Cryptography and Data Protection: Advisor unavailable; human review required.
  • warning — Configuration and Security Headers: Advisor unavailable; human review required.
  • warning — Security Testing: Advisor unavailable; human review required.
  • warning — Holistic Security Posture: Advisor unavailable; human review required.

Test / E2E status

  • Test depth: mocks_recommended — Changed code has I/O, state, credentials, provider, or config behavior that should be covered with behavioral mocks: src/lib/onboard/machine/transitions.ts, src/lib/onboard/machine/types.ts.
  • E2E Advisor: not_found (not found)

✅ What looks good

  • No positives were identified by the advisor.

Review completeness

  • Advisor execution failed: Could not configure advisor model openai/openai/gpt-5.5
  • Human maintainer review required: yes

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26170203727
Target ref: 70c19d8f3372a75a21a138909ea137524f0a451e
Workflow ref: main
Requested jobs: cloud-onboard-e2e,onboard-resume-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
onboard-resume-e2e ✅ success

@wscurran wscurran added NemoClaw CLI refactor PR restructures code without intended behavior change labels May 20, 2026
@wscurran

Copy link
Copy Markdown
Contributor

Related open issues:

@jyaunches jyaunches merged commit 8bdba46 into main May 20, 2026
25 checks passed
cv added a commit that referenced this pull request May 21, 2026
## Summary
Emit structured, redacted onboarding machine events from the existing
session step mutation helpers. This stacks on #3848 and keeps event
delivery process-local/non-persistent as the PR 2 foundation for the
#3802 FSM workstream.

## Related Issue
Refs #3802
Stacked on #3848

## Changes
- Added `src/lib/onboard/machine/events.ts` with structured event
creation, redacted context building, session-step-to-machine-state
mapping, and process-local listeners.
- Augmented `markStepStarted`, `markStepComplete`, `markStepSkipped`,
`markStepFailed`, and `completeSession` to emit `state.*`,
`context.updated`, and onboarding terminal events after persisted
session mutations.
- Added session tests that verify event ordering, redaction, observer
failure isolation, unknown-step behavior, and that full event logs are
not persisted by default.

## Type of Change
- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [x] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Carlos Villela <cvillela@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Onboarding now emits structured events for step/state transitions and
context updates, with sanitized metadata and origin parsing.
* **Bug Fixes**
* Listener errors are isolated so they cannot disrupt onboarding
progression.
* Emissions guarded to avoid duplicate or spurious events for unknown or
no-op transitions; events emit only on real state changes.
* **Tests**
* Added tests validating event ordering, redaction of sensitive data,
persistence excludes events, and observer-failure isolation.
* **Chores**
* Updated test harness command invocation argument order for sandboxed
checks.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3849?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
@cv cv deleted the refactor/3802-1/types branch May 27, 2026 21:17
@wscurran wscurran added area: cli Command line interface, flags, terminal UX, or output and removed NemoClaw CLI labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: cli Command line interface, flags, terminal UX, or output refactor PR restructures code without intended behavior change v0.0.47 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants