Skip to content

Validate sandbox sessionId to prevent command injection#4

Closed
ericksoa wants to merge 1 commit into
mainfrom
vincentkoc-validate-sessionid
Closed

Validate sandbox sessionId to prevent command injection#4
ericksoa wants to merge 1 commit into
mainfrom
vincentkoc-validate-sessionid

Conversation

@ericksoa

Copy link
Copy Markdown
Contributor

Migrated from NVIDIA/openshell-openclaw-plugin#17 by @vincentkoc

Summary

Validate sessionId on the /api/sandbox-chat path before it is used to build the sandbox execution command.

Changes

  • add normalizeSessionId() and require sessionId to match ^[A-Za-z0-9_-]{1,64}$
  • reject empty message values and invalid sessionId values with 400
  • use the normalized session id for logging and for the sandbox execution path

Testing

  • node --check .jensenclaw/server.js

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
@ericksoa

Copy link
Copy Markdown
Contributor Author

Egg was removed, closing issue

@ericksoa ericksoa closed this Mar 16, 2026
jessesanford pushed a commit to jessesanford/NemoClaw that referenced this pull request Mar 24, 2026
Fix Brev CLI link and add Brev account link in README
jyaunches pushed a commit to jyaunches/NemoClaw that referenced this pull request Apr 14, 2026
- Guard runArgv/runArgvCapture against shell:true to prevent security
  bypass (finding #1) — throws if a caller attempts to re-enable shell
  interpretation. Added 2 tests.
- Document the intentional bash -c exception in getOllamaWarmupCommand
  explaining why it's safe (finding NVIDIA#2).
- Remove dead getOpenshellCommand() from policies.ts (finding NVIDIA#3).
- Remove unused shellQuote import from nim.ts (finding NVIDIA#4).
- Fix brittle indexOf assertion in onboard-readiness test (finding NVIDIA#5).
jyaunches pushed a commit that referenced this pull request Apr 20, 2026
- Remove unused getForwardList() call from getActiveSandboxSessions —
  only pgrep/ps is needed for SSH session detection (warning #1)
- Consolidate double-prompt in sandboxDestroy into single enriched
  confirmation prompt (warning #2)
- Remove noisy cleanupGatewayAfterLastSandbox forward check that would
  always fire due to dashboard forward (warning #3)
- Use word-boundary regex in parseSshProcesses to prevent false positives
  when sandbox names share prefixes (warning #4)
- Export SessionClassification as named interface (suggestion #1)
- Use cross-platform ps -axo instead of Linux-only pgrep -a for macOS
  compatibility (suggestion #2)
- Add forwardCount to SessionClassification for future consumers
- Add tests for word-boundary matching edge cases
cv pushed a commit that referenced this pull request May 7, 2026
)

## Summary

Fixes #3138 — On some deployments (e.g. Brev launchables) the sandbox
stays in `Running` phase which is functionally equivalent to `Ready`.
The CLI previously treated this as "not ready", causing false-negative
health reports, dashboard showing the agent as stopped, and recovery
logic not triggering.

## Changes

| File | Change |
|------|--------|
| `src/lib/state/gateway.ts` | `isSandboxReady()` now matches both
`Ready` and `Running` columns |
| `src/lib/actions/sandbox/gateway-state.ts` |
`ensureLiveSandboxOrExit()` no longer exits on `Running` phase |
| `test/onboard-readiness.test.ts` | +2 tests for Running phase
detection (plain text + ANSI-wrapped) |

## Testing

- `test/onboard-readiness.test.ts` — 36/36 passed
- `test/gateway-state.test.ts` — 37/37 passed

## Context

This is the CLI-side prerequisite for
[brevdev/nemoclaw-image#10](brevdev/nemoclaw-image#10),
which replaces the onboard-ui's reinvented health checks with NemoClaw
CLI calls. Originated from #2258 (sub-item #4), which was closed with
work redirected to the nemoclaw-image repo.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Sandboxes in "Running" state are now recognized as operational,
expanding support for different sandbox phase conditions.

* **Tests**
* Added test coverage for sandbox readiness detection with "Running"
state and ANSI-encoded outputs.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
jyaunches added a commit to jyaunches/NemoClaw that referenced this pull request May 11, 2026
Lands shared fixtures, helpers, assertion helpers, install-method
splits, conventions + lint, and the parity-compare CI harness that
unblock the per-wave migration phases (2\u201312).

Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1):

Fixtures (test/e2e/lib/fixtures/):
- fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation)
- fake-{telegram,discord,slack}.sh: messaging stubs via shared
  _fake-http-stub.sh harness
- older-base-image.sh: tagged ghcr base-image Dockerfile generator

Helpers (test/e2e/lib/):
- logging.sh: canonical e2e_{section,info,pass,fail} with stable
  PASS:/FAIL:/=== Phase markers (absorbs reuse category #1)
- sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting,
  exit-code propagation, and dry-run short-circuit (category NVIDIA#10)
- env.sh: auto-sources logging.sh so every consumer gets it for free

Assertion helpers (test/e2e/lib/assert/):
- inference-works.sh: chat-completion round-trip
- no-credentials-leaked.sh: credential-pattern scan
- policy-preset-applied.sh: gateway policy preset verification
- messaging-bridge-reachable.sh: L7 proxy / bridge reachability

Install dispatcher splits (test/e2e/lib/setup/):
- install-{repo,curl,ollama,launchable}.sh (four profiles)
- install.sh: dispatcher routes by profile/method name (category NVIDIA#5)

Runtime probe wiring:
- run-scenario.sh: adds --validate-only flag (probe-only, no setup)
- resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys
  with embedded underscores (e.g. security.policy_engine)

Convention lint + parity harness:
- scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite
  step scripts + requires parity-map.yaml entries for legacy scripts
- scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL
  via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4)
- test/e2e/parity-map.yaml: seeded with one entry per existing legacy
  script; migration phases 2\u201312 append assertion mappings
- .github/workflows/e2e-parity-compare.yaml: dispatches legacy script
  + migrated scenario on same runner and diffs outcomes

Tests (all passing, 41 total):
- test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E)
- test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H)
- test/e2e-expected-state-validator.test.ts: +2 tests (1.F)

No regressions: full cli Vitest project (3258 tests) still green.
cjagwani pushed a commit that referenced this pull request May 14, 2026
Resolve the two output threads in #3456 left after the core dead-loop fix
landed via #3459 + #3434:

Sub-bug #3 — `src/lib/onboard.ts` printed
  `nemoclaw <name> destroy --yes && nemoclaw onboard --gpu`
with a literal `<name>` placeholder, and assumed at least one sandbox
was registered. When the GPU-passthrough mismatch hit on the State B
re-run path with an empty registry (the dead-loop case), the hint was
not actionable. Replace with a registry-aware helper at
`src/lib/onboard/gpu-recovery.ts` that renders the right shape:
  - empty registry → suggest `nemoclaw uninstall && nemoclaw onboard --gpu`
  - one sandbox → suggest destroy --yes --cleanup-gateway for that name
  - multiple sandboxes → list each, only the last gets --cleanup-gateway

Sub-bug #4 — `src/lib/actions/uninstall/run-plan.ts` printed
  `Destroyed gateway 'nemoclaw' skipped`
when the openshell destroy no-op'd (gateway already gone) — the
"Destroyed … skipped" wording was self-contradictory. Extend
`runOptional` with an `onSkip` option; route the gateway destroy to
emit `Gateway 'nemoclaw' already removed or unreachable` on no-op.

Tests:
- `src/lib/onboard/gpu-recovery.test.ts` (6 tests): forbid literal
  `<name>` placeholder anywhere in the output; cover empty / single /
  multi-sandbox cases; defensive filter on whitespace names so a
  `nemoclaw  destroy` rendering can never happen.
- `src/lib/actions/uninstall/run-plan.test.ts`: assert the new
  "already removed or unreachable" wording and the absence of the
  "Destroyed gateway 'nemoclaw' skipped" string.

The core dead loop itself (sub-bugs #1, #2 and State B GPU mismatch)
is already addressed by #3459 + #3434 + #3483; #3456 will close once
this lands. See the #3456 status comment for the full mapping.

Refs #3456. Mirrors (and tightens) the approach in the closed PR #3464,
which left the literal `<name>` placeholder in tests per CodeRabbit
feedback that was never addressed.

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
cv added a commit that referenced this pull request May 14, 2026
…3520)

> **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will
mark ready once batch self-review + CI complete.

## Summary

Closes the two remaining output threads in #3456 after the core
dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full
sub-bug mapping in the [#3456 status
comment](#3456 (comment)).

- **Sub-bug #3** — `nemoclaw <name> destroy --yes` recovery hint
replaced with a registry-aware helper.
- **Sub-bug #4** — `Destroyed gateway 'nemoclaw' skipped`
self-contradictory wording replaced with `Gateway 'nemoclaw' already
removed or unreachable`.

## Acceptance criteria mapping

| Sub-bug | Resolution | Evidence |
|---|---|---|
| #1 dead loop | Already fixed on main (#3459) | out of scope |
| #2 firewall diagnostic | Already fixed on main (#3459) | out of scope
|
| **#3** literal `<name>` placeholder | **This PR** |
`src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` |
| **#4** misleading "skipped" wording | **This PR** |
`src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` |
| #5 uninstall residuals | Already fixed on main (#3483) | out of scope
|

## Behavior matrix

`gpuPassthroughRecoveryLines(names)`:

| Input | Suggestion |
|---|---|
| `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` |
| one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway &&
nemoclaw onboard --gpu` |
| many sandboxes | each `destroy --yes`, only the last gets
`--cleanup-gateway` |

## Test plan

```
npm run typecheck:cli
npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts
```

22 tests pass (6 new + 16 existing).

## Notes for reviewers

- This is the work [#3464
attempted](#3464); that PR was
closed without merging after CodeRabbit asked for the `<name>`
placeholder to be forbidden in tests via negative assertion. This PR
adopts that refinement.
- `runOptional` extension is backwards-compatible — existing callers
without `onSkip` get the original wording.

Closes #3456 once merged.

---------

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
jyaunches added a commit that referenced this pull request May 28, 2026
…advisor #4)

Advisor finding: README and scenario model define
'setup scenario -> expected state -> suite sequence', but the live TS
path skipped expected-state validation entirely. The legacy bash
runner gated this behind E2E_VALIDATE_EXPECTED_STATE=1 and used
env-var probe overrides only; the actual environment-shape contract
ran inline as e2e_gateway_assert_healthy / e2e_sandbox_assert_running
between onboarding and suite execution. The TS runner did neither.

Promote the inline preconditions to a first-class typed phase, in
the spirit of EnvironmentOrchestrator/OnboardingOrchestrator: real
probes, real clients, framework-owned timeouts and redaction, single
source of truth in the typed registry.

- types.ts: PhaseName extends to 'state-validation'. New
  ExpectedState typed shape (cli/gateway/sandbox/inference/credentials
  with present|absent|optional). New StateProbeId union. ExpectedFailurePhase
  intentionally excludes state-validation so scenarios cannot declare
  expected failures against an internal phase.

- scenarios/expected-states.ts (new): typed mirror of
  nemoclaw_scenarios/expected-states.yaml. Source of truth for the TS
  runner during transition; YAML stays in place for the legacy
  resolver until that path is fully retired. probesForState() maps
  the typed contract to the concrete probe ids the orchestrator emits.
  Inference and credentials remain declared but emit no probe actions
  yet (probe scripts not implemented); a registry test pins this gap
  so a future probe-script PR is forced to update the mapping.

- nemoclaw_scenarios/probes/ (new):
  - dispatch.sh exports e2e_state_probe <id>; the typed runner spawns
    it via the shared dispatch-action.sh launcher.
  - cli-installed.sh: command -v nemoclaw + executable check.
  - gateway-healthy.sh: defers to validation_suites/assert/gateway-alive.sh
    so there's one implementation of the gateway-health contract.
  - sandbox-running.sh: defers to validation_suites/assert/sandbox-alive.sh.
  - gateway-absent.sh: nemoclaw gateway status + URL reachability.
    Typed replacement for the run-scenario.sh inline forbidden-effect
    check on the gateway axis.
  - sandbox-absent.sh: nemoclaw list + openshell sandbox list. Typed
    replacement for the inline 'openshell sandbox list | grep -Fq'.

- compiler.ts: state-validation phase actions emitted from
  scenario.expectedStateId via probesForState(). Hard error on
  unknown expected_state id (typed runner is stricter than legacy
  resolver). Phase order is environment -> onboarding ->
  state-validation -> runtime.

- orchestrators/state-validation.ts (new): tiny subclass of
  PhaseOrchestrator. No new control flow; probe actions reuse the
  existing shell-fn action machinery (timeouts, redaction, evidence
  logs).

- orchestrators/runner.ts: phase-blocking semantics get one rule
  refinement. environment failure blocks onboarding, state-validation,
  and runtime. onboarding failure does NOT block state-validation -
  negative scenarios deliberately fail onboarding and rely on
  absent-state probes running afterward to verify forbidden side
  effects did not occur. state-validation failure blocks runtime so
  suites never run against a missing/wedged environment.

- orchestrators/negative-matcher.ts: state-validation forbidden-effect
  probe ids (gateway-absent, sandbox-absent) excluded from observed-failure
  scanning. They are post-failure verification, not the failure mode
  itself; their pass/fail status is reported separately and feeds the
  phase chain through normal action-failure semantics.

- 16 new tests in e2e-expected-state.test.ts cover: typed registry
  mirrors YAML structurally, probesForState mapping for ready/absent/
  optional states (inference/credentials gap pinned), compiler emits
  the right probe actions per scenario, phase order, hard error on
  unknown state id, and the three runner short-circuit cases
  (env failure blocks state-validation, onboarding failure does not,
  state-validation failure blocks runtime). Existing tests updated
  for the new phase ordering (e2e-phase-orchestrators,
  e2e-negative-matcher).

Spec ownership boundaries kept honest:
- Typed contract over env-var probes: replaces E2E_PROBE_OVERRIDE_*
  with structurally-typed expected-state declarations.
- One mode, no opt-out: state-validation always runs after onboarding
  for any scenario with an expectedStateId.
- Framework infra, not shell: orchestrator + clients + redaction
  reused; only the probe scripts are bash, same shape as install/
  onboard dispatchers.
- Single source of truth: scenario.expectedStateId -> typed registry
  -> emitted probe actions; one declaration drives everything.
- Failures stay visibly real: probes that aren't implemented yet
  (inference, credentials) stay declared but inert with a registry
  test pinning the gap. Their first execution lands when the probe
  ships, not as silent green.

Out of scope (deferred follow-ups):
- inference-available and credentials-present probe scripts.
- Per-suite requires_state gating at the assertion level.
- Retiring expected-states.yaml and runtime/resolver/validator.ts;
  the typed registry is now the SoT for the TS runner, but both
  artifacts remain alongside until the legacy resolver is unused.
- Replacing the runtime.expected-failure.no-side-effects required
  pending step. State-validation absent probes now do the real work,
  but the placeholder stays put until a follow-up confirms the
  switchover and removes it.

418 framework tests pass (16 new). tsc clean. Plan output verified
on positive (ubuntu-repo-cloud-openclaw) and negative
(ubuntu-no-docker-preflight-negative) scenarios.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
jyaunches added a commit that referenced this pull request May 28, 2026
…h (PR #4380)

Closes the architectural loose end from advisor #4: the typed
state-validation phase landed in ff4b0a8 with a parity test pinning
scenarios/expected-states.ts to nemoclaw_scenarios/expected-states.yaml.
Two sources of truth tied by a brittle parity test is exactly the
'localized fix without source removal' the advisor warns about. Now
that nothing live consumes the YAML (e2e-scenarios.yaml workflow goes
through scenarios/run.ts; the bash run-scenario.sh is a stub), retire
the whole legacy resolver path.

Deleted (2,236 lines of source + test):
- nemoclaw_scenarios/expected-states.yaml
- runtime/resolver/ (load.ts, plan.ts, schema.ts, coverage.ts, index.ts,
  validator.ts, expected-failure.ts, js-yaml.d.ts)
- runtime/coverage-report.sh
- 6 framework tests that exercised only the resolver path:
  e2e-scenario-schema, e2e-scenario-resolver, e2e-coverage-report,
  e2e-scenario-additional-families, e2e-metadata-final-hygiene,
  e2e-expected-failure (its typed equivalent now lives in
  e2e-negative-matcher.test.ts).

Trimmed:
- e2e-migration-inventory-lock.test.ts: drop the
  expected-states.yaml-vs-typed-registry test; expected-state migration
  is complete and parity is now intra-typed.
- e2e-expected-state.test.ts: drop the YAML-mirror parity tests;
  replace with id-coverage assertions against listExpectedStates().
- types.ts / expected-states.ts / compiler.ts comments: state plainly
  that the typed registry is the single source of truth.

Updated:
- docs/README.md: typed scenarios/ tree is now the documented source
  of truth; scenarios.yaml + suites.yaml called out as historical
  references during cleanup.

Out of scope (deferred):
- nemoclaw_scenarios/scenarios.yaml and validation_suites/suites.yaml
  remain on disk. With the resolver gone, both are now non-runtime
  reference docs. Their final removal needs a follow-up because
  e2e-assertion-modules.test.ts and e2e-migration-inventory-lock.test.ts
  still cross-check against suites.yaml, and e2e-scenario-advisor.test.ts
  references the path string. Removing them is mechanical but its own
  blast radius.

348 framework tests pass after deletion (was 418 \u2014 70 tests gone with
the 5 deleted resolver test files; 16 expected-state tests trimmed to
3 since the YAML is gone). tsc clean.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
cjagwani added a commit that referenced this pull request Jun 3, 2026
…m anchor

Addresses @cv's review ask #4 on PR #4175 — runtime validation through the
Hermes gateway first-message path — by adding an integration-style test
that drives the actual `_pre_llm_call` → `AIAgent._strip_think_blocks`
chain Hermes uses, rather than only exercising the normalizer in isolation.

The new test:
- Stubs `run_agent.AIAgent._strip_think_blocks` as a real method.
- Calls `_pre_llm_call(platform="telegram", is_first_turn=True)` to simulate
  the first Telegram turn arriving via the gateway hook — this both sets
  the platform anchor and installs the patch via the same code path Hermes
  uses at runtime.
- Calls `AIAgent._strip_think_blocks(...)` twice to assert (a) same-platform
  body is extracted, (b) cross-platform `to slack: ...` is preserved.
- Calls `_pre_llm_call(platform="discord", is_first_turn=True)` to simulate
  a turn on a different platform and asserts the anchor refreshes per-turn
  (Discord body extracted, telegram target preserved).

Combined with the existing isolated-unit cases and the cited
`hermes-e2e` / `hermes-discord-e2e` / `hermes-slack-e2e` scenarios, this
gives both "added" and "cited" runtime validation for the gateway
first-message path.

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
Co-authored-by: Chengjie Wang <chengjiew@nvidia.com>
@cv cv mentioned this pull request Jun 5, 2026
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature PR adds or expands user-visible functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants