Validate sandbox sessionId to prevent command injection#4
Closed
ericksoa wants to merge 1 commit into
Closed
Conversation
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
Contributor
Author
|
Egg was removed, closing issue |
jessesanford
pushed a commit
to jessesanford/NemoClaw
that referenced
this pull request
Mar 24, 2026
Fix Brev CLI link and add Brev account link in README
jyaunches
pushed a commit
to jyaunches/NemoClaw
that referenced
this pull request
Apr 14, 2026
- Guard runArgv/runArgvCapture against shell:true to prevent security bypass (finding #1) — throws if a caller attempts to re-enable shell interpretation. Added 2 tests. - Document the intentional bash -c exception in getOllamaWarmupCommand explaining why it's safe (finding NVIDIA#2). - Remove dead getOpenshellCommand() from policies.ts (finding NVIDIA#3). - Remove unused shellQuote import from nim.ts (finding NVIDIA#4). - Fix brittle indexOf assertion in onboard-readiness test (finding NVIDIA#5).
jyaunches
pushed a commit
that referenced
this pull request
Apr 20, 2026
- Remove unused getForwardList() call from getActiveSandboxSessions — only pgrep/ps is needed for SSH session detection (warning #1) - Consolidate double-prompt in sandboxDestroy into single enriched confirmation prompt (warning #2) - Remove noisy cleanupGatewayAfterLastSandbox forward check that would always fire due to dashboard forward (warning #3) - Use word-boundary regex in parseSshProcesses to prevent false positives when sandbox names share prefixes (warning #4) - Export SessionClassification as named interface (suggestion #1) - Use cross-platform ps -axo instead of Linux-only pgrep -a for macOS compatibility (suggestion #2) - Add forwardCount to SessionClassification for future consumers - Add tests for word-boundary matching edge cases
6 tasks
This was referenced Apr 28, 2026
cv
pushed a commit
that referenced
this pull request
May 7, 2026
) ## Summary Fixes #3138 — On some deployments (e.g. Brev launchables) the sandbox stays in `Running` phase which is functionally equivalent to `Ready`. The CLI previously treated this as "not ready", causing false-negative health reports, dashboard showing the agent as stopped, and recovery logic not triggering. ## Changes | File | Change | |------|--------| | `src/lib/state/gateway.ts` | `isSandboxReady()` now matches both `Ready` and `Running` columns | | `src/lib/actions/sandbox/gateway-state.ts` | `ensureLiveSandboxOrExit()` no longer exits on `Running` phase | | `test/onboard-readiness.test.ts` | +2 tests for Running phase detection (plain text + ANSI-wrapped) | ## Testing - `test/onboard-readiness.test.ts` — 36/36 passed - `test/gateway-state.test.ts` — 37/37 passed ## Context This is the CLI-side prerequisite for [brevdev/nemoclaw-image#10](brevdev/nemoclaw-image#10), which replaces the onboard-ui's reinvented health checks with NemoClaw CLI calls. Originated from #2258 (sub-item #4), which was closed with work redirected to the nemoclaw-image repo. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **New Features** * Sandboxes in "Running" state are now recognized as operational, expanding support for different sandbox phase conditions. * **Tests** * Added test coverage for sandbox readiness detection with "Running" state and ANSI-encoded outputs. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
This was referenced May 7, 2026
jyaunches
added a commit
to jyaunches/NemoClaw
that referenced
this pull request
May 11, 2026
Lands shared fixtures, helpers, assertion helpers, install-method splits, conventions + lint, and the parity-compare CI harness that unblock the per-wave migration phases (2\u201312). Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1): Fixtures (test/e2e/lib/fixtures/): - fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation) - fake-{telegram,discord,slack}.sh: messaging stubs via shared _fake-http-stub.sh harness - older-base-image.sh: tagged ghcr base-image Dockerfile generator Helpers (test/e2e/lib/): - logging.sh: canonical e2e_{section,info,pass,fail} with stable PASS:/FAIL:/=== Phase markers (absorbs reuse category #1) - sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting, exit-code propagation, and dry-run short-circuit (category NVIDIA#10) - env.sh: auto-sources logging.sh so every consumer gets it for free Assertion helpers (test/e2e/lib/assert/): - inference-works.sh: chat-completion round-trip - no-credentials-leaked.sh: credential-pattern scan - policy-preset-applied.sh: gateway policy preset verification - messaging-bridge-reachable.sh: L7 proxy / bridge reachability Install dispatcher splits (test/e2e/lib/setup/): - install-{repo,curl,ollama,launchable}.sh (four profiles) - install.sh: dispatcher routes by profile/method name (category NVIDIA#5) Runtime probe wiring: - run-scenario.sh: adds --validate-only flag (probe-only, no setup) - resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys with embedded underscores (e.g. security.policy_engine) Convention lint + parity harness: - scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite step scripts + requires parity-map.yaml entries for legacy scripts - scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4) - test/e2e/parity-map.yaml: seeded with one entry per existing legacy script; migration phases 2\u201312 append assertion mappings - .github/workflows/e2e-parity-compare.yaml: dispatches legacy script + migrated scenario on same runner and diffs outcomes Tests (all passing, 41 total): - test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E) - test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H) - test/e2e-expected-state-validator.test.ts: +2 tests (1.F) No regressions: full cli Vitest project (3258 tests) still green.
cjagwani
pushed a commit
that referenced
this pull request
May 14, 2026
Resolve the two output threads in #3456 left after the core dead-loop fix landed via #3459 + #3434: Sub-bug #3 — `src/lib/onboard.ts` printed `nemoclaw <name> destroy --yes && nemoclaw onboard --gpu` with a literal `<name>` placeholder, and assumed at least one sandbox was registered. When the GPU-passthrough mismatch hit on the State B re-run path with an empty registry (the dead-loop case), the hint was not actionable. Replace with a registry-aware helper at `src/lib/onboard/gpu-recovery.ts` that renders the right shape: - empty registry → suggest `nemoclaw uninstall && nemoclaw onboard --gpu` - one sandbox → suggest destroy --yes --cleanup-gateway for that name - multiple sandboxes → list each, only the last gets --cleanup-gateway Sub-bug #4 — `src/lib/actions/uninstall/run-plan.ts` printed `Destroyed gateway 'nemoclaw' skipped` when the openshell destroy no-op'd (gateway already gone) — the "Destroyed … skipped" wording was self-contradictory. Extend `runOptional` with an `onSkip` option; route the gateway destroy to emit `Gateway 'nemoclaw' already removed or unreachable` on no-op. Tests: - `src/lib/onboard/gpu-recovery.test.ts` (6 tests): forbid literal `<name>` placeholder anywhere in the output; cover empty / single / multi-sandbox cases; defensive filter on whitespace names so a `nemoclaw destroy` rendering can never happen. - `src/lib/actions/uninstall/run-plan.test.ts`: assert the new "already removed or unreachable" wording and the absence of the "Destroyed gateway 'nemoclaw' skipped" string. The core dead loop itself (sub-bugs #1, #2 and State B GPU mismatch) is already addressed by #3459 + #3434 + #3483; #3456 will close once this lands. See the #3456 status comment for the full mapping. Refs #3456. Mirrors (and tightens) the approach in the closed PR #3464, which left the literal `<name>` placeholder in tests per CodeRabbit feedback that was never addressed. Signed-off-by: Charan Jagwani <charjags100@gmail.com>
cv
added a commit
that referenced
this pull request
May 14, 2026
…3520) > **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will mark ready once batch self-review + CI complete. ## Summary Closes the two remaining output threads in #3456 after the core dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full sub-bug mapping in the [#3456 status comment](#3456 (comment)). - **Sub-bug #3** — `nemoclaw <name> destroy --yes` recovery hint replaced with a registry-aware helper. - **Sub-bug #4** — `Destroyed gateway 'nemoclaw' skipped` self-contradictory wording replaced with `Gateway 'nemoclaw' already removed or unreachable`. ## Acceptance criteria mapping | Sub-bug | Resolution | Evidence | |---|---|---| | #1 dead loop | Already fixed on main (#3459) | out of scope | | #2 firewall diagnostic | Already fixed on main (#3459) | out of scope | | **#3** literal `<name>` placeholder | **This PR** | `src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` | | **#4** misleading "skipped" wording | **This PR** | `src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` | | #5 uninstall residuals | Already fixed on main (#3483) | out of scope | ## Behavior matrix `gpuPassthroughRecoveryLines(names)`: | Input | Suggestion | |---|---| | `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` | | one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway && nemoclaw onboard --gpu` | | many sandboxes | each `destroy --yes`, only the last gets `--cleanup-gateway` | ## Test plan ``` npm run typecheck:cli npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts ``` 22 tests pass (6 new + 16 existing). ## Notes for reviewers - This is the work [#3464 attempted](#3464); that PR was closed without merging after CodeRabbit asked for the `<name>` placeholder to be forbidden in tests via negative assertion. This PR adopts that refinement. - `runOptional` extension is backwards-compatible — existing callers without `onSkip` get the original wording. Closes #3456 once merged. --------- Signed-off-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com>
3 tasks
jyaunches
added a commit
that referenced
this pull request
May 28, 2026
…advisor #4) Advisor finding: README and scenario model define 'setup scenario -> expected state -> suite sequence', but the live TS path skipped expected-state validation entirely. The legacy bash runner gated this behind E2E_VALIDATE_EXPECTED_STATE=1 and used env-var probe overrides only; the actual environment-shape contract ran inline as e2e_gateway_assert_healthy / e2e_sandbox_assert_running between onboarding and suite execution. The TS runner did neither. Promote the inline preconditions to a first-class typed phase, in the spirit of EnvironmentOrchestrator/OnboardingOrchestrator: real probes, real clients, framework-owned timeouts and redaction, single source of truth in the typed registry. - types.ts: PhaseName extends to 'state-validation'. New ExpectedState typed shape (cli/gateway/sandbox/inference/credentials with present|absent|optional). New StateProbeId union. ExpectedFailurePhase intentionally excludes state-validation so scenarios cannot declare expected failures against an internal phase. - scenarios/expected-states.ts (new): typed mirror of nemoclaw_scenarios/expected-states.yaml. Source of truth for the TS runner during transition; YAML stays in place for the legacy resolver until that path is fully retired. probesForState() maps the typed contract to the concrete probe ids the orchestrator emits. Inference and credentials remain declared but emit no probe actions yet (probe scripts not implemented); a registry test pins this gap so a future probe-script PR is forced to update the mapping. - nemoclaw_scenarios/probes/ (new): - dispatch.sh exports e2e_state_probe <id>; the typed runner spawns it via the shared dispatch-action.sh launcher. - cli-installed.sh: command -v nemoclaw + executable check. - gateway-healthy.sh: defers to validation_suites/assert/gateway-alive.sh so there's one implementation of the gateway-health contract. - sandbox-running.sh: defers to validation_suites/assert/sandbox-alive.sh. - gateway-absent.sh: nemoclaw gateway status + URL reachability. Typed replacement for the run-scenario.sh inline forbidden-effect check on the gateway axis. - sandbox-absent.sh: nemoclaw list + openshell sandbox list. Typed replacement for the inline 'openshell sandbox list | grep -Fq'. - compiler.ts: state-validation phase actions emitted from scenario.expectedStateId via probesForState(). Hard error on unknown expected_state id (typed runner is stricter than legacy resolver). Phase order is environment -> onboarding -> state-validation -> runtime. - orchestrators/state-validation.ts (new): tiny subclass of PhaseOrchestrator. No new control flow; probe actions reuse the existing shell-fn action machinery (timeouts, redaction, evidence logs). - orchestrators/runner.ts: phase-blocking semantics get one rule refinement. environment failure blocks onboarding, state-validation, and runtime. onboarding failure does NOT block state-validation - negative scenarios deliberately fail onboarding and rely on absent-state probes running afterward to verify forbidden side effects did not occur. state-validation failure blocks runtime so suites never run against a missing/wedged environment. - orchestrators/negative-matcher.ts: state-validation forbidden-effect probe ids (gateway-absent, sandbox-absent) excluded from observed-failure scanning. They are post-failure verification, not the failure mode itself; their pass/fail status is reported separately and feeds the phase chain through normal action-failure semantics. - 16 new tests in e2e-expected-state.test.ts cover: typed registry mirrors YAML structurally, probesForState mapping for ready/absent/ optional states (inference/credentials gap pinned), compiler emits the right probe actions per scenario, phase order, hard error on unknown state id, and the three runner short-circuit cases (env failure blocks state-validation, onboarding failure does not, state-validation failure blocks runtime). Existing tests updated for the new phase ordering (e2e-phase-orchestrators, e2e-negative-matcher). Spec ownership boundaries kept honest: - Typed contract over env-var probes: replaces E2E_PROBE_OVERRIDE_* with structurally-typed expected-state declarations. - One mode, no opt-out: state-validation always runs after onboarding for any scenario with an expectedStateId. - Framework infra, not shell: orchestrator + clients + redaction reused; only the probe scripts are bash, same shape as install/ onboard dispatchers. - Single source of truth: scenario.expectedStateId -> typed registry -> emitted probe actions; one declaration drives everything. - Failures stay visibly real: probes that aren't implemented yet (inference, credentials) stay declared but inert with a registry test pinning the gap. Their first execution lands when the probe ships, not as silent green. Out of scope (deferred follow-ups): - inference-available and credentials-present probe scripts. - Per-suite requires_state gating at the assertion level. - Retiring expected-states.yaml and runtime/resolver/validator.ts; the typed registry is now the SoT for the TS runner, but both artifacts remain alongside until the legacy resolver is unused. - Replacing the runtime.expected-failure.no-side-effects required pending step. State-validation absent probes now do the real work, but the placeholder stays put until a follow-up confirms the switchover and removes it. 418 framework tests pass (16 new). tsc clean. Plan output verified on positive (ubuntu-repo-cloud-openclaw) and negative (ubuntu-no-docker-preflight-negative) scenarios. Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
jyaunches
added a commit
that referenced
this pull request
May 28, 2026
…h (PR #4380) Closes the architectural loose end from advisor #4: the typed state-validation phase landed in ff4b0a8 with a parity test pinning scenarios/expected-states.ts to nemoclaw_scenarios/expected-states.yaml. Two sources of truth tied by a brittle parity test is exactly the 'localized fix without source removal' the advisor warns about. Now that nothing live consumes the YAML (e2e-scenarios.yaml workflow goes through scenarios/run.ts; the bash run-scenario.sh is a stub), retire the whole legacy resolver path. Deleted (2,236 lines of source + test): - nemoclaw_scenarios/expected-states.yaml - runtime/resolver/ (load.ts, plan.ts, schema.ts, coverage.ts, index.ts, validator.ts, expected-failure.ts, js-yaml.d.ts) - runtime/coverage-report.sh - 6 framework tests that exercised only the resolver path: e2e-scenario-schema, e2e-scenario-resolver, e2e-coverage-report, e2e-scenario-additional-families, e2e-metadata-final-hygiene, e2e-expected-failure (its typed equivalent now lives in e2e-negative-matcher.test.ts). Trimmed: - e2e-migration-inventory-lock.test.ts: drop the expected-states.yaml-vs-typed-registry test; expected-state migration is complete and parity is now intra-typed. - e2e-expected-state.test.ts: drop the YAML-mirror parity tests; replace with id-coverage assertions against listExpectedStates(). - types.ts / expected-states.ts / compiler.ts comments: state plainly that the typed registry is the single source of truth. Updated: - docs/README.md: typed scenarios/ tree is now the documented source of truth; scenarios.yaml + suites.yaml called out as historical references during cleanup. Out of scope (deferred): - nemoclaw_scenarios/scenarios.yaml and validation_suites/suites.yaml remain on disk. With the resolver gone, both are now non-runtime reference docs. Their final removal needs a follow-up because e2e-assertion-modules.test.ts and e2e-migration-inventory-lock.test.ts still cross-check against suites.yaml, and e2e-scenario-advisor.test.ts references the path string. Removing them is mechanical but its own blast radius. 348 framework tests pass after deletion (was 418 \u2014 70 tests gone with the 5 deleted resolver test files; 16 expected-state tests trimmed to 3 since the YAML is gone). tsc clean. Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
cjagwani
added a commit
that referenced
this pull request
Jun 3, 2026
…m anchor Addresses @cv's review ask #4 on PR #4175 — runtime validation through the Hermes gateway first-message path — by adding an integration-style test that drives the actual `_pre_llm_call` → `AIAgent._strip_think_blocks` chain Hermes uses, rather than only exercising the normalizer in isolation. The new test: - Stubs `run_agent.AIAgent._strip_think_blocks` as a real method. - Calls `_pre_llm_call(platform="telegram", is_first_turn=True)` to simulate the first Telegram turn arriving via the gateway hook — this both sets the platform anchor and installs the patch via the same code path Hermes uses at runtime. - Calls `AIAgent._strip_think_blocks(...)` twice to assert (a) same-platform body is extracted, (b) cross-platform `to slack: ...` is preserved. - Calls `_pre_llm_call(platform="discord", is_first_turn=True)` to simulate a turn on a different platform and asserts the anchor refreshes per-turn (Discord body extracted, telegram target preserved). Combined with the existing isolated-unit cases and the cited `hermes-e2e` / `hermes-discord-e2e` / `hermes-slack-e2e` scenarios, this gives both "added" and "cited" runtime validation for the gateway first-message path. Signed-off-by: Charan Jagwani <cjagwani@nvidia.com> Co-authored-by: Chengjie Wang <chengjiew@nvidia.com>
12 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Migrated from NVIDIA/openshell-openclaw-plugin#17 by @vincentkoc
Summary
Validate
sessionIdon the/api/sandbox-chatpath before it is used to build the sandbox execution command.Changes
normalizeSessionId()and requiresessionIdto match^[A-Za-z0-9_-]{1,64}$messagevalues and invalidsessionIdvalues with400Testing
node --check .jensenclaw/server.js