Skip to content

fix: add fork bomb protection via process limits#811

Closed
cluster2600 wants to merge 4 commits into
NVIDIA:mainfrom
cluster2600:security/sandbox-process-limits
Closed

fix: add fork bomb protection via process limits#811
cluster2600 wants to merge 4 commits into
NVIDIA:mainfrom
cluster2600:security/sandbox-process-limits

Conversation

@cluster2600

@cluster2600 cluster2600 commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes #809.

Without process limits, a prompt-injected agent can exhaust host resources by spawning processes recursively (fork bomb). This PR adds three layers of defence:

  • Container level: docker update --pids-limit 512 applied post-creation in onboard.js
  • In-sandbox fallback: ulimit -u 512 set in nemoclaw-start.sh when the current limit is unlimited
  • Documentation: Comment in the sandbox policy YAML noting that OpenShell does not currently expose a pids_limit field

Defence layers

graph TD
    subgraph "Process Limit Enforcement"
        A[Agent spawns processes] --> B{ulimit -u set?}
        B -->|Yes| C[Kernel enforces RLIMIT_NPROC]
        B -->|No / unlimited| D[nemoclaw-start sets ulimit -u 512]
        D --> C
        C --> E{Over limit?}
        E -->|Yes| F[fork returns EAGAIN]
        E -->|No| G[Process created]
    end

    subgraph "Container Level"
        H[--pids-limit 512] --> I[cgroup pids.max]
        I --> C
    end

    F --> J[Agent receives error]
    J --> K[System stays responsive]
Loading

Changes

File Change
scripts/nemoclaw-start.sh Set ulimit -u 512 when current limit is unlimited
bin/lib/onboard.js Apply docker update --pids-limit after sandbox creation
nemoclaw-blueprint/policies/openclaw-sandbox.yaml Document limitation in process section
docs/security/process-limits.md Full documentation with mermaid diagram
test/process-limits.test.js Vitest assertions for all three components

Configuration

The default limit of 512 processes is configurable via NEMOCLAW_PIDS_LIMIT environment variable at onboard time.

Test plan

  • npx vitest run test/process-limits.test.js — 8/8 tests pass
  • Run nemoclaw onboard on a test host and verify docker inspect shows PidsLimit: 512
  • Verify ulimit -u inside the sandbox returns 512
  • Verify fork bomb (:(){ :|:& };:) is contained and doesn't affect the host

Summary by CodeRabbit

  • New Features

    • Layered fork‑bomb protection: onboarding applies a container-level process limit (default 512, overridable via environment variable) with an in‑sandbox startup fallback enforcing a per-user process limit.
  • Documentation

    • New security guide describing the two‑layer model, defaults, override behavior, and system behavior when limits are exceeded; notes that runtime enforcement is required.
  • Tests

    • Added tests validating presence and configuration of the protections.

@coderabbitai

coderabbitai Bot commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Onboarding now reads NEMOCLAW_PIDS_LIMIT (default "512"), attempts docker update --pids-limit after sandbox creation, and logs outcome. The sandbox entrypoint best-effort sets ulimit -u 512 if currently unlimited. Documentation and tests for the two-layer enforcement were added.

Changes

Cohort / File(s) Summary
Container-Level Enforcement
bin/lib/onboard.js
Reads NEMOCLAW_PIDS_LIMIT (defaults to "512"), validates it as a positive integer, runs docker update --pids-limit <value> post-creation with errors ignored, and logs success or fallback.
In-Sandbox Fallback
scripts/nemoclaw-start.sh
Adds startup check of ulimit -u; if unlimited, attempts ulimit -u 512 in a best-effort, error-suppressed manner before launching services.
Documentation
docs/security/process-limits.md
New page describing two-layer enforcement (container pids.max + in-sandbox ulimit), defaults, NEMOCLAW_PIDS_LIMIT behavior, validation, and control-flow diagram.
Policy Note
nemoclaw-blueprint/policies/openclaw-sandbox.yaml
Inserted comment noting sandbox policy lacks a pids-limit field and that runtime-level enforcement is required.
Tests
test/process-limits.test.js
New Vitest suite asserting presence/structure of ulimit -u handling, policy YAML documentation, and docker update --pids-limit usage with NEMOCLAW_PIDS_LIMIT default fallback.

Sequence Diagram

sequenceDiagram
    participant OB as Onboarding
    participant DR as Docker_Runtime
    participant SB as Sandbox
    participant SS as Start_Script

    OB->>OB: Read NEMOCLAW_PIDS_LIMIT (default "512")
    OB->>DR: docker update --pids-limit 512 (ignore errors)
    DR-->>OB: Success or Error
    OB->>OB: Log success or fallback notice

    Note over SB: Sandbox starts
    SB->>SS: Execute nemoclaw-start.sh
    SS->>SS: Check ulimit -u
    alt ulimit is "unlimited"
        SS->>SS: Attempt ulimit -u 512 (best-effort, ignore errors)
        SS-->>SS: Possibly fails silently
    else ulimit limited
        SS-->>SS: No change
    end
    SS->>SB: Start gateway & services

    Note over SB: During runtime
    SB->>SB: Fork attempts beyond pids.max
    SB-->>SB: Kernel returns EAGAIN / fork fails
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I capped the bursts where wild processes play,
Docker and ulimit mind the fray.
A quiet burrow, threads kept small,
I hop, content — we guard them all.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely describes the main change: adding fork bomb protection via process limits, which is the primary objective of the changeset.
Linked Issues check ✅ Passed The PR fully addresses all objectives from issue #809: enforcing process limits at container level via docker update, providing in-sandbox ulimit fallback, making limits configurable via NEMOCLAW_PIDS_LIMIT, and documenting the implementation and gaps.
Out of Scope Changes check ✅ Passed All changes directly support fork bomb protection: container-level enforcement, in-sandbox fallback, comprehensive documentation, policy notes, and validation tests—no extraneous modifications present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
docs/security/process-limits.md (2)

33-34: One sentence per line.

Per docs style guide, each sentence should be on its own line in the source to make diffs more readable.

📝 Suggested fix
 The default process limit is 512 per sandbox.
-agent operation (typically < 50 processes) whilst preventing fork bombs.
+This is sufficient for normal agent operation (typically < 50 processes) whilst preventing fork bombs.

Full replacement:

The default process limit is 512 per sandbox.
This is sufficient for normal agent operation (typically < 50 processes) whilst preventing fork bombs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security/process-limits.md` around lines 33 - 34, Split the two
sentences in the paragraph into separate source lines so each sentence is on its
own line: replace the single-line text "The default process limit is 512 per
sandbox. This is sufficient for normal agent operation (typically < 50
processes) whilst preventing fork bombs." with two lines — "The default process
limit is 512 per sandbox." and "This is sufficient for normal agent operation
(typically < 50 processes) whilst preventing fork bombs." — so diffs are more
readable per the docs style guide.

64-64: Consider adding a "Next Steps" section.

Per docs style guide, pages should include a "Next Steps" section at the bottom linking to related pages (e.g., sandbox policy documentation, security overview).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security/process-limits.md` at line 64, Add a "Next Steps" section at
the end of the process-limits documentation: create a new heading "Next Steps"
(or "Next steps") and include bullet links to related pages such as the sandbox
policy documentation and the security overview so readers can navigate to those
resources from the Process Limits page; ensure link text is descriptive and
follows the existing docs style guide conventions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 637-641: The pidsLimit value read from
process.env.NEMOCLAW_PIDS_LIMIT is interpolated directly into the shell command
in the runCapture call (see pidsLimit and the docker update invocation),
allowing shell injection; validate and sanitize it before use by parsing it as
an integer, ensuring it is a positive integer within an acceptable range (e.g.,
1–32768), falling back to the default "512" if parsing fails or the value is out
of range, and then use that sanitized numeric value (or, preferably, avoid shell
interpolation entirely by passing arguments to a spawn/exec API that accepts an
args array) when constructing the docker update command in onboard.js.

In `@docs/security/process-limits.md`:
- Around line 62-64: Replace the CLI code block in
docs/security/process-limits.md that currently uses the ```bash fence and raw
command with a ```console fence and a dollar prompt; specifically, change the
code block marker from ```bash to ```console and prefix the command
"NEMOCLAW_PIDS_LIMIT=1024 nemoclaw onboard" with "$ " so the block follows the
CLI style guide (use the shown command string as the content inside the new
console block).
- Around line 1-2: Add required frontmatter YAML and an SPDX license header to
the document that currently only contains the Markdown title "# Process Limits";
specifically insert a frontmatter block including title, description, keywords,
topics, tags, content type, difficulty, audience, and status fields at the top
of the file and then add the SPDX license header immediately after the
frontmatter so the page complies with the docs style guide and contains the
necessary metadata and licensing information.

---

Nitpick comments:
In `@docs/security/process-limits.md`:
- Around line 33-34: Split the two sentences in the paragraph into separate
source lines so each sentence is on its own line: replace the single-line text
"The default process limit is 512 per sandbox. This is sufficient for normal
agent operation (typically < 50 processes) whilst preventing fork bombs." with
two lines — "The default process limit is 512 per sandbox." and "This is
sufficient for normal agent operation (typically < 50 processes) whilst
preventing fork bombs." — so diffs are more readable per the docs style guide.
- Line 64: Add a "Next Steps" section at the end of the process-limits
documentation: create a new heading "Next Steps" (or "Next steps") and include
bullet links to related pages such as the sandbox policy documentation and the
security overview so readers can navigate to those resources from the Process
Limits page; ensure link text is descriptive and follows the existing docs style
guide conventions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 677b5aec-8453-46e4-b9f1-13ae89a39b4b

📥 Commits

Reviewing files that changed from the base of the PR and between 6e1208c and 0ca735d.

📒 Files selected for processing (5)
  • bin/lib/onboard.js
  • docs/security/process-limits.md
  • nemoclaw-blueprint/policies/openclaw-sandbox.yaml
  • scripts/nemoclaw-start.sh
  • test/process-limits.test.js

Comment thread bin/lib/onboard.js Outdated
Comment thread docs/security/process-limits.md Outdated
Comment thread docs/security/process-limits.md Outdated
@wscurran wscurran added bug Something fails against expected or documented behavior NemoClaw CLI security Potential vulnerability, unsafe behavior, or access risk labels Mar 25, 2026
@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this PR with a detailed summary, it addresses a security issue that could cause problems during runtime, particularly if an agent is injected with malicious prompts, which could improve the security of NemoClaw.

@cluster2600 cluster2600 force-pushed the security/sandbox-process-limits branch from 79fcf8e to aebd5c7 Compare March 25, 2026 19:23

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
bin/lib/onboard.js (1)

1511-1512: ⚠️ Potential issue | 🟠 Major

Reject NEMOCLAW_PIDS_LIMIT=0 to avoid silently disabling PID protection.

Line 1512 uses /^\d+$/, which still allows 0. For this security control, treat 0 as invalid and fall back to 512 (or fail fast).

In current Docker documentation, what is the behavior of `docker update --pids-limit 0`? Does it mean "unlimited"?
🔧 Suggested patch
-  const pidsLimit = /^\d+$/.test(rawPidsLimit) ? rawPidsLimit : "512";
+  const pidsLimit = /^[1-9]\d*$/.test(rawPidsLimit) ? rawPidsLimit : "512";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1511 - 1512, The current parsing of
NEMOCLAW_PIDS_LIMIT accepts "0" which disables PID protection; update the
validation for rawPidsLimit (env var NEMOCLAW_PIDS_LIMIT) so that pidsLimit only
accepts positive integers > 0 (e.g., via /^\d+$/ plus explicit check that the
parsed value !== 0 or parseInt(rawPidsLimit,10) > 0); if validation fails, fall
back to "512" (or alternatively throw/fail fast) and ensure the code sets the
pidsLimit constant accordingly.
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

1513-1521: Use exit status, not output text, to determine docker update success.

runCapture(..., { ignoreError: true }) discards failure details and returns "", so the includes("Error") check is not a reliable signal. Prefer status-based handling so logs are deterministic and actionable.

♻️ Suggested refactor
-  const dockerUpdate = runCapture(
-    `docker update --pids-limit ${pidsLimit} "${sandboxName}" 2>&1`,
-    { ignoreError: true }
-  );
-  if (dockerUpdate && !dockerUpdate.includes("Error")) {
+  const dockerUpdate = spawnSync(
+    "bash",
+    ["-lc", `docker update --pids-limit ${pidsLimit} "${sandboxName}"`],
+    { cwd: ROOT, encoding: "utf8" }
+  );
+  if (dockerUpdate.status === 0) {
     console.log(`  ✓ Process limit set (--pids-limit ${pidsLimit})`);
   } else {
+    const detail = String(dockerUpdate.stderr || dockerUpdate.stdout || "").trim();
     console.log(`  ⓘ Could not set --pids-limit (container runtime may not support it)`);
+    if (detail) console.log(`    Detail: ${detail}`);
     console.log("    The in-sandbox ulimit fallback in nemoclaw-start.sh will apply.");
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1513 - 1521, The success check for `docker
update` is using the captured stdout string (`dockerUpdate.includes("Error")`)
which is unreliable because `runCapture(..., { ignoreError: true })` can return
an empty string on failure; update the logic in onboard.js to inspect the
command exit status (or use a non-ignored run that throws on non-zero) instead
of text-matching: call the underlying runner (or adjust `runCapture`) so you
capture an explicit success boolean or exit code for the `docker update
--pids-limit ${pidsLimit} "${sandboxName}"` invocation, then log the success
message when exit code === 0 (or success === true) and the fallback/info
messages on non-zero exit, referencing the `runCapture` call and the
`dockerUpdate` handling around that command.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 1511-1512: The current parsing of NEMOCLAW_PIDS_LIMIT accepts "0"
which disables PID protection; update the validation for rawPidsLimit (env var
NEMOCLAW_PIDS_LIMIT) so that pidsLimit only accepts positive integers > 0 (e.g.,
via /^\d+$/ plus explicit check that the parsed value !== 0 or
parseInt(rawPidsLimit,10) > 0); if validation fails, fall back to "512" (or
alternatively throw/fail fast) and ensure the code sets the pidsLimit constant
accordingly.

---

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 1513-1521: The success check for `docker update` is using the
captured stdout string (`dockerUpdate.includes("Error")`) which is unreliable
because `runCapture(..., { ignoreError: true })` can return an empty string on
failure; update the logic in onboard.js to inspect the command exit status (or
use a non-ignored run that throws on non-zero) instead of text-matching: call
the underlying runner (or adjust `runCapture`) so you capture an explicit
success boolean or exit code for the `docker update --pids-limit ${pidsLimit}
"${sandboxName}"` invocation, then log the success message when exit code === 0
(or success === true) and the fallback/info messages on non-zero exit,
referencing the `runCapture` call and the `dockerUpdate` handling around that
command.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 35bd017d-e950-44fb-818b-9cca440e72fd

📥 Commits

Reviewing files that changed from the base of the PR and between 79fcf8e and aebd5c7.

📒 Files selected for processing (5)
  • bin/lib/onboard.js
  • docs/security/process-limits.md
  • nemoclaw-blueprint/policies/openclaw-sandbox.yaml
  • scripts/nemoclaw-start.sh
  • test/process-limits.test.js
✅ Files skipped from review due to trivial changes (3)
  • nemoclaw-blueprint/policies/openclaw-sandbox.yaml
  • test/process-limits.test.js
  • docs/security/process-limits.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/nemoclaw-start.sh

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
bin/lib/onboard.js (1)

1897-1898: ⚠️ Potential issue | 🟡 Minor

NEMOCLAW_PIDS_LIMIT=0 silently disables fork bomb protection.

The regex /^\d+$/ allows "0", but Docker interprets --pids-limit 0 as unlimited, which would silently disable the protection instead of falling back to the default 512. This was noted in the previous review discussion.

🛡️ Proposed fix to reject zero
 const rawPidsLimit = process.env.NEMOCLAW_PIDS_LIMIT || "512";
-const pidsLimit = /^\d+$/.test(rawPidsLimit) ? rawPidsLimit : "512";
+const pidsLimit = /^[1-9][0-9]*$/.test(rawPidsLimit) ? rawPidsLimit : "512";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1897 - 1898, The pids-limit parsing allows
"0" which Docker treats as unlimited; update the validation around
rawPidsLimit/pidsLimit so it only accepts positive integers (>0) and falls back
to "512" for any non-positive or non-numeric value (e.g., reject "0", negative
numbers, empty string); modify the /^\d+$/ check to also ensure numeric value
!== "0" or parseInt(rawPidsLimit,10) > 0 and set pidsLimit accordingly using the
existing rawPidsLimit and pidsLimit identifiers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 1897-1898: The pids-limit parsing allows "0" which Docker treats
as unlimited; update the validation around rawPidsLimit/pidsLimit so it only
accepts positive integers (>0) and falls back to "512" for any non-positive or
non-numeric value (e.g., reject "0", negative numbers, empty string); modify the
/^\d+$/ check to also ensure numeric value !== "0" or parseInt(rawPidsLimit,10)
> 0 and set pidsLimit accordingly using the existing rawPidsLimit and pidsLimit
identifiers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c5a5b8ec-ae9f-4490-851f-80eac6b43c0e

📥 Commits

Reviewing files that changed from the base of the PR and between aebd5c7 and 1a9811b.

📒 Files selected for processing (5)
  • bin/lib/onboard.js
  • docs/security/process-limits.md
  • nemoclaw-blueprint/policies/openclaw-sandbox.yaml
  • scripts/nemoclaw-start.sh
  • test/process-limits.test.js
✅ Files skipped from review due to trivial changes (3)
  • nemoclaw-blueprint/policies/openclaw-sandbox.yaml
  • test/process-limits.test.js
  • docs/security/process-limits.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/nemoclaw-start.sh

@cluster2600 cluster2600 force-pushed the security/sandbox-process-limits branch from b6eecb2 to f64ef8b Compare April 8, 2026 07:16
Set ulimit -u 512 in nemoclaw-start.sh when the container runtime
hasn't set a process limit. Documents the need for --pids-limit
at the container level as a follow-up.

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
Sanitise NEMOCLAW_PIDS_LIMIT to reject non-numeric values. Align
docs with project style guide: frontmatter, SPDX header, setext
headings via myst-compatible fencing.

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
ESLint flags __dirname as undefined in ESM context. Use
import.meta.dirname which is the standard pattern in this project.

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
Docker treats --pids-limit 0 as unlimited, which would silently
disable fork bomb protection. Tighten the regex from /^\d+$/ to
/^[1-9][0-9]*$/ so zero falls back to the default 512.

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
@cluster2600 cluster2600 force-pushed the security/sandbox-process-limits branch from f64ef8b to 7e35b75 Compare April 14, 2026 09:10
@cluster2600

Copy link
Copy Markdown
Contributor Author

Closing — fork bomb protection (ulimit -u 512) and process limits landed in #830. No longer needed.

@wscurran wscurran added area: cli Command line interface, flags, terminal UX, or output bug-fix PR fixes a bug or regression and removed NemoClaw CLI bug Something fails against expected or documented behavior labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: cli Command line interface, flags, terminal UX, or output bug-fix PR fixes a bug or regression security Potential vulnerability, unsafe behavior, or access risk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[SECURITY] No process limit (ulimit -u unlimited) — fork bomb unprotected

2 participants