feat(onboard): add NEMOCLAW_AGENT_HEARTBEAT_EVERY env var #3158
…nv var (NVIDIA#2880)

OpenClaw's periodic heartbeat (default 30m) can flood the dashboard and freeze long-running agent turns. The fix to set `agents.defaults.heartbeat.every "0m"` is documented for OpenClaw, but `openclaw config set` is read-only inside a NemoClaw sandbox and the generated openclaw.json had no env-var passthrough for this key, leaving no supported way to disable the heartbeat.

Add NEMOCLAW_AGENT_HEARTBEAT_EVERY mirroring the existing NEMOCLAW_AGENT_TIMEOUT plumbing: Dockerfile ARG → ENV → Python config generator → openclaw.json. Empty/unset preserves the OpenClaw default; "0m" disables; other Go-style durations (e.g. "1h") tune the cadence.

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Adds a build-time Docker ARG and exported ENV
Changes
Agent Heartbeat Configuration
Sequence Diagram
sequenceDiagram
participant Dev as Developer (host)
participant Onb as nemoclaw onboard
participant Docker as Docker build
participant Gen as generate-openclaw-config.py
participant OC as OpenClaw runtime
Dev->>Onb: export NEMOCLAW_AGENT_HEARTBEAT_EVERY=<value>
Dev->>Onb: run nemoclaw onboard --resume
Onb->>Docker: patch staged Dockerfile (rewrite ARG if valid)
Docker->>Docker: build image with ENV NEMOCLAW_AGENT_HEARTBEAT_EVERY
Docker->>Gen: container startup runs config generator (reads ENV)
Gen->>Gen: validate duration, warn and omit if invalid
Gen->>OC: write openclaw.json (with or without heartbeat.every)
OC->>OC: agent heartbeat cadence honored at runtime
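The generator step in the diagram (read the ENV value, validate it, then write or omit `heartbeat.every`) can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/generate-openclaw-config.py`: the function names `heartbeat_defaults` and `build_config` are hypothetical, the warning text is invented for the example, and the pattern assumes the suffix-required duration form described later in this PR.

```python
import re
import sys

# Assumption: digits followed by exactly one unit (s, m, or h).
DURATION = r"[0-9]+[smh]"


def heartbeat_defaults(env):
    """Return the fragment to merge into agents.defaults, or {} to omit it."""
    raw = env.get("NEMOCLAW_AGENT_HEARTBEAT_EVERY", "")
    if raw == "":
        # Unset or empty: write nothing so the OpenClaw default applies.
        return {}
    if re.fullmatch(DURATION, raw) is None:
        # Hypothetical warning text; only the [SECURITY] tag comes from the PR.
        print(
            f"[SECURITY] ignoring invalid NEMOCLAW_AGENT_HEARTBEAT_EVERY={raw!r}",
            file=sys.stderr,
        )
        return {}
    return {"heartbeat": {"every": raw}}


def build_config(env):
    """Build a stripped-down openclaw.json structure (other keys omitted)."""
    defaults = {}
    defaults.update(heartbeat_defaults(env))
    return {"agents": {"defaults": defaults}}
```

Calling `build_config({"NEMOCLAW_AGENT_HEARTBEAT_EVERY": "0m"})` yields a structure whose `agents.defaults.heartbeat.every` is `"0m"`, while an empty or malformed value leaves `agents.defaults` without a `heartbeat` key.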
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (4)
docs/inference/switch-inference-providers.md (1)
149-149: ⚡ Quick win: Format default value as inline code.
Line 149 uses `unset` as a parameter value but does not wrap it in inline code formatting.
Proposed fix:
-| `NEMOCLAW_AGENT_HEARTBEAT_EVERY` | Go-style duration (`30m`, `1h`, `0m` to disable) | unset (OpenClaw default) |
+| `NEMOCLAW_AGENT_HEARTBEAT_EVERY` | Go-style duration (`30m`, `1h`, `0m` to disable) | `unset` (OpenClaw default) |
As per coding guidelines: "CLI commands, file paths, flags, parameter names, and values must use inline `code` formatting."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/inference/switch-inference-providers.md` at line 149, The table row for the parameter NEMOCLAW_AGENT_HEARTBEAT_EVERY uses the plain word unset for the default value; update that cell so the default value is formatted as inline code (e.g., `unset`) to follow the guideline that CLI flags, parameter names and values use inline code formatting and match the surrounding table style.

test/generate-openclaw-config.test.ts (1)
246-264: ⚡ Quick win: Add explicit empty-string heartbeat test for build-arg parity.
Line 246 validates the unset case, but the Docker build path commonly passes an empty string. Add one assertion to lock in that `""` also omits `agents.defaults.heartbeat`.
Proposed test addition:
 it("omits heartbeat when NEMOCLAW_AGENT_HEARTBEAT_EVERY is unset", () => {
   const config = runConfigScript();
   expect(config.agents.defaults.heartbeat).toBeUndefined();
 });
+
+it("omits heartbeat when NEMOCLAW_AGENT_HEARTBEAT_EVERY is empty", () => {
+  const config = runConfigScript({ NEMOCLAW_AGENT_HEARTBEAT_EVERY: "" });
+  expect(config.agents.defaults.heartbeat).toBeUndefined();
+});
+
 it("propagates heartbeat cadence into agents.defaults.heartbeat.every", () => {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/generate-openclaw-config.test.ts` around lines 246 - 264, Add a new test that covers the empty-string build-arg case by calling runConfigScript({ NEMOCLAW_AGENT_HEARTBEAT_EVERY: "" }) and asserting that config.agents.defaults.heartbeat is undefined; place it alongside the existing heartbeat tests (the ones using "unset", "30m", "0m", and malformed values) so the suite verifies that an empty string omits the heartbeat field just like the unset case.

src/lib/onboard.ts (1)
2364-2535: Run the onboarding E2E suite for this `src/lib/onboard.ts` change path.
Because this touches core sandbox build/config wiring, it’s worth running the recommended nightly E2E job set before merge to catch integration regressions.
As per coding guidelines: `src/lib/onboard.ts`: “This file contains core onboarding logic. Changes here affect the full sandbox creation and configuration flow.” and the listed E2E recommendations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard.ts` around lines 2364 - 2535, The change modifies sandbox image build metadata in patchStagedDockerfile (ARG replacements including BASE_IMAGE, NEMOCLAW_* args, proxy, messaging and web search flags); before merging, run the onboarding E2E suite (the recommended nightly job set) to validate full sandbox creation: build the image, confirm BASE_IMAGE pinning behavior, ensure NEMOCLAW_* ARG values (MODEL, PROVIDER_KEY, PRIMARY_MODEL_REF, INFERENCE_* and BUILD_ID) propagate, verify proxy env handling (NEMOCLAW_PROXY_HOST/PORT), messaging/telegram/discord arg encoding, web-search toggle, and that the resulting sandbox boots and exposes the dashboard without device auth.

Dockerfile (1)
265-270: Run the Dockerfile E2E matrix before merge.
Since this changes sandbox image build-time config behavior, run the recommended E2Es (`cloud-e2e`, `sandbox-survival-e2e`, `hermes-e2e`, `rebuild-openclaw-e2e`) to validate onboard/rebuild behavior end-to-end.
As per coding guidelines, `Dockerfile`: This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build. E2E test recommendation: cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e.
Also applies to: 323-323
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile` around lines 265 - 270, This change introduces a build-time ARG (NEMOCLAW_AGENT_HEARTBEAT_EVERY) in the Dockerfile that affects sandbox image behavior; before merging, run the full Dockerfile E2E matrix to validate rebuild/onboard behavior: execute cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e against an image built with the new ARG and verify openclaw.json behavior and heartbeat cadence; report any failures and only merge once all four E2Es pass.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 701ce69a-0154-41af-a209-25a8714e81a2
📒 Files selected for processing (5)
Dockerfile
docs/inference/switch-inference-providers.md
scripts/generate-openclaw-config.py
src/lib/onboard.ts
test/generate-openclaw-config.test.ts
…DIA#2880)

- docs: wrap default value `unset` in inline code per coding guidelines
- test: cover empty-string env value, which is what Docker passes when the unset ARG is promoted through ENV (the actual build-time path)

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/inference/switch-inference-providers.md (1)
170-175: 💤 Low value: One sentence per line; formatting rule violated.
Sentence 1 ends and sentence 2 begins on the same physical line (line 170). The style guide requires one sentence per source line for readable diffs.
📝 Suggested reformatting
-`NEMOCLAW_AGENT_HEARTBEAT_EVERY` sets `agents.defaults.heartbeat.every`. Set it
-to `0m` to disable the periodic heartbeat when it disrupts long-running agent
-turns; leave it unset to preserve the OpenClaw default cadence. `openclaw.json`
-is immutable at runtime, so the in-sandbox `openclaw config set` command cannot
-change this — rebuild the sandbox via `nemoclaw onboard --resume` to apply a
-new value.
+`NEMOCLAW_AGENT_HEARTBEAT_EVERY` sets `agents.defaults.heartbeat.every`.
+Set it to `0m` to disable the periodic heartbeat when it disrupts long-running agent turns; leave it unset to preserve the OpenClaw default cadence.
+`openclaw.json` is immutable at runtime, so the in-sandbox `openclaw config set` command cannot change this — rebuild the sandbox via `nemoclaw onboard --resume` to apply a new value.
As per coding guidelines: "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/inference/switch-inference-providers.md` around lines 170 - 175, The paragraph contains multiple sentences on the same physical line; split it so each sentence is on its own source line to satisfy the "one sentence per line" guideline. Specifically, break the line that mentions NEMOCLAW_AGENT_HEARTBEAT_EVERY, agents.defaults.heartbeat.every, the `0m` disable note, the immutability of openclaw.json, the inability of the in-sandbox `openclaw config set` command to change it, and the instruction to rebuild the sandbox via `nemoclaw onboard --resume` into separate lines so each sentence stands alone; keep the same wording and the same references to NEMOCLAW_AGENT_HEARTBEAT_EVERY, agents.defaults.heartbeat.every, openclaw.json, `openclaw config set`, and `nemoclaw onboard --resume`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/generate-openclaw-config.test.ts`:
- Around line 268-271: The test "rejects malformed heartbeat values and
preserves OpenClaw default" currently only asserts that
config.agents.defaults.heartbeat is undefined but does not assert the required
`[SECURITY]` stderr warning; update the test to call runConfigScriptRaw instead
of runConfigScript so you can capture stderr and add an assertion that stderr
contains the `[SECURITY]` warning for the malformed
NEMOCLAW_AGENT_HEARTBEAT_EVERY value, keeping the existing assertion on
config.agents.defaults.heartbeat; locate the test in
test/generate-openclaw-config.test.ts and modify the spec that references
runConfigScript to use runConfigScriptRaw and assert stderr includes the
`[SECURITY]` message.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: da2eb287-ca47-41e8-84aa-71dfc65fedfd
📒 Files selected for processing (2)
docs/inference/switch-inference-providers.md
test/generate-openclaw-config.test.ts
NVIDIA#2880) Mirror the existing CONTEXT_WINDOW/MAX_TOKENS validation tests by capturing stderr and asserting that the rejected NEMOCLAW_AGENT_HEARTBEAT_EVERY value emits the documented [SECURITY] warning before falling back to the OpenClaw default. Per CodeRabbit review on PR NVIDIA#3158. Signed-off-by: Dongni Yang <dongniy@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
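The stderr assertion described in this commit lives in the TypeScript test suite; the same idea, capturing stderr while a rejected value is processed and checking for the `[SECURITY]` tag, can be sketched in Python. Everything here is illustrative: `parse_heartbeat` and `run_and_capture` are hypothetical helpers, and the warning wording is an assumption apart from the `[SECURITY]` prefix.

```python
import contextlib
import io
import re
import sys


def parse_heartbeat(raw):
    """Hypothetical validator: return the duration, or None (warning if malformed)."""
    if raw and re.fullmatch(r"[0-9]+[smh]", raw):
        return raw
    if raw:
        # Only non-empty, malformed input produces a warning.
        print(
            f"[SECURITY] NEMOCLAW_AGENT_HEARTBEAT_EVERY rejected: {raw!r}",
            file=sys.stderr,
        )
    return None


def run_and_capture(raw):
    """Run the validator and return (result, text written to stderr)."""
    buf = io.StringIO()
    with contextlib.redirect_stderr(buf):
        result = parse_heartbeat(raw)
    return result, buf.getvalue()
```

A check in the spirit of the commit would then assert both halves: the value is dropped (`result is None`) and the captured text contains `[SECURITY]`.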
…VIDIA#2880) Per CodeRabbit nit on PR NVIDIA#3158: split the NEMOCLAW_AGENT_HEARTBEAT_EVERY paragraph so each sentence is on its own source line to match the project's docs style guide. Wording unchanged. Signed-off-by: Dongni Yang <dongniy@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…DIA#2880) Replace the "disable when it disrupts long-running agent turns" framing (which traced to a misread of NVIDIA#2880) with what heartbeat actually does per the OpenClaw heartbeat docs: a scheduled main-session agent turn that reviews follow-ups and reads HEARTBEAT.md from the workspace. Spell out the cost of `0m` (loses periodic supervision and drops HEARTBEAT.md from normal-run context). Signed-off-by: Dongni Yang <dongniy@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sync with upstream main commit 2c3a392 (vi/jq/dos2unix in base images); no conflicts with the heartbeat passthrough changes. Signed-off-by: Dongni Yang <dongniy@nvidia.com>
…RY (NVIDIA#2880) Tighten the validation regex from ^\d+(s|m|h)?$ to ^\d+(s|m|h)$ on both sides (Python config generator + TypeScript Dockerfile patcher) so a bare "30" without a unit no longer slips through. OpenClaw's heartbeat docs always show the suffixed form (e.g. 30m, 2h, 0m), and a unit-less number is ambiguous — OpenClaw may parse it as seconds or reject it at startup. Update the existing Python-side stderr-warning test to match the new error message. Signed-off-by: Dongni Yang <dongniy@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
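The effect of dropping the `?` can be seen by running both patterns from this commit message against a few inputs. The helper name `accepted` is only for illustration; the two patterns are copied from the message above.

```python
import re

LOOSE = re.compile(r"^\d+(s|m|h)?$")   # earlier pattern: unit suffix optional
STRICT = re.compile(r"^\d+(s|m|h)$")   # tightened pattern: unit suffix required


def accepted(pattern, value):
    """Return True when the whole value matches the given pattern."""
    return pattern.match(value) is not None


for value in ["30", "30m", "0m", "2h", "5x"]:
    print(f"{value!r}: loose={accepted(LOOSE, value)} strict={accepted(STRICT, value)}")
```

Only the bare `"30"` changes outcome between the two patterns. One Python-specific caveat: `$` also matches just before a trailing newline, so `re.fullmatch` is the stricter choice if the value might carry one.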
…tcher Mirror the existing NVIDIA#2281 (NEMOCLAW_AGENT_TIMEOUT) and NVIDIA#2421 (NEMOCLAW_INFERENCE_INPUTS) regression tests for the new NEMOCLAW_AGENT_HEARTBEAT_EVERY patcher in src/lib/onboard.ts. Covers five valid durations (0m, 30m, 5m, 1h, 30s) plus six rejected forms (undefined, "", "30 minutes", "5", "5x", "fast") to lock in the suffix-required regex behavior. Signed-off-by: Dongni Yang <dongniy@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
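The patcher behavior these regression tests lock in (rewrite the `ARG` default only for a suffixed duration, otherwise leave the staged Dockerfile untouched) can be sketched as below. The real patcher is TypeScript in `src/lib/onboard.ts`; this Python rendition, including the name `patch_heartbeat_arg`, is a hypothetical illustration of the same rule.

```python
import re

# Matches the whole ARG line, whatever default it currently carries.
ARG_LINE = re.compile(r"^ARG NEMOCLAW_AGENT_HEARTBEAT_EVERY=.*$", re.MULTILINE)


def patch_heartbeat_arg(dockerfile_text, value):
    """Rewrite the ARG default only when value is a suffixed duration."""
    if value is None or re.fullmatch(r"[0-9]+[smh]", value) is None:
        # Rejected input: keep the empty default so OpenClaw's own default applies.
        return dockerfile_text
    # A function replacement avoids any backslash interpretation in the value.
    return ARG_LINE.sub(
        lambda _match: f"ARG NEMOCLAW_AGENT_HEARTBEAT_EVERY={value}",
        dockerfile_text,
    )
```

With a staged Dockerfile containing `ARG NEMOCLAW_AGENT_HEARTBEAT_EVERY=`, the values `0m`, `30m`, `5m`, `1h`, and `30s` rewrite that line, while `None`, `""`, `"30 minutes"`, `"5"`, `"5x"`, and `"fast"` return the text unchanged.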
## Summary
- Bump the docs release metadata to `0.0.38`.
- Document release-prep updates for status policy versions, Local Ollama validation and cleanup, blueprint policy additions, rebuild backup handling, and NemoHermes uninstall branding.
- Refresh generated `nemoclaw-user-*` skills from the updated docs.

## Source summary
- #3185 -> `docs/reference/commands.md`: Documents that `nemoclaw <name> status` displays the gateway active policy version when OpenShell reports one.
- #3167 -> `docs/reference/commands.md`, `docs/inference/use-local-inference.md`: Documents uninstall cleanup for matching Local Ollama auth proxy processes.
- #2737 -> `docs/inference/use-local-inference.md`, `docs/network-policy/customize-network-policy.md`, `docs/manage-sandboxes/lifecycle.md`, `docs/reference/commands.md`: Documents stricter Local Ollama tool-call validation, blueprint policy additions, and partial rebuild backup handling.
- #3220 -> `docs/reference/commands.md`: Documents NemoHermes-specific uninstall progress and completion text.
- #3158 -> `.agents/skills/nemoclaw-user-configure-inference/*`: Refreshes generated user skills from existing `docs/inference/switch-inference-providers.md` heartbeat documentation.
- #3199 -> `.agents/skills/nemoclaw-user-get-started/SKILL.md`: Refreshes generated user skills from existing `docs/get-started/quickstart.md` Model Router wording.

## Skipped
- #3272 and #3268 were already documented by their merged docs updates on `main`.
- #3154, #3216, #3166, and #3195 have no additional user-facing docs impact for this release-prep pass.
- No commits matched `docs/.docs-skip`.

## Test plan
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user`
- `make docs`
- `npm run build:cli`
- Commit and pre-push hooks: markdownlint, docs-to-skills verification, gitleaks, commitlint, skills YAML tests, CLI typecheck

Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **Behavior Changes**
  * Rebuild now safely handles partial backups, preserving successfully captured entries while reporting only unarchived paths
  * Uninstall for Local Ollama setups now stops proxy processes before cleanup
  * Local Ollama models require stricter tool-call response validation during onboarding
  * Blueprint policy additions enable custom network policy extensions via `components.policy.additions`
  * New `NEMOCLAW_AGENT_HEARTBEAT_EVERY` configuration controls agent periodic task frequency
  * Status display now shows active policy version when available
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
Add a `NEMOCLAW_AGENT_HEARTBEAT_EVERY` build-time env var that sets `agents.defaults.heartbeat.every` in the generated `openclaw.json`. Mirrors the existing `NEMOCLAW_AGENT_TIMEOUT` plumbing exactly: Dockerfile `ARG` → `ENV` → Python config generator → `agents.defaults`. Empty/unset preserves the OpenClaw default cadence; `0m` disables; other Go-style durations (e.g. `1h`) tune it.

What heartbeat is and what `0m` costs
Per OpenClaw's heartbeat docs, heartbeat is a scheduled main-session agent turn, not a background task. When the interval fires, OpenClaw runs a model invocation with a default prompt that nudges the agent to "review follow-ups (inbox, calendar, reminders, queued work) and surface anything urgent" and to "read `HEARTBEAT.md` if it exists (workspace context)." Default interval is 30m (1h for Anthropic OAuth / Claude CLI reuse).

Setting `every: 0m` disables the periodic turns and also drops `HEARTBEAT.md` from normal-run bootstrap context (per the upstream docs), so the model no longer sees heartbeat-only instructions at all.

Trade-off summary (table; only these cells are recoverable): `5m`, `1h`, `2h` | `0m` | `HEARTBEAT.md` context

Related Issue
Refs #2880: the reporter ran into heartbeat-related dashboard flooding on DGX Spark; this env var gives them (and other users) a supported path to reach the upstream `0m` knob, which previously had no NemoClaw plumbing because `openclaw.json` is Landlock read-only inside the sandbox. The reporter has since clarified that the underlying scheduling behavior is also at play; that part is OpenClaw-side and tracked separately.

Changes
- `scripts/generate-openclaw-config.py`: parse `NEMOCLAW_AGENT_HEARTBEAT_EVERY` (regex `^\d+(s|m|h)?$`), inject `heartbeat: { every: ... }` into `agents.defaults` only when set; reject malformed values with a `[SECURITY]` warning and fall back to the OpenClaw default
- `Dockerfile`: add `ARG NEMOCLAW_AGENT_HEARTBEAT_EVERY=` (empty default) next to `NEMOCLAW_AGENT_TIMEOUT`, and promote it through the `ENV` block
- `src/lib/onboard.ts`: clone the `NEMOCLAW_AGENT_TIMEOUT` Dockerfile-ARG-replacement block so a host export reaches image build
- `test/generate-openclaw-config.test.ts`: 5 cases: unset, empty string, `30m`, `0m`, malformed-with-stderr-assertion
- `docs/inference/switch-inference-providers.md`: document the new var alongside `NEMOCLAW_AGENT_TIMEOUT` with the `nemoclaw onboard --resume` rebuild note

Type of Change

Verification
- `npx prek run --all-files` passes
- `npm test` passes (5 new heartbeat cases pass; the 2 unrelated `int-string digit limit` failures pre-exist on `upstream/main`)
- `make docs` builds without warnings (doc changes only)

Test plan
- `python3 scripts/generate-openclaw-config.py` with `NEMOCLAW_AGENT_HEARTBEAT_EVERY=0m` writes `agents.defaults.heartbeat.every = "0m"` to `~/.openclaw/openclaw.json`
- … `heartbeat` key entirely
- … `"5 minutes"`) logs a `[SECURITY]` warning and omits the key
- `npx vitest run --project cli test/generate-openclaw-config.test.ts -t heartbeat`: 5 passed
- `NEMOCLAW_AGENT_HEARTBEAT_EVERY=0m nemoclaw onboard --resume` rebuilds a sandbox whose dashboard no longer shows the periodic heartbeat turns (requires DGX Spark or another live install; not exercised here)

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation