fix(hermes): suspend kanban dispatcher under shields-up#4703
Conversation
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughHermes startup now defines ChangesShields-up kanban dispatcher override
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
PR Review AdvisorFindings: 0 needs attention, 3 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
E2E Advisor RecommendationRequired E2E: None Full advisor summaryE2E Recommendation AdvisorFailed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt |
E2E Scenario Advisor RecommendationRequired scenario E2E: None Full scenario advisor summaryE2E Scenario AdvisorFailed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt |
Selective E2E Results — ✅ All requested jobs passedRun: 26868069659
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/hermes-start.test.ts (2)
484-501: ⚡ Quick winExtract the duplicated
stat()mock into a shared helper.This block is a near-verbatim copy of the
stat()mock inrunHermesGatewayRuntimeCleanup(Lines 222-235). The two must stay byte-for-byte in sync to keep shields-up detection consistent; ifhermes_config_root_is_locked'sstatinvocations change, both copies have to be updated, which is easy to miss.Consider hoisting it to a module-level constant (e.g.,
LOCKED_CONFIG_STAT_MOCK) reused by both helpers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/hermes-start.test.ts` around lines 484 - 501, The duplicated multi-line stat() mock (currently in the local variable statMock and duplicated in runHermesGatewayRuntimeCleanup) should be extracted to a single module-level constant (e.g., LOCKED_CONFIG_STAT_MOCK) and both places should reference that constant; update the test file to replace the inline statMock definition with a reference to LOCKED_CONFIG_STAT_MOCK and update runHermesGatewayRuntimeCleanup to use the same constant so the byte-for-byte mock remains in one place and is reused by hermes_config_root_is_locked and the other helper.
510-518: ⚡ Quick winRuntime replication:
idmock isn’t needed for this snippet; consider aligningsetflags.
apply_shields_up_runtime_env(and its dependencyhermes_config_root_is_locked/hermes_config_path_is_locked) never branches onid -uinagents/hermes/start.sh, so the lack of anid()mock intest/hermes-start.test.tslines 510-518 shouldn’t change behavior.- The helper script uses
set -uo pipefailwhileagents/hermes/start.shusesset -euo pipefail; for strict fidelity (and to catch unexpected command failures), align the helper toset -euo pipefail—though in these extracted functions,statfailures are already masked (... || ... || true) andapply_shields_up_runtime_envis explicitly guarded with|| return 0.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/hermes-start.test.ts` around lines 510 - 518, Remove the unnecessary `id` mock from the test snippet because apply_shields_up_runtime_env and its dependent functions hermes_config_root_is_locked / hermes_config_path_is_locked never branch on `id -u`; instead, update the helper invocation that currently uses "set -uo pipefail" to "set -euo pipefail" so the extracted functions mirror agents/hermes/start.sh's strict flags and will fail on unexpected command errors (note these functions already mask stat failures and apply_shields_up_runtime_env is guarded with `|| return 0`).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/hermes-start.test.ts`:
- Around line 484-501: The duplicated multi-line stat() mock (currently in the
local variable statMock and duplicated in runHermesGatewayRuntimeCleanup) should
be extracted to a single module-level constant (e.g., LOCKED_CONFIG_STAT_MOCK)
and both places should reference that constant; update the test file to replace
the inline statMock definition with a reference to LOCKED_CONFIG_STAT_MOCK and
update runHermesGatewayRuntimeCleanup to use the same constant so the
byte-for-byte mock remains in one place and is reused by
hermes_config_root_is_locked and the other helper.
- Around line 510-518: Remove the unnecessary `id` mock from the test snippet
because apply_shields_up_runtime_env and its dependent functions
hermes_config_root_is_locked / hermes_config_path_is_locked never branch on `id
-u`; instead, update the helper invocation that currently uses "set -uo
pipefail" to "set -euo pipefail" so the extracted functions mirror
agents/hermes/start.sh's strict flags and will fail on unexpected command errors
(note these functions already mask stat failures and
apply_shields_up_runtime_env is guarded with `|| return 0`).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 421d89c6-2db5-47dc-b4d8-a777f97d1a80
📒 Files selected for processing (2)
agents/hermes/start.shtest/hermes-start.test.ts
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
…tcher-shields-up Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
Summary
Under shields-up the Hermes config root
/sandbox/.hermesis root-owned read-only, so the gateway's embedded kanban dispatcher fails every tick whenkanban_db.connect()runsPRAGMA journal_mode=WALagainst the lockedkanban.db. The dispatcher reconnects each cycle, filling/tmp/gateway.logwithsqlite3.OperationalError: attempt to write a readonly databasetracebacks while the health endpoint stays 200, so the breakage is silent. This change exports the upstream-supportedHERMES_KANBAN_DISPATCH_IN_GATEWAY=0env override when the config root is locked, cleanly suspending the dispatcher and notifier loops without touching upstream Hermes.Related Issue
Fixes #4299
Changes
agents/hermes/start.sh: newapply_shields_up_runtime_envhelper that detects the shields-up posture via the existinghermes_config_root_is_lockedprobe and exportsHERMES_KANBAN_DISPATCH_IN_GATEWAY=0if the caller has not set it. Called in both the non-root and root entrypoint paths, after config-integrity verification and before gateway launch so the gateway child inherits it.test/hermes-start.test.ts: three new cases asserting the override fires under shields-up, stays off when shields are down, and preserves a caller-supplied value.Type of Change
Verification
Signed-off-by: Tinson Lai tinsonl@nvidia.com
Summary by CodeRabbit
New Features
Tests