Rescue: add watchdog core service and cron engine #46499
shichangs wants to merge 3 commits into openclaw:main
Superseded by #46502. I rebased the daemon + cron core stack onto the latest main and opened a fresh PR instead of force-pushing over this branch. That keeps the review entry non-conflicting and preserves incremental history.
Greptile Summary
This PR lands the core service layer for the rescue-watchdog feature: it introduces the …
Key findings:
Confidence Score: 3/5
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5dac7762cc
if (raw === "rescuewatchdog") {
  payload.kind = "rescueWatchdog";
  return true;
Avoid marking canonical rescue payload kinds as migrated
When payload.kind is already "rescueWatchdog", this branch still returns true, so normalizeStoredCronJobs treats every load as a mutation and persists the store again. That causes unnecessary rewrites/churn (and inflates legacyPayloadKind issue counts) for any installation that has rescue watchdog jobs, even when the data is already normalized.
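One way to address this review comment is to report a mutation only when the stored value actually changes. The sketch below is illustrative rather than the PR's code: the `CronPayload` type and the `normalizeRescueWatchdogKind` helper name are assumptions, and the real normalizer in `src/cron` may be structured differently.

```typescript
// Hypothetical payload shape for illustration; the real type may differ.
type CronPayload = { kind?: unknown };

const CANONICAL_KIND = "rescueWatchdog";

/**
 * Rewrites legacy spellings of the rescue watchdog kind in place.
 * Returns true only when the payload was actually changed, so callers
 * can skip persisting the store for already-normalized data.
 */
function normalizeRescueWatchdogKind(payload: CronPayload): boolean {
  if (typeof payload.kind !== "string") {
    return false;
  }
  const raw = payload.kind.trim().toLowerCase();
  if (raw !== "rescuewatchdog") {
    return false; // some other payload kind: leave untouched
  }
  if (payload.kind === CANONICAL_KIND) {
    return false; // already canonical: no mutation, no rewrite
  }
  payload.kind = CANONICAL_KIND;
  return true;
}
```

With this shape, a store that only contains canonical `rescueWatchdog` jobs loads without being flagged as migrated.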
Summary
- … rescueWatchdog cron payload + runner, and wires the gateway cron service to execute that watchdog job against a monitored profile.
- … openclaw onboard --rescue-watchdog UX, and no docs changes in this PR.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
- … rescueWatchdog, for watchdog jobs that monitor a target profile and repair it when unhealthy.
- … 0600 plist permissions.
- … AbortSignal, so watchdog restart attempts can be canceled cleanly instead of hanging until command timeouts.
Security Impact (required)
- … (Yes/No): Yes
- … (Yes/No): No
- … (Yes/No): Yes
- … (Yes/No): Yes
- … (Yes/No): No
- … Yes, explain risk + mitigation: The new capability is intentionally narrow: an isolated cron job can probe and repair a monitored local profile. Risk is bounded by restricting the payload shape, disallowing delivery targets for rescueWatchdog, resolving the monitored profile explicitly, and hardening the underlying managed-service restart and launchd plist write paths. The repair fallback uses exact argv via process.execPath and avoids shell-based PATH resolution.
Repro + Verification
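The exact-argv fallback and the AbortSignal cancellation described under Security Impact can be combined in one call. The following is a minimal sketch, assuming Node's `execFile` is an acceptable stand-in for the project's exec layer in `src/process/exec.ts`; `runNodeExact` is a hypothetical helper name, not something taken from the PR.

```typescript
import { execFile } from "node:child_process";

/**
 * Hypothetical helper: runs the current Node binary with an exact argv.
 * No shell is involved, so nothing is resolved through PATH, and an
 * AbortSignal can cancel the child instead of waiting for the timeout.
 */
function runNodeExact(
  args: readonly string[],
  opts: { signal?: AbortSignal; timeoutMs?: number } = {},
): Promise<{ stdout: string; stderr: string }> {
  return new Promise((resolve, reject) => {
    execFile(
      process.execPath, // absolute path of the running Node binary
      [...args],
      {
        encoding: "utf8",
        shell: false,
        signal: opts.signal,
        timeout: opts.timeoutMs ?? 30_000,
      },
      (error, stdout, stderr) => {
        if (error) {
          reject(error); // an aborted run rejects with an AbortError
          return;
        }
        resolve({ stdout, stderr });
      },
    );
  });
}
```

A caller would pass the CLI entry script and its arguments as separate array elements, and keep an `AbortController` around so a restart attempt can be canceled by calling `controller.abort()`.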
Environment
Steps
Expected
- … rescueWatchdog cron job can be normalized, stored, validated, and executed as an isolated job.
Actual
Evidence
Human Verification (required)
What you personally verified (not just CI), and how:
- … HOME override, rejects symlink targets, and writes plist files with 0600 permissions via temp-file + rename.
- … AbortSignal now propagates through launchd/systemd/schtasks restart helpers into the process exec layer.
- … rescueWatchdog cron payloads normalize correctly, are forced to isolated session targets, and reject delivery/failureDestination config.
- … doctor --repair --non-interactive, and reports success/error summaries correctly.
- … os.userInfo() failure falls back to os.homedir(), and unresolved trusted home fails closed.
- … launchd temp plist files are cleaned up if rename fails.
Review Conversations
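The plist write behavior listed under Human Verification (symlink rejection, 0600 permissions, temp-file + rename, cleanup on failure) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `writeFileAtomic0600` is a hypothetical name, the demo path is a scratch directory, and the mode check is only meaningful on POSIX systems.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/** Hypothetical helper: atomically replace a file with owner-only permissions. */
function writeFileAtomic0600(targetPath: string, contents: string): void {
  // Refuse to write through an existing symlink at the target path.
  const existing = fs.lstatSync(targetPath, { throwIfNoEntry: false });
  if (existing?.isSymbolicLink()) {
    throw new Error(`refusing to write through symlink: ${targetPath}`);
  }
  // The temp file lives in the same directory so rename stays on one filesystem.
  const tmpPath = path.join(
    path.dirname(targetPath),
    `.${path.basename(targetPath)}.${process.pid}.${Date.now()}.tmp`,
  );
  try {
    // "wx" fails if the temp path already exists instead of overwriting it.
    fs.writeFileSync(tmpPath, contents, { mode: 0o600, flag: "wx" });
    fs.chmodSync(tmpPath, 0o600); // make the mode explicit regardless of umask
    fs.renameSync(tmpPath, targetPath);
  } catch (err) {
    fs.rmSync(tmpPath, { force: true }); // clean up the temp file on any failure
    throw err;
  }
}

// Usage: write into a scratch directory rather than a real LaunchAgents path.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "plist-demo-"));
const demoTarget = path.join(demoDir, "example.plist");
writeFileAtomic0600(demoTarget, "<plist/>\n");
```

Writing to a sibling temp file and renaming means readers see either the old plist or the complete new one, never a partial write.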
Compatibility / Migration
- … (Yes/No): Yes
- … (Yes/No): No
- … (Yes/No): No
Failure Recovery (if this breaks)
- … src/daemon/*, src/process/exec.ts, src/cron/*, src/rescue/watchdog-shared.ts, src/gateway/server-cron.ts
- … agentTurn jobs after the new payload kind was added.
Risks and Mitigations
- … rescueWatchdog is explicitly restricted to isolated jobs, delivery is rejected, and schema/service/normalization paths have dedicated tests.
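The restriction described in this mitigation could be enforced by a validation step along these lines. This is only a sketch: the `CronJobInput` shape, the `sessionTarget` and `delivery` field names, the `"isolated"` literal, and the `validateRescueWatchdogJob` helper are assumptions for illustration. Only `rescueWatchdog` and `failureDestination` are names taken from the PR text.

```typescript
// Hypothetical job shape; the real schema in src/cron may differ.
type CronJobInput = {
  sessionTarget?: string;
  payload: { kind: string };
  delivery?: unknown;
  failureDestination?: unknown;
};

/** Returns a list of validation errors; an empty list means the job is acceptable. */
function validateRescueWatchdogJob(job: CronJobInput): string[] {
  const errors: string[] = [];
  if (job.payload.kind !== "rescueWatchdog") {
    return errors; // other payload kinds are validated elsewhere
  }
  if (job.sessionTarget !== "isolated") {
    errors.push("rescueWatchdog jobs must use an isolated session target");
  }
  if (job.delivery !== undefined) {
    errors.push("rescueWatchdog jobs must not configure delivery");
  }
  if (job.failureDestination !== undefined) {
    errors.push("rescueWatchdog jobs must not configure failureDestination");
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a caller report every violation at once.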