fix(daemon): hand off systemd self-restarts#66735
fix(daemon): hand off systemd self-restarts#66735alexlomt wants to merge 2 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR fixes a Confidence Score: 5/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/daemon/systemd.ts
Line: 125-135
Comment:
**Unbounded PID-wait loop in handoff script**
The `while kill -0 "$wait_pid"` loop has no timeout. If the gateway process ignores `SIGTERM` and never exits, the transient handoff unit will spin indefinitely, and the target service will never restart. Consider adding a deadline (e.g. 30 s) after which the loop falls through or bails with a non-zero exit so the failure is surfaced.
```suggestion
function buildSystemdRestartHandoffScript(): string {
return `wait_pid="$1"
target_unit="$2"
deadline=$(( $(date +%s) + 30 ))
if [ -n "$wait_pid" ] && [ "$wait_pid" -gt 1 ] 2>/dev/null; then
while kill -0 "$wait_pid" >/dev/null 2>&1; do
if [ "$(date +%s)" -ge "$deadline" ]; then
echo "openclaw-systemd-restart-handoff: timed out waiting for pid $wait_pid to exit" >&2
exit 1
fi
sleep 0.1
done
fi
exec systemctl --user restart "$target_unit"
`;
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/daemon/systemd.ts
Line: 80-90
Comment:
**Unnecessary `undefined` in parameter type**
`cgroupPath` is typed as `string | undefined`, but every call site passes a `string` (the elements of `string[]` from `readCurrentProcessCgroupPaths`). The `| undefined` branch is dead code. Narrowing the type keeps the signature honest.
```suggestion
function matchesSystemdCgroupUnit(cgroupPath: string, unitName: string): boolean {
const normalizedPath = normalizeOptionalString(cgroupPath);
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(daemon): hand off systemd self-resta..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d78d9eb6ba
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Pushed f852688b9f with the follow-up fixes. What changed:
Validation:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f852688b9f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Unrelated |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 232b40df87
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Follow-up added in
Local verification for the follow-up:
|
dd5ac56 to
cf56098
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cf560980da
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
cf56098 to
f6be80b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6be80b9b6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
f6be80b to
d43dff0
Compare
|
High-value ops fix for Linux/systemd installs: restarting from inside the managed unit can SIGTERM the caller because of |
d43dff0 to
7a191b5
Compare
|
ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical. Source PR: #66735 |
7a191b5 to
9eeb526
Compare
|
Codex review: needs maintainer review before merge. What this changes: The PR adds Linux/systemd self-restart detection, a detached Maintainer follow-up before merge: This is a focused contributor implementation with maintainer follow-up, but the branch is dirty/unmergeable and the remaining work is maintainer rebase/review plus live systemd validation rather than a separate autonomous replacement fix. Security review: Security review cleared: No concrete security or supply-chain regression was found; the new command execution path is limited to local Linux/user-systemd self-restarts and passes dynamic values as argv plus quoted shell variables. Review detailsBest possible solution: Rebase this implementation onto current main, preserve the split between synchronous external restarts and detached in-unit self-restarts, keep the marker-based SIGTERM path bounded by the systemd stop budget, then land after targeted daemon/CLI tests, changed-gate validation, and a disposable user-systemd smoke for both external and self-restart paths. Do we have a high-confidence way to reproduce the issue? Yes. A high-confidence source-level reproduction is to invoke a gateway restart from inside the generated Is this the best way to solve the issue? Yes as a direction, pending rebase and smoke. The proposed fix is narrower than changing unit kill semantics: it only hands off same-unit systemd restarts to a transient user unit, keeps external restarts synchronous, and adds a marker so the expected restart SIGTERM is not treated as a normal stop. Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against a256745323ab. |
Summary
On Linux/systemd, restarting the gateway from inside the running gateway service can kill the calling CLI process with
SIGTERMeven though the service restart succeeds.This happens because the restart command runs inside the same
openclaw-gateway.servicecgroup, and the unit usesKillMode=control-group. A plainsystemctl --user restart openclaw-gateway.servicetherefore kills the gateway and the caller together.This change makes the Linux systemd adapter treat self-restarts differently from external restarts:
systemd-run --userhandoff and returnscheduledcompletedWhy this is safe
Implementation
systemd-run --usersystemctl --user restart <unit>from the detached handoffValidation performed locally
corepack pnpm exec oxfmt --check src/daemon/systemd.ts src/daemon/systemd.test.ts src/cli/gateway-cli/run-loop.ts src/cli/gateway-cli/run-loop.test.ts src/agents/subagent-registry.steer-restart.test.ts src/gateway/server.sessions.gateway-server-sessions-a.test.tscorepack pnpm exec vitest run src/daemon/systemd.test.ts src/cli/gateway-cli/run-loop.test.ts --reporter=verbosecorepack pnpm exec vitest run --config test/vitest/vitest.agents.config.ts src/agents/subagent-registry.steer-restart.test.ts --reporter=verbosecorepack pnpm exec vitest run --config test/vitest/vitest.gateway-server.config.ts src/gateway/server.sessions.gateway-server-sessions-a.test.ts --reporter=verboseNotes
This follows the same scheduled-restart contract already used elsewhere in the daemon lifecycle, rather than changing systemd unit kill behaviour.
Review Focus
src/daemon/systemd.tssrc/cli/gateway-cli/run-loop.tssrc/daemon/systemd.test.tssrc/cli/gateway-cli/run-loop.test.tsScope Boundary
Security Impact
Yes, explain risk + mitigation: self-restarts inside the managed unit now schedule a detachedsystemd-run --userhandoff only for the in-unit restart path. External restarts are unchanged, marker parsing is validated strictly, and tests cover malformed markers, custom unit names, and bounded restart draining.Human Verification