Skip to content

fix(daemon): preserve explicit systemd unit during refresh#87556

Merged
steipete merged 1 commit into
openclaw:mainfrom
TurboTheTurtle:fix/update-systemd-scope-87490
May 28, 2026
Merged

fix(daemon): preserve explicit systemd unit during refresh#87556
steipete merged 1 commit into
openclaw:mainfrom
TurboTheTurtle:fix/update-systemd-scope-87490

Conversation

@TurboTheTurtle

@TurboTheTurtle TurboTheTurtle commented May 28, 2026

Copy link
Copy Markdown
Contributor

Summary

  • preserve explicit OPENCLAW_SYSTEMD_UNIT values when rendering the managed gateway service environment
  • keep caller-selected service identity env (OPENCLAW_SYSTEMD_UNIT, launchd label, Windows task name) ahead of stale persisted service env during service-state inspection
  • add regression coverage for explicit systemd unit preservation and stale persisted-unit merge behavior

Fixes #87490

Verification

  • node scripts/run-vitest.mjs src/daemon/service-env.test.ts src/daemon/service.test.ts src/cli/update-cli.test.ts src/cli/update-cli/restart-helper.test.ts src/cli/daemon-cli/install.test.ts src/daemon/systemd.test.ts (6 files / 347 tests passed)
  • git diff --check
  • git log --format='%h %an <%ae> %s' upstream/main..HEAD

Real behavior proof

Behavior addressed: package/update refresh can keep using the explicitly selected systemd unit instead of drifting back to openclaw-gateway.service from a stale persisted service environment.
Real environment tested: local source checkout running the real OpenClaw TypeScript service-env/state code with Node 24 on macOS; the probe used Linux service-env rendering inputs for the systemd unit behavior.
Exact steps or command run after this patch: ran node --import tsx --input-type=module to render a Linux gateway service env with OPENCLAW_SYSTEMD_UNIT=openclaw-gateway-maintenance.service, then read gateway service state from an existing unit command whose persisted env still contained OPENCLAW_SYSTEMD_UNIT=openclaw-gateway.service.
Evidence after fix: terminal output from the behavior probe was:

{
  "renderedSystemdUnit": "openclaw-gateway-maintenance.service",
  "mergedSystemdUnit": "openclaw-gateway-maintenance.service",
  "runtimeSystemdUnit": "openclaw-gateway-maintenance.service"
}

Observed result after fix: both rendered service env and post-inspection runtime env retained the selected openclaw-gateway-maintenance.service unit despite the stale persisted default unit.
What was not tested: a live Linux systemctl --user package-update smoke, because Blacksmith/Testbox required blacksmith auth login and local Docker was unavailable in this environment.

If maintainers squash or rework this PR, please preserve author attribution or include:

Co-authored-by: Andy Ye 35905412+TurboTheTurtle@users.noreply.github.com

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 28, 2026
@clawsweeper

clawsweeper Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 28, 2026, 3:18 AM ET / 07:18 UTC.

Summary
The PR makes explicit gateway service identity environment values outrank stale persisted service env during service-env rendering and service-state inspection, with focused regression tests.

PR surface: Source +19, Tests +53. Total +72 across 4 files.

Reproducibility: yes. from source inspection, but not as a live Linux reproduction in this review. Current main derives the rendered systemd unit from profile and lets persisted command env overwrite caller env, matching the stale-unit failure shape.

Review metrics: 1 noteworthy metric.

  • Service Identity Precedence: 3 env keys changed. Caller-selected systemd, launchd, and Windows task identities now outrank persisted service env, which matters for upgrade and restart targeting.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Run a Linux user-systemd package/update refresh smoke before merge if maintainers want end-to-end upgrade proof.

Risk before merge

  • [P1] No live Linux systemctl package-update smoke was run, so the remaining upgrade risk is whether the selected unit survives the real packaged refresh path end to end.
  • [P1] The merge precedence change covers launchd and Windows task identity keys too; the logic is symmetric, but the supplied real behavior proof demonstrates the systemd case.

Maintainer options:

  1. Add Linux Service-Refresh Proof (recommended)
    Before merge, run a Linux user-systemd package/update refresh smoke with an explicit unit and verify the refreshed gateway still targets that unit.
  2. Accept Source-Level Proof
    Maintainers can land the small precedence fix with the supplied TypeScript runtime probe and focused tests, accepting that live systemctl coverage remains absent.

Next step before merge

  • No automated repair is needed; maintainer review should decide whether the supplied runtime probe is enough or whether to add live Linux systemd update proof before merge.

Security
Cleared: The diff changes daemon environment precedence and tests only; it does not add dependencies, workflows, secret handling, downloads, or package-resolution changes.

Review details

Best possible solution:

Land the narrow precedence fix after maintainers either accept the supplied runtime probe or add a Linux user-systemd update-refresh smoke, then let the linked issue close through the merged PR.

Do we have a high-confidence way to reproduce the issue?

Yes from source inspection, but not as a live Linux reproduction in this review. Current main derives the rendered systemd unit from profile and lets persisted command env overwrite caller env, matching the stale-unit failure shape.

Is this the best way to solve the issue?

Yes, the PR fixes the narrow ownership boundary: keep explicit service identity env at render and state-merge time. The safer pre-merge addition is live Linux update-refresh proof, not a broader refactor.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 05f357b13bb3.

Label changes

Label changes:

  • add P2: This is a normal-priority gateway update reliability fix with limited touched surface and a reboot workaround in the linked report.
  • add merge-risk: 🚨 compatibility: Changing service identity precedence can alter which managed service an update or restart targets when existing environments contain explicit unit, label, or task-name values.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal output from a real TypeScript runtime probe showing the selected systemd unit remains selected after rendering and state merge.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal output from a real TypeScript runtime probe showing the selected systemd unit remains selected after rendering and state merge.

Label justifications:

  • P2: This is a normal-priority gateway update reliability fix with limited touched surface and a reboot workaround in the linked report.
  • merge-risk: 🚨 compatibility: Changing service identity precedence can alter which managed service an update or restart targets when existing environments contain explicit unit, label, or task-name values.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal output from a real TypeScript runtime probe showing the selected systemd unit remains selected after rendering and state merge.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal output from a real TypeScript runtime probe showing the selected systemd unit remains selected after rendering and state merge.
Evidence reviewed

PR surface:

Source +19, Tests +53. Total +72 across 4 files.

View PR surface stats
Area Files Added Removed Net
Source 2 21 2 +19
Tests 2 53 0 +53
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 4 74 2 +72

What I checked:

  • Current main ignores explicit systemd unit while rendering service env: On current main, buildServiceEnvironment always derives OPENCLAW_SYSTEMD_UNIT from OPENCLAW_PROFILE, so an explicit OPENCLAW_SYSTEMD_UNIT is not preserved at render time. (src/daemon/service-env.ts:414, 05f357b13bb3)
  • Current main lets persisted command env replace caller env: mergeGatewayServiceEnv currently spreads command.environment after baseEnv, which allows stale persisted service identity values to replace the caller-selected environment. (src/daemon/service.ts:94, 05f357b13bb3)
  • Systemd helpers already honor OPENCLAW_SYSTEMD_UNIT: resolveSystemdServiceName reads OPENCLAW_SYSTEMD_UNIT before falling back to the profile-derived unit, so preserving that env value is on the correct daemon boundary. (src/daemon/systemd.ts:65, 05f357b13bb3)
  • PR diff is narrow and on-point: The diff changes only service-env/service merge behavior and tests, adding preservation of explicit service identity keys without dependency, workflow, or package changes. (src/daemon/service.ts:91, 079035b1dca6)
  • Contributor supplied real behavior proof: The PR body includes a Node 24 TypeScript runtime probe showing renderedSystemdUnit, mergedSystemdUnit, and runtimeSystemdUnit all retain openclaw-gateway-maintenance.service; it also states live Linux systemctl update smoke was not run. (079035b1dca6)
  • Related issue remains open and paired: The linked Linux update-refresh issue is still open, so this PR is the active implementation candidate rather than a reason to close either side before merge.

Likely related people:

  • vincentkoc: GitHub history ties Vincent Koc to the package update service-env refresh path and later systemd service-scope clarification that this PR exercises. (role: introduced behavior and recent adjacent contributor; confidence: high; commits: 45d9b2069264, 97d35f4c575b; files: src/cli/update-cli/update-command.ts, src/daemon/service.ts)
  • steipete: Recent commit history shows repeated work around daemon service state, stale gateway repair, and managed service env migration. (role: recent area contributor; confidence: medium; commits: 67f1266fe8a2, 3b1a020ebaec, 77d9ac30bb8d; files: src/daemon/service.ts, src/daemon/systemd.ts, src/cli/update-cli/update-command.ts)
  • taw0002: Commit search links taw0002 to earlier launchd/systemd managed-service detection that overlaps the service identity environment surface. (role: feature-history contributor; confidence: medium; commits: 792ce7b5b456; files: src/daemon/service-env.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@TurboTheTurtle TurboTheTurtle force-pushed the fix/update-systemd-scope-87490 branch from 828c553 to 079035b Compare May 28, 2026 07:11
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. labels May 28, 2026
@clawsweeper

clawsweeper Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg: ✨ hatched 🥚 common Clockwork Test Hopper. Rarity: 🥚 common. Trait: hums during re-review.

Details

Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Clockwork Test Hopper in ClawSweeper.
Hatchability:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

About:

  • Eggs appear after real-behavior proof passes. They are collectible flavor only.
  • Review momentum changes the shell state: follow-up work warms it, re-review makes it wobble, and a clean final review lets it hatch.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@BingqingLyu

This comment was marked as spam.

@steipete steipete self-assigned this May 28, 2026
@steipete

Copy link
Copy Markdown
Contributor

Maintainer verification before merge:

Behavior addressed: update/service refresh preserves the explicit systemd unit instead of drifting to a stale persisted default unit.
Real environment tested: AWS Crabbox Linux VM with systemd user manager running, Node 24.16.0, pnpm 11.2.2.
Exact steps or command run after this patch:

  • node scripts/run-vitest.mjs src/daemon/service-env.test.ts src/daemon/service.test.ts src/cli/update-cli.test.ts src/cli/update-cli/restart-helper.test.ts src/cli/daemon-cli/install.test.ts src/daemon/systemd.test.ts
  • git diff --check origin/main...pr/87556
  • Crabbox direct AWS run run_f3374bd610f7, lease cbx_754e69eb6c3a, provider aws, target linux, machine c7a.8xlarge: installed a unique explicit systemd user unit with pnpm openclaw gateway install --force --json, rewrote that unit's persisted OPENCLAW_SYSTEMD_UNIT to stale openclaw-gateway.service, then reran pnpm openclaw gateway install --force --json with the explicit unit selected.
    Evidence after fix: the first install wrote Environment=OPENCLAW_SYSTEMD_UNIT=openclaw-gateway-pr87556-1779978394-4723.service; after seeding stale persisted env and refreshing, the unit again contained Environment=OPENCLAW_SYSTEMD_UNIT=openclaw-gateway-pr87556-1779978394-4723.service, remained enabled, and no default unit was written from the stale env.
    Observed result after fix: explicit systemd unit preserved across stale persisted env refresh.
    What was not tested: full CI rerun; this was focused unit coverage plus real Linux systemd install/refresh proof.

Autoreview: /Users/steipete/Projects/agent-skills/skills/autoreview/scripts/autoreview --mode branch --base origin/main returned clean with no accepted/actionable findings.

@steipete steipete left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified with focused tests, autoreview, and direct AWS Crabbox Linux systemd install/refresh proof. No blocking findings.

@steipete steipete merged commit 3fea219 into openclaw:main May 28, 2026
90 of 95 checks passed
@TurboTheTurtle TurboTheTurtle deleted the fix/update-systemd-scope-87490 branch May 28, 2026 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gateway Gateway runtime merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

openclaw update fails to refresh gateway service environment when systemd unit scope is missing/mismatched

3 participants