daemon/systemd: distinguish WSL user D-Bus socket missing from missing systemctl #68400
lyfuci wants to merge 2 commits into openclaw:main
Greptile Summary
This PR fixes a misclassification bug in …
Confidence Score: 5/5
Safe to merge: targeted two-file logic fix with full regression test coverage for both the classifier ordering and the new WSL2-specific error branch. The fix is minimal and correct: reordering the classifier checks resolves a substring collision, and …
No files require special attention.
Reviews (2): Last reviewed commit: "daemon/systemd test: cover WSL2 D-Bus un..."
Thanks @greptile-apps — addressed both points in
Address Greptile P2 review on openclaw#68400: the previous test only exercised the non-WSL branch of assertSystemdAvailable, so the WSL2-specific hint (the "systemd user D-Bus unavailable on WSL2" message pointing at /etc/wsl.conf + wsl --shutdown) was not locked in by CI. Mock isWSLSync() through a hoisted mock so the test can force WSL detection on any platform, add an additional stopSystemdService case that asserts the WSL2-specific prefix, and reset the mock in the existing non-WSL regression case so both branches stay covered. Refs openclaw#68380
…g systemctl

When the systemd user D-Bus socket is missing (for example on WSL2 without systemd boot, or on headless hosts that never ran loginctl enable-linger), systemctl reports `Failed to connect to bus: No such file or directory`. The shared detail classifier used to match the generic "no such file or directory" substring before the user-bus matcher, so that failure got classified as missing-systemctl and `openclaw gateway start` surfaced the misleading message `systemctl not available; systemd user services are required on Linux.` on hosts where systemctl is actually installed and reachable.

Reorder `classifySystemdUnavailableDetail` so the user-bus check wins over the missing-systemctl heuristic, update `assertSystemdAvailable` to branch on the classification (emitting a WSL2-specific hint that points at `[boot] systemd=true` in `/etc/wsl.conf` plus `wsl --shutdown` when running under WSL), and add regression tests so the WSL-style bus error no longer collapses into the missing-systemctl branch.

Refs openclaw#68380
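The ordering problem described in this commit message can be illustrated with a minimal sketch. It is not the code in `src/daemon/systemd-unavailable.ts`: the matcher bodies, the `null` fallback, and the helper name `classifyWithOldOrdering` are assumptions made for illustration; only the identifiers `classifySystemdUnavailableDetail`, `isSystemctlMissingDetail`, `isSystemdUserBusUnavailableDetail`, the two classification names, and the error strings come from the PR text.

```typescript
// Illustrative sketch only. Matcher bodies and the null fallback are assumptions.
type SystemdUnavailableKind = "user_bus_unavailable" | "missing_systemctl";

// Positive identifier: only user-bus failures mention the bus connection.
function isSystemdUserBusUnavailableDetail(detail: string): boolean {
  return detail.toLowerCase().includes("failed to connect to bus");
}

// Loose heuristic: "no such file or directory" also appears in the WSL2 bus error.
function isSystemctlMissingDetail(detail: string): boolean {
  const normalized = detail.toLowerCase();
  return normalized.includes("enoent") || normalized.includes("no such file or directory");
}

// Hypothetical reconstruction of the old ordering: the loose heuristic runs first.
function classifyWithOldOrdering(detail: string): SystemdUnavailableKind | null {
  if (isSystemctlMissingDetail(detail)) return "missing_systemctl";
  if (isSystemdUserBusUnavailableDetail(detail)) return "user_bus_unavailable";
  return null;
}

// Ordering after the fix: the specific user-bus matcher wins.
function classifySystemdUnavailableDetail(detail: string): SystemdUnavailableKind | null {
  if (isSystemdUserBusUnavailableDetail(detail)) return "user_bus_unavailable";
  if (isSystemctlMissingDetail(detail)) return "missing_systemctl";
  return null;
}

const wslBusError = "Failed to connect to bus: No such file or directory";
console.log(classifyWithOldOrdering(wslBusError)); // missing_systemctl
console.log(classifySystemdUnavailableDetail(wslBusError)); // user_bus_unavailable
console.log(classifySystemdUnavailableDetail("spawn systemctl ENOENT")); // missing_systemctl
```

Putting the more specific matcher first keeps the genuinely missing-binary case (`spawn systemctl ENOENT`) classified as before, since that string never mentions the bus.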
Codex review: needs changes before merge.
Summary
Reproducibility: yes. A source-level reproduction is high-confidence: on current main, …
Next step before merge
Security
Review findings
Review details
Best possible solution: The maintainable end state is classifier precedence for user-bus failures, classifier-based systemd availability messaging, regression coverage for both WSL and non-WSL branches, and a credited changelog entry.
Do we have a high-confidence way to reproduce the issue? Yes. A source-level reproduction is high-confidence: on current main, …
Is this the best way to solve the issue? No, not as-is. The daemon code approach is the narrow fix, but the PR should add the missing changelog credit, rebase if needed, and rerun the targeted daemon checks before landing.
Full review comments:
Overall correctness: patch is incorrect
Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 0c6c1cac7691.
Codex review: found issues before merge.
What this changes: The branch reorders systemd unavailable classification, routes systemd availability failures through the shared classifier with WSL2-specific D-Bus guidance, adds daemon regression tests, and adds a changelog entry.
Maintainer follow-up before merge: This is an open contributor implementation PR that appears to address a real current-main bug; the next action is maintainer review, rebase, validation, and changelog credit cleanup rather than an automated replacement PR.
Review findings:
Review details
Best possible solution: Land a narrow daemon/systemd fix that makes the user-bus classifier win over the broad missing-systemctl heuristic, makes …
Full review comments:
Overall correctness: patch is incorrect
Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 3215ab6de5db.
Summary
- … `[boot] systemd=true`, `openclaw gateway start` fails with the misleading error `Gateway start failed: Error: systemctl not available; systemd user services are required on Linux.` even when systemctl is installed and working and the gateway itself is already running and reachable. The real failure is that the systemd user D-Bus socket (`$XDG_RUNTIME_DIR/bus`) is missing, so `systemctl --user status` returns `Failed to connect to bus: No such file or directory`.
- … `openclaw gateway install`, `systemctl --user start openclaw-gateway.service`), and still cannot control the gateway because the systemctl commands themselves will also hit the same D-Bus failure. This pushes users to workarounds like manually backgrounding `openclaw gateway &` inside `/etc/wsl.conf`, which conflicts with the systemd user unit and causes upgrade/restart confusion.
- … `classifySystemdUnavailableDetail` so the user-bus matcher runs before the missing-systemctl matcher (the substring `no such file or directory` is shared by both failure modes, but only the user-bus matcher is a positive identifier). Updated the internal `assertSystemdAvailable` helper in `src/daemon/systemd.ts` to branch on the classification rather than going through the older `isSystemctlMissing` shortcut, and to emit a WSL2-specific error message (with a pointer to `[boot] systemd=true` in `/etc/wsl.conf` plus `wsl --shutdown`) when the bus is unavailable and `isWSLSync()` reports WSL.
- … `isSystemdUserServiceAvailable` behavior (it still returns false for both kinds of unavailability), no changes to the machine-user-scope fallback in sudo paths, no changes to documented config, and no env/config surface additions. The `systemctl not available; systemd user services are required on Linux.` message is still emitted for the genuinely missing-binary case.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
- `classifySystemdUnavailableDetail` (`src/daemon/systemd-unavailable.ts`) checked `isSystemctlMissingDetail` first, which matches the loose substring `no such file or directory`. WSL2's user-D-Bus error string is `Failed to connect to bus: No such file or directory`, so it hit the missing-systemctl branch despite also matching the more specific `failed to connect to bus` substring in `isSystemdUserBusUnavailableDetail`.
- `assertSystemdAvailable` called `isSystemctlMissing` directly instead of the shared classifier, so the ordering fix alone in the classifier would still have left the error message path branched on the older detector.
- … `[boot] systemd=true`, `DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus` is often set but `/run/user/1000/bus` is missing during early boot unless the user's D-Bus socket is explicitly started (for example via a `~/.config/systemd/user/default.target.wants/dbus.socket` symlink or `loginctl enable-linger`).
Regression Test Plan (if applicable)
- `src/daemon/systemd-unavailable.test.ts`: asserts the classifier returns `user_bus_unavailable` for `Failed to connect to bus: No such file or directory` (previously returned `missing_systemctl`).
- `src/daemon/systemd.test.ts`: asserts that `stopSystemdService` (the nearest integration-style caller of `assertSystemdAvailable` in unit coverage) surfaces the bus detail in the thrown error and does not emit the `systemctl not available; systemd user services are required` string.
- `Failed to connect to bus: No such file or directory` returned from `systemctl --user status` must classify as bus-unavailable and must not surface a missing-systemctl message downstream.
- … `stopSystemdService` integration test exercises the exact error plumbing the original `openclaw gateway start` failure used.
- `systemd-unavailable.test.ts` already covers the `No medium found` bus variant; this PR adds the `No such file or directory` variant (the WSL-specific bus pattern).
Security Impact (required)
Repro + Verification
Environment
- … `openclaw-gateway` user service via `systemctl --user`
- … `systemd=true` in `/etc/wsl.conf`, `DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus` set by login, `/run/user/1000/bus` missing.
Steps
1. … `/run/user/$UID/bus` does not exist despite `systemctl` being installed.
2. … `openclaw gateway start`.
Expected
- … `[boot] systemd=true` in `/etc/wsl.conf`, not a claim that systemctl is unavailable.
Actual (before)
- `Gateway start failed: Error: systemctl not available; systemd user services are required on Linux.` (incorrect; systemctl is installed).
Actual (after)
- ``Gateway start failed: Error: systemd user D-Bus unavailable on WSL2: Failed to connect to bus: No such file or directory. Enable systemd by adding `[boot]\nsystemd=true` to /etc/wsl.conf, then run `wsl --shutdown` from PowerShell and reopen your distro; verify with `systemctl --user status`.`` (on WSL).
- `systemctl --user unavailable: Failed to connect to bus: ...`: accurate and pointing at the right subsystem.
Evidence
Human Verification (required)
- `classifySystemdUnavailableDetail` tests: all 5 pass, including the new WSL variant.
- `src/daemon/systemd.test.ts` suite (48 tests) plus `src/daemon/systemd-unavailable.test.ts` and `src/daemon/systemd-hints.test.ts` (10 tests): all 58 pass.
- `No medium found` bus variant still classifies as `user_bus_unavailable` (existing test).
- `spawn systemctl ENOENT` still classifies as `missing_systemctl` (existing test).
- `Failed to connect to bus` in the `stopSystemdService` path (existing test) keeps its existing error message because the bus detail also surfaces through the generic `systemctl --user unavailable: ...` branch when WSL detection is false.
Compatibility / Migration
- `isSystemdUserServiceAvailable` returns the same boolean for the same inputs; only the failure-branch error message/text changes.
Risks and Mitigations
- … `failed to connect to bus` could now classify as `user_bus_unavailable`.
- `systemctl --user unavailable: <detail>` remains technically correct (the binary is missing and/or the bus is unreachable); no tested case crosses this line.
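To make the three failure messages discussed in this description concrete, here is a rough sketch of how an error message could be chosen from the classification. The helper name `buildSystemdUnavailableMessage` is hypothetical, the real `assertSystemdAvailable` throws rather than returning a string, the `isWSL` parameter stands in for the result of `isWSLSync()`, and the WSL2 hint text is paraphrased rather than copied character for character.

```typescript
// Illustrative sketch only; not the implementation in src/daemon/systemd.ts.
type SystemdUnavailableKind = "user_bus_unavailable" | "missing_systemctl";

// Hypothetical helper: returns the message instead of throwing.
function buildSystemdUnavailableMessage(
  kind: SystemdUnavailableKind,
  detail: string,
  isWSL: boolean, // stand-in for isWSLSync()
): string {
  if (kind === "missing_systemctl") {
    // Unchanged message for the genuinely missing-binary case.
    return "systemctl not available; systemd user services are required on Linux.";
  }
  if (isWSL) {
    // WSL2-specific hint (paraphrased) pointing at /etc/wsl.conf and wsl --shutdown.
    return (
      "systemd user D-Bus unavailable on WSL2: " + detail +
      ". Enable systemd by adding [boot] systemd=true to /etc/wsl.conf, " +
      "then run wsl --shutdown from PowerShell and reopen your distro."
    );
  }
  // Generic bus failure outside WSL keeps the detail visible.
  return "systemctl --user unavailable: " + detail;
}

const detail = "Failed to connect to bus: No such file or directory";
console.log(buildSystemdUnavailableMessage("user_bus_unavailable", detail, true));
console.log(buildSystemdUnavailableMessage("user_bus_unavailable", detail, false));
console.log(buildSystemdUnavailableMessage("missing_systemctl", "spawn systemctl ENOENT", false));
```

Passing the WSL flag in as a parameter is a choice made for this sketch so that both branches can be exercised on any platform; the PR itself achieves the same coverage by mocking `isWSLSync()` in the tests.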