fix(desktop): recover from stale openclaw launchd state #734
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f278473bf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Deploying nexu-docs with
| Latest commit: | 92786e4 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://faaff29f.nexu-docs.pages.dev |
| Branch Preview URL: | https://fix-desktop-openclaw-stale-l.nexu-docs.pages.dev |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 92786e4766
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b817eb2f70
lefarcen left a comment
Review
The partial launchd state detection (controllerRunning !== openclawRunning → force cold start) is the right fix for the reported issue. A few things to address:
P0: Process patterns narrowed too far — dev mode orphan cleanup regressed
The old patterns matched both dev and packaged mode:
"node.*controller/dist/index.js",
"node.*openclaw.mjs gateway",
"openclaw-gateway",
The new patterns only match ~/.nexu/runtime/ (packaged) paths:
"\\.nexu/runtime/controller-sidecar/dist/index\\.js",
"\\.nexu/(runtime/)?openclaw-sidecar",
This means killOrphanNexuProcesses() and ensureNexuProcessesDead() can no longer detect dev-mode orphans via the pgrep fallback path. The safety net in teardownLaunchdServices() is also affected.
Suggest keeping both dev and packaged patterns:
const NEXU_PROCESS_PATTERNS = [
  "\\.nexu/runtime/controller-sidecar/dist/index\\.js", // packaged
  "\\.nexu/(runtime/)?openclaw-sidecar", // packaged
  "node.*controller/dist/index\\.js", // dev
  "node.*openclaw\\.mjs gateway", // dev
] as const;
P1: Partial state bootout doesn't wait for process exit
The new partial-state cleanup calls bootoutService() then immediately runs killOrphanOpenclawProcesses():
await Promise.allSettled([
  controllerRunning ? launchd.bootoutService(labels.controller) : Promise.resolve(),
  openclawRunning ? launchd.bootoutService(labels.openclaw) : Promise.resolve(),
]);
await killOrphanOpenclawProcesses({ ... });
bootoutService() sends launchctl bootout but doesn't wait for the process to actually exit. The orphan kill and subsequent cold start may race against the still-exiting process. The same gap exists in the stale session (line 467) and version mismatch (line 530) bootout paths.
Consider using bootoutAndWaitForExit here, or at minimum adding a short delay.
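To illustrate the "wait for exit" idea, here is a minimal polling sketch. The `waitForExit` helper, its parameters, and the injected `isAlive` probe are hypothetical and are not taken from the PR; the repo's `bootoutAndWaitForExit` may work differently.

```typescript
// Hypothetical helper, not the PR's bootoutAndWaitForExit.
// Polls an injected liveness probe until it reports the process is gone
// or the timeout elapses. Resolves true on exit, false on timeout.
async function waitForExit(
  isAlive: () => boolean,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (isAlive()) {
    if (Date.now() >= deadline) {
      return false; // still alive when the deadline passed
    }
    // Sleep between probes so the loop does not spin.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return true;
}
```

In Node, the probe could wrap `process.kill(pid, 0)` in a try/catch, since signal 0 only tests whether the process exists. The cold start would then proceed only after `waitForExit` resolves, rather than after a fixed delay.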
Minor: Scenario 27 test doesn't assert orphan kill
The test verifies bootout and runtime-ports.json deletion, but doesn't assert that process.kill(77777, "SIGKILL") was called. Since killing the stale orphan is the core fix, this assertion would strengthen the test.
Minor: NEXU_MANAGED_OPENCLAW_PATH_PATTERNS duplicates entry from NEXU_PROCESS_PATTERNS
The single entry "\\.nexu/(runtime/)?openclaw-sidecar" is identical in both arrays. Could extract a shared constant or filter from NEXU_PROCESS_PATTERNS.
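A rough sketch of the shared-constant idea, combined with the dev-plus-packaged pattern list suggested under P0. The pattern strings are copied from this review; `OPENCLAW_SIDECAR_PATTERN`, `matchesAny`, and the sample command lines are hypothetical. It also assumes JavaScript `RegExp` matching is a close enough stand-in for the pgrep matching the real code uses.

```typescript
// Pattern strings come from the review. OPENCLAW_SIDECAR_PATTERN and
// matchesAny are hypothetical names; the command lines are invented samples.
const OPENCLAW_SIDECAR_PATTERN = "\\.nexu/(runtime/)?openclaw-sidecar";

const NEXU_MANAGED_OPENCLAW_PATH_PATTERNS = [OPENCLAW_SIDECAR_PATTERN] as const;

const NEXU_PROCESS_PATTERNS = [
  "\\.nexu/runtime/controller-sidecar/dist/index\\.js", // packaged
  OPENCLAW_SIDECAR_PATTERN, // packaged
  "node.*controller/dist/index\\.js", // dev
  "node.*openclaw\\.mjs gateway", // dev
] as const;

// Approximates pattern matching against a full command line with RegExp.
function matchesAny(commandLine: string, patterns: readonly string[]): boolean {
  return patterns.some((pattern) => new RegExp(pattern).test(commandLine));
}

// Invented sample command lines for a dev checkout and a packaged install.
const devGateway = "node /Users/dev/nexu/openclaw.mjs gateway";
const packagedController =
  "node /Users/dev/.nexu/runtime/controller-sidecar/dist/index.js";

console.log(matchesAny(devGateway, NEXU_PROCESS_PATTERNS));
console.log(matchesAny(packagedController, NEXU_PROCESS_PATTERNS));
console.log(matchesAny(devGateway, NEXU_MANAGED_OPENCLAW_PATH_PATTERNS));
```

With a single shared constant, the two arrays cannot drift apart if the sidecar path pattern changes later.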
Everything else (net mock refactor, scenario renumbering, findProcessPidsByPatterns extraction) looks good.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cb8d485245
@lefarcen Addressed your review items:
I re-ran the relevant validation set:
The review comments from this round should now be fully addressed.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1abb4d7907
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8a953d2c27
What
Make launchd startup recover from stale OpenClaw partial-state leftovers instead of attaching to a broken packaged runtime.
Why
Packaged desktop could fail to start when controller remained launchd-managed but OpenClaw was left behind as stale launchd/orphan state. In that case the app attached to inconsistent runtime metadata until users manually booted out the job and killed lingering OpenClaw processes.
How
openclaw.mjs, openclaw-gateway, and extracted sidecar paths are cleaned up proactively
Affected areas
Checklist
- pnpm typecheck passes
- pnpm lint passes
- pnpm test passes
- pnpm generate-types run (if API routes/schemas changed)
- No any types introduced (use unknown with narrowing)
Notes for reviewers
The key path to review is the launchd attach decision in apps/desktop/main/services/launchd-bootstrap.ts, especially the new partial-state cleanup before recovered ports are reused.
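
As a reading aid, the attach decision can be sketched as a small pure function. Only the `controllerRunning !== openclawRunning` comparison comes from the review; `LaunchdState`, `StartupMode`, and `chooseStartupMode` are hypothetical names. The sketch ignores the other checks the real code performs (stale session, version mismatch).

```typescript
// Hypothetical names throughout; only the controllerRunning/openclawRunning
// comparison is taken from the review discussion.
type StartupMode = "attach" | "cold-start";

interface LaunchdState {
  controllerRunning: boolean;
  openclawRunning: boolean;
}

function chooseStartupMode(state: LaunchdState): StartupMode {
  const { controllerRunning, openclawRunning } = state;
  // Exactly one service running is a partial launchd state: the recovered
  // runtime metadata cannot be trusted, so force a cold start.
  if (controllerRunning !== openclawRunning) {
    return "cold-start";
  }
  // Both running: attach. Neither running: nothing to attach to.
  return controllerRunning ? "attach" : "cold-start";
}
```

In this sketch, the partial-state cleanup (bootout, orphan kill, removal of recovered ports) would run on the `"cold-start"` branch before any services are started again.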