fix(desktop): pass --port to openclaw gateway and detect port conflicts #700
Merged
lefarcen merged 19 commits into release/v0.1.8 on Mar 31, 2026
Users with 'openclaw install' have a global ai.openclaw.gateway launchd service on port 18789 with KeepAlive=true. When Nexu also used 18789, launchd race conditions caused token mismatch or crash loops. Changed default to 50789 (alongside controller:50800 and web:50810). findFreePort() still handles further conflicts.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 31319d09d6
After starting the openclaw launchd service, verify the port listener PID matches our service PID. If a competing service (e.g. global ai.openclaw.gateway with KeepAlive=true) grabbed the port, bootout our openclaw, find a new free port, regenerate both openclaw and controller plists with the new port, and restart both services.
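The recovery flow in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `RecoveryOps` and every function on it are hypothetical stand-ins injected by the caller, not the actual helpers in launchd-bootstrap.ts, and the sketch assumes the service has already had time to bind before the check runs.

```typescript
// Hypothetical operations; the real bootstrap code may be organised differently.
interface RecoveryOps {
  getServicePid(): Promise<number | null>;               // PID launchd reports for our openclaw
  getListenerPid(port: number): Promise<number | null>;  // PID that owns the listening socket
  bootoutOpenclaw(): Promise<void>;                       // stop our openclaw service
  findFreePort(start: number): Promise<number>;           // pick the next usable port
  writePlists(openclawPort: number): Promise<void>;       // regenerate openclaw + controller plists
  startServices(): Promise<void>;                         // restart both services
}

// Returns the port openclaw ends up on: the original one if we own the
// listener, otherwise a newly assigned one after the recovery sequence.
async function ensurePortOwnership(port: number, ops: RecoveryOps): Promise<number> {
  const [servicePid, listenerPid] = await Promise.all([
    ops.getServicePid(),
    ops.getListenerPid(port),
  ]);

  // Scenario 28: our own process is the listener, nothing to do.
  if (servicePid !== null && servicePid === listenerPid) return port;

  // Scenario 27: someone else holds the port, so move to a new one.
  await ops.bootoutOpenclaw();
  const next = await ops.findFreePort(port + 1);
  await ops.writePlists(next);
  await ops.startServices();
  return next;
}
```

Injecting the operations keeps the decision logic testable without touching launchd, which is one plausible way to cover scenarios 27 and 28 in unit tests.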
Scenario 27: competing service steals port after launch → bootstrap detects PID mismatch and reassigns to next free port. Scenario 28: our openclaw owns the port → no reassignment needed.
- desktop-ci-check.mjs: update health probe URL from 18789 to 50789
- dev-launchd.sh: update OPENCLAW_PORT default to 50789 so cleanup no longer kills a user's unrelated global openclaw on 18789
Deploying nexu-docs
| Latest commit: | 799396b |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://cf1991bc.nexu-docs.pages.dev |
| Branch Preview URL: | https://hotfix-v0-1-8-port-conflict.nexu-docs.pages.dev |
- Replace lsof-based port detection with net.connect. lsof is blocked by macOS hardened runtime in packaged Electron apps, silently returning empty results even when ports are occupied.
- Add structured logging via env.log callback so bootstrap diagnostics appear in cold-start.log (console.log is lost in packaged mode).
- Fix recovery: add waitForExit after bootout before re-bootstrapping to prevent launchd race conditions.

Verified: packaged app correctly detects port 50789 occupied by global openclaw, auto-assigns 50790, both services coexist.
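One way the env.log callback could be wired is sketched below. The `BootstrapEnv` shape, the `makeLogger` helper, and the JSON-line format are all assumptions made for illustration; the commit only states that a callback exists and that its output lands in cold-start.log.

```typescript
// Hypothetical types: the real env object carries more fields than this.
type LogFn = (message: string) => void;

interface BootstrapEnv {
  log?: LogFn; // supplied by the host app in packaged mode (assumed optional)
}

// Builds a logger that emits one JSON object per line. It falls back to
// console.log when no callback is provided, e.g. during local development.
function makeLogger(env: BootstrapEnv, scope: string) {
  const sink: LogFn = env.log ?? ((m) => console.log(m));
  return (event: string, fields: Record<string, unknown> = {}): void => {
    sink(JSON.stringify({ scope, event, ...fields }));
  };
}

// Usage: collect lines in memory instead of writing to a file.
const lines: string[] = [];
const log = makeLogger({ log: (m) => lines.push(m) }, "launchd-bootstrap");
log("port-occupied", { port: 50789, reassignedTo: 50790 });
```

Routing everything through a single callback means the host decides where the lines go, so the bootstrap code does not need to know whether it runs packaged or in development.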
…8789

The openclaw plist ProgramArguments never included --port, so openclaw always bound to its hardcoded default 18789 regardless of what findFreePort allocated. When another service (ClawX, global openclaw) occupied 18789, our openclaw crashed on bind (EADDRINUSE) even though bootstrap had correctly detected the conflict and assigned a new port. Now passing --port explicitly in ProgramArguments. Also reverted the default port back to 18789 since dynamic port allocation handles conflicts correctly.
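A minimal sketch of what passing --port through ProgramArguments might look like. The `OpenclawPlistEnv` shape, `openclawBin`, `baseArgs`, and both helper functions are hypothetical; only the `--port` flag and the `openclawPort` name come from this PR.

```typescript
// Hypothetical input shape; the real plist-generator.ts may differ.
interface OpenclawPlistEnv {
  openclawBin: string;   // path to the openclaw executable (assumed)
  baseArgs: string[];    // whatever arguments were already being passed (assumed)
  openclawPort: number;  // port chosen by findFreePort
}

// Appends the allocated port so the process no longer falls back to its default.
function buildProgramArguments(env: OpenclawPlistEnv): string[] {
  return [env.openclawBin, ...env.baseArgs, "--port", String(env.openclawPort)];
}

// Escapes the characters that are significant in plist XML text nodes.
function escapeXml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Renders the ProgramArguments key and its array as a plist XML fragment.
function renderProgramArguments(args: string[]): string {
  const items = args.map((a) => `    <string>${escapeXml(a)}</string>`).join("\n");
  return `  <key>ProgramArguments</key>\n  <array>\n${items}\n  </array>`;
}
```

Each argument is its own `<string>` element because launchd passes ProgramArguments directly to the process without shell word splitting, so `--port` and its value must be separate entries.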
…ection

- Reverted all 50789 port changes back to 18789 (scripts, CI, tests)
- Updated detectPortOccupier to use net.createServer().listen() instead of net.connect (avoids conflict with probePort mock)
- Updated plist-generator tests for --port in ProgramArguments
- Updated port conflict scenario tests (15/16/27/28) to mock createServer instead of lsof/createConnection
- All 624 tests pass
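The bind-based check mentioned above can be illustrated with Node's `net` module. This is a sketch, not the project's detectPortOccupier or findFreePort: the names `isPortFree` and `findFreePortFrom`, the loopback host, and the scan limit are assumptions.

```typescript
import * as net from "node:net";

// Tries to bind a throwaway server. Resolves false on EADDRINUSE, true if the
// bind succeeded (the server is closed again before resolving).
function isPortFree(port: number, host = "127.0.0.1"): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "EADDRINUSE") resolve(false);
      else reject(err); // e.g. EACCES: not a conflict, surface it to the caller
    });
    server.once("listening", () => {
      server.close(() => resolve(true));
    });
    server.listen(port, host);
  });
}

// Scans upward from `start` and returns the first port that can be bound.
async function findFreePortFrom(start: number, attempts = 20): Promise<number> {
  for (let port = start; port < start + attempts; port++) {
    if (await isPortFree(port)) return port;
  }
  throw new Error(`no free port in ${start}..${start + attempts - 1}`);
}
```

A bind attempt needs no visibility into other processes, which is why it keeps working where lsof returns nothing. It only tests the address it binds to, so a listener on a different interface may go unnoticed, and another process can still take the port between the check and the real launch.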
Simulates a global openclaw (or ClawX) occupying port 18789 before Nexu launches. Verifies:
- Nexu detects the conflict and auto-assigns an alternative port
- Controller comes up healthy
- OpenClaw runs on 18790+ instead of crashing
- The blocker on 18789 is NOT killed (coexistence)
Green card on pass, red card on failure. Shows test mode, source, channel, and links to the CI run for details.
Only sends on failure (not success). Card includes:
- Trigger source: PR number, branch, commit SHA
- Who triggered it
- Test mode/source/channel
- Link to CI logs
When launchd has stale state for a service label (e.g. after repeated bootout/bootstrap during port conflict recovery), bootstrap fails with 'Input/output error (code 5)'. The bootstrap path now detects this error, boots out the service to clear the stale registration, waits 1s, and retries once.
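The retry-once behaviour could look roughly like this. The `Run` type is a hypothetical wrapper around invoking launchctl and is injected so the sketch does not shell out; matching on exit code 5 or on the "Input/output error" text is an assumption about how the error is recognised.

```typescript
interface LaunchctlResult {
  code: number;    // process exit code
  stderr: string;  // captured stderr text
}

// Hypothetical runner: executes `launchctl <args...>` and reports the result.
type Run = (args: string[]) => Promise<LaunchctlResult>;

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Assumed detection rule for the stale-registration failure.
function isStaleStateError(r: LaunchctlResult): boolean {
  return r.code === 5 || /Input\/output error/i.test(r.stderr);
}

async function bootstrapWithRetry(
  run: Run,
  domain: string,      // e.g. "gui/501"
  label: string,       // service label registered with launchd
  plistPath: string,
  waitMs = 1000,
): Promise<LaunchctlResult> {
  const first = await run(["bootstrap", domain, plistPath]);
  if (first.code === 0 || !isStaleStateError(first)) return first;

  // Clear the stale registration, give launchd a moment, then retry once.
  await run(["bootout", `${domain}/${label}`]);
  await sleep(waitMs);
  return run(["bootstrap", domain, plistPath]);
}
```

Limiting this to a single retry keeps a persistent failure visible to the caller instead of hiding it behind a loop.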
1. Recovery uses bootoutAndWaitForExit (captures PID before bootout) instead of separate bootout + waitForExit without knownPid.
2. Attach path adds token validation: checks launchd service env OPENCLAW_GATEWAY_TOKEN matches expected token before attaching. Prevents attaching to a global openclaw or ClawX on the same port.
3. E2E openclaw port conflict: controller readiness is now a hard fail. Port discovery uses runtime-ports.json instead of hardcoded range, so any valid auto-assigned port is accepted.
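The token check in item 2 amounts to a guard like the one below. It is a sketch under assumptions: `ServiceEnv`, `decideAttach`, and the way the service environment is obtained are hypothetical; only the OPENCLAW_GATEWAY_TOKEN variable name comes from the commit message.

```typescript
// Environment variables of the already-running launchd service, however the
// caller obtained them (hypothetical input; not shown here).
type ServiceEnv = Record<string, string | undefined>;

type AttachDecision = "attach" | "start-own";

// Attach only when the running service was started with the token we expect.
// A missing or different token means the listener is not ours.
function decideAttach(serviceEnv: ServiceEnv, expectedToken: string): AttachDecision {
  const actual = serviceEnv["OPENCLAW_GATEWAY_TOKEN"];
  if (!actual || actual !== expectedToken) return "start-own";
  return "attach";
}

// Usage
const ours = decideAttach({ OPENCLAW_GATEWAY_TOKEN: "abc123" }, "abc123");
const foreign = decideAttach({ OPENCLAW_GATEWAY_TOKEN: "other" }, "abc123");
```

Treating a missing token the same as a mismatch errs on the side of not attaching, which matches the stated goal of never adopting a global openclaw or ClawX instance.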
Scenario: start app normally on 18789, kill openclaw, occupy 18789 with a blocker, force-quit and re-launch. Verifies the app detects the stolen port on cold start, auto-assigns a new port, and recovers. This covers the post-launch recovery path that unit tests can't easily simulate (requires real launchd timing).
PerishCode approved these changes Mar 31, 2026
The openclaw port may be auto-assigned to 18790+ when 18789 is occupied. Updated:
- packaged-e2e.mjs: read openclawPort from controller readiness
- run-e2e.sh: read from runtime-ports.json, scan port range for diagnostics and port-free checks
What

Fix openclaw port allocation so Nexu can coexist with other OpenClaw services (ClawX, global `openclaw install`, etc.).

Why

The openclaw launchd plist never included `--port` in ProgramArguments (missing since the initial implementation in #405). This meant:
- … `findFreePort` detected a conflict and allocated a different port, the openclaw process ignored it
- … `openclaw install`), our openclaw crashed on bind (EADDRINUSE) and entered a crash loop
- … `gateway token mismatch` → all channels stuck in "connecting"

This affected any user running ClawX or `openclaw install` alongside Nexu.

How

Three fixes:
1. Pass `--port` to openclaw (`plist-generator.ts`): Add `--port ${env.openclawPort}` to ProgramArguments so openclaw actually uses the allocated port.
2. Use `net.connect` for port detection (`launchd-bootstrap.ts`): Replace `lsof`-based detection with a TCP probe. macOS hardened runtime blocks packaged Electron apps from seeing other processes' file descriptors via `lsof`, causing silent detection failures.
3. Post-launch port theft recovery (`launchd-bootstrap.ts`): After starting openclaw, verify the port isn't stolen by a competing service. If it is, bootout, find a new port, regenerate plists for both openclaw and controller, and restart.

Also added structured logging via an `env.log` callback so bootstrap diagnostics appear in `cold-start.log` (packaged mode loses `console.log`).

Verified scenarios
- … `openclaw install` (KeepAlive=true) → same auto-assignment

Affected areas

Checklist
- `pnpm typecheck` passes