Skip to content

fix #90668: [Bug]: macOS node mode can silently self-reconnect in a healthy direct gateway session#90736

Closed
zhangguiping-xydt wants to merge 2 commits into
openclaw:mainfrom
zhangguiping-xydt:feat/issue-90668
Closed

fix #90668: [Bug]: macOS node mode can silently self-reconnect in a healthy direct gateway session#90736
zhangguiping-xydt wants to merge 2 commits into
openclaw:mainfrom
zhangguiping-xydt:feat/issue-90668

Conversation

@zhangguiping-xydt

@zhangguiping-xydt zhangguiping-xydt commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: Healthy macOS node-mode direct wss:// gateway loops could rebuild the TLS-pinning WebSocket session object every pass, making GatewayNodeSession treat unchanged inputs as reconnect input and repeat connected notifications.
  • Solution: Cache the macOS node-mode TLS session box by URL and TLS pin parameters so unchanged direct wss:// inputs keep the same session identity while changed trust inputs still rebuild it.
  • What changed: apps/macos/Sources/OpenClaw/NodeMode/MacNodeModeCoordinator.swift now owns a narrow TLS session cache, and apps/macos/Tests/OpenClawIPCTests/MacNodeModeCoordinatorTests.swift covers reuse and rebuild cases.
  • What did NOT change: No gateway protocol, token handling, pairing behavior, TLS validation policy, non-wss:// path, shared reconnect semantics, config shape, or non-macOS client behavior changed.

Fixes #90668

Real behavior proof

  • Behavior addressed: Healthy macOS node-mode direct wss:// loops no longer supply a fresh TLS-pinning session identity on each pass, so unchanged connection inputs do not repeat mac node connected to gateway notifications.
  • Real environment tested: macOS 26.5, Node v24.15.0, Apple Swift 6.3.2, OpenClaw v2026.6.2, worktree SHA 619590e.
  • Exact steps or command run after this patch:
    openssl req -x509 -newkey rsa:2048 -sha256 -days 7 -nodes \
      -keyout /Users/zhangguiping/daily_pr_work/proofs/openclaw-issue-90668-evidence/gateway-key.pem \
      -out /Users/zhangguiping/daily_pr_work/proofs/openclaw-issue-90668-evidence/gateway-cert.pem \
      -subj /CN=127.0.0.1 -addext subjectAltName=IP:127.0.0.1,DNS:localhost
    env -C /Users/zhangguiping/daily_pr_work/openclaw-90668 \
      OPENCLAW_CONFIG_PATH=/Users/zhangguiping/daily_pr_work/proofs/openclaw-issue-90668-evidence/openclaw.json \
      OPENCLAW_STATE_DIR=/Users/zhangguiping/daily_pr_work/proofs/openclaw-issue-90668-evidence/state \
      pnpm openclaw gateway --port 19891 --bind loopback --token proof-token-90668 \
        --allow-unconfigured --verbose --ws-log compact
    env PROOF_GATEWAY_URL=wss://127.0.0.1:19891 \
      PROOF_GATEWAY_TOKEN=proof-token-90668 \
      PROOF_GATEWAY_FINGERPRINT=c60afe1c23e7291a557b6ad32fea336e03d08d6466bacabcf4dd06c086e7d780 \
      swift run --package-path /Users/zhangguiping/daily_pr_work/proofs/openclaw-issue-90668-evidence OpenClawNodeProof
    gh pr checks 90736 --repo openclaw/openclaw
  • Evidence after fix:
    [gateway] ready
    [ws] ← connect client=openclaw-macos clientDisplayName=OpenClaw macOS proof version=dev mode=node clientId=openclaw-macos platform=macOS 26.5.0 auth=token
    [ws] → hello-ok methods=189 events=27 presence=2 stateVersion=2
    
    running before-style fresh TLS session loop
    before-style fresh TLS session: connect pass 1, sessionObject=ObjectIdentifier(0x00000001012145f0)
    before-style fresh TLS session: mac node connected to gateway callback #1 during connect pass 1
    before-style fresh TLS session: connect pass 2, sessionObject=ObjectIdentifier(0x0000000a1d038820)
    before-style fresh TLS session: mac node connected to gateway callback #2 during connect pass 2
    before-style fresh TLS session: connect pass 3, sessionObject=ObjectIdentifier(0x0000000a1d0386e0)
    before-style fresh TLS session: mac node connected to gateway callback #3 during connect pass 3
    before-style fresh TLS session connected callbacks: 3
    
    running after-fix cached TLS session loop
    after-fix cached TLS session: connect pass 1, sessionObject=ObjectIdentifier(0x0000000a1d038f00)
    after-fix cached TLS session: mac node connected to gateway callback #1 during connect pass 1
    after-fix cached TLS session: connect pass 2, sessionObject=ObjectIdentifier(0x0000000a1d038f00)
    after-fix cached TLS session: connect pass 3, sessionObject=ObjectIdentifier(0x0000000a1d038f00)
    after-fix cached TLS session connected callbacks: 1
    PASS: cached TLS session keeps healthy direct wss node session from repeating connected notifications
    
    [ws] ← open remoteAddr=127.0.0.1 remotePort=56083 localAddr=127.0.0.1 localPort=19891 endpoint=127.0.0.1:56083->127.0.0.1:19891
    [ws] ← connect client=openclaw-macos clientDisplayName=OpenClaw macOS proof version=dev mode=node clientId=openclaw-macos platform=macOS 26.5.0 auth=token
    [ws] → hello-ok methods=189 events=27 presence=2 stateVersion=8
    [ws] → event tick seq=per-client clients=1
    [ws] → close code=1001 durationMs=3860 handshake=connected
    
  • Observed result after fix: The control loop that creates a fresh GatewayTLSPinningSession each pass produced three distinct session identities and three connected callbacks. The patched cache loop reused one session identity across three healthy direct wss:// connect passes and produced exactly one connected callback while the gateway kept the WebSocket connected until the harness disconnected.
  • What was not tested: The full packaged menu-bar app was not separately launched on this machine; the proof exercises the same shared GatewayNodeSession + GatewayTLSPinningSession macOS runtime path used by the coordinator against a real local OpenClaw WSS gateway. Local swift test --package-path apps/macos --filter MacNodeModeCoordinatorTests is blocked here by Command Line Tools missing SwiftUI preview macro plugins; the PR macos-swift check passed for this head.

Regression Test Plan

  • Coverage level: Swift unit test plus macOS WSS runtime proof
  • Target test file: apps/macos/Tests/OpenClawIPCTests/MacNodeModeCoordinatorTests.swift
  • Scenario locked in: Repeated macOS direct wss:// node-mode loop passes with the same URL and TLS params reuse one session object; changed TLS fingerprint input rebuilds it.
  • Why this is the smallest reliable guardrail: The regression lives at the macOS coordinator owner boundary before GatewayNodeSession.connect(...), so it proves the exact identity-stability invariant without weakening the shared gateway reconnect contract.

Root Cause

  • Root cause: MacNodeModeCoordinator.run() called buildSessionBox(...) on every loop pass, and the wss:// path created a fresh GatewayTLSPinningSession / WebSocketSessionBox for unchanged URL and TLS pin inputs.
  • Missing detection / guardrail: Existing tests covered TLS parameter selection and shared session replacement behavior, but not the macOS coordinator invariant that unchanged TLS inputs must preserve the same session object identity across healthy run-loop passes.

@openclaw-barnacle openclaw-barnacle Bot added app: macos App: macos size: S proof: supplied External PR includes structured after-fix real behavior proof. labels Jun 5, 2026
@clawsweeper

clawsweeper Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Codex review: passed. Reviewed June 5, 2026, 10:16 PM ET / 02:16 UTC.

Summary
The PR adds a macOS node-mode TLS session cache keyed by URL and TLS pin parameters, plus Swift tests for unchanged-session reuse and changed-parameter rebuilds.

PR surface: Other +78. Total +78 across 2 files.

Reproducibility: yes. source-reproducible: current main creates a fresh TLS session box on every macOS wss loop pass, and GatewayNodeSession treats the changed session identity as reconnect input. I did not run the macOS app locally during this read-only review.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Next step before merge

  • [P2] No repair lane is needed because the PR has no actionable review findings; the remaining path is normal automerge/check gating for the exact head.

Security
Cleared: The diff only changes Swift macOS runtime code and Swift tests; it does not touch workflows, dependencies, lockfiles, package metadata, secrets handling, or downloaded artifacts.

Review details

Best possible solution:

Merge the scoped macOS coordinator cache after the normal automerge/check gates, preserving GatewayNodeSession’s shared behavior that a genuinely changed session object triggers reconnect.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current main creates a fresh TLS session box on every macOS wss loop pass, and GatewayNodeSession treats the changed session identity as reconnect input. I did not run the macOS app locally during this read-only review.

Is this the best way to solve the issue?

Yes, this is the best fix shape: it stabilizes identity at the macOS coordinator owner boundary while leaving the shared GatewayNodeSession changed-session contract intact. The iOS sibling path already captures and reuses a sessionBox for its loop, so a shared-session change would be broader than needed.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 3a2f54e6a866.

Label changes

Label changes:

  • add status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (logs): The PR body includes live macOS/WSS gateway logs showing the changed behavior after the patch: one cached session identity and one connected callback across repeated healthy passes.
  • remove status: 👀 ready for maintainer look: Current PR status label is status: 🚀 automerge armed.

Label justifications:

  • P1: The linked bug affects a real macOS node-mode gateway workflow by repeatedly surfacing connected callbacks during a healthy direct session.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (logs): The PR body includes live macOS/WSS gateway logs showing the changed behavior after the patch: one cached session identity and one connected callback across repeated healthy passes.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes live macOS/WSS gateway logs showing the changed behavior after the patch: one cached session identity and one connected callback across repeated healthy passes.
Evidence reviewed

PR surface:

Other +78. Total +78 across 2 files.

View PR surface stats
Area Files Added Removed Net
Source 0 0 0 0
Tests 0 0 0 0
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 2 83 5 +78
Total 2 83 5 +78

What I checked:

Likely related people:

  • Ayaan Zaidi: Current-line blame attributes both the macOS coordinator loop/buildSessionBox and GatewayNodeSession session-identity reconnect logic to commit 61d121f in this checkout. (role: recent area contributor; confidence: medium; commits: 61d121f1caa2; files: apps/macos/Sources/OpenClaw/NodeMode/MacNodeModeCoordinator.swift, apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayNodeSession.swift)
  • Vincent Koc: git log -S shows the latest release commit 2e08f0f in the symbol history for the macOS TLS session construction and GatewayNodeSession identity path. (role: adjacent release/history contributor; confidence: low; commits: 2e08f0f4221f; files: apps/macos/Sources/OpenClaw/NodeMode/MacNodeModeCoordinator.swift, apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayNodeSession.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. labels Jun 5, 2026
@clawsweeper clawsweeper Bot added the merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. label Jun 5, 2026
@zhangguiping-xydt

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. labels Jun 5, 2026
@Takhoffman

Copy link
Copy Markdown
Contributor

@clawsweeper automerge

@clawsweeper

clawsweeper Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

🦞🔧
ClawSweeper saw the passing review, but the PR needs another repair pass before merge.

Source: clawsweeper[bot]
Feedback: structured ClawSweeper verdict: pass (sha=619590ed7416a9b3f855ebda102e9ae76af493a8); failed required checks before automerge: Real behavior proof:CANCELLED
Action: repair worker queued. Run: https://github.com/openclaw/clawsweeper/actions/runs/27049846445
Model: gpt-5.5

I will update this PR branch, or open a safe credited replacement, if the repair worker finds a narrow CI fix.

Automerge progress:

  • 2026-06-06 02:10:17 UTC review queued 619590ed7416 (queued)
  • 2026-06-06 02:16:59 UTC review passed 619590ed7416 (structured ClawSweeper verdict: pass (sha=619590ed7416a9b3f855ebda102e9ae76af49...)

@clawsweeper clawsweeper Bot added clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge status: 🚀 automerge armed This PR is in ClawSweeper's automerge lane. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 6, 2026
@clawsweeper

clawsweeper Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper 🐠 reef update

Thanks for the work on this. ClawSweeper could not push to this branch with the permissions available, so it opened a narrow replacement PR to keep the fix swimming forward without losing the contributor trail. not your fault, just GitHub branch-permission tides.

Why replacement: ClawSweeper could not update the source PR branch directly; GitHub did not grant sufficient push rights to the bot for that branch.
Replacement PR: #90815
Why close: this run explicitly closes the superseded source PR after the credited replacement PR is open, so review continues in one place.
Closing this source PR only because source-PR closing was explicitly enabled for this run.
Attribution stays attached; the replacement just gives the fix a writable branch.
Co-author credit kept:

fish notes: model gpt-5.5, reasoning high; reviewed against 1496eac.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

app: macos App: macos clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. size: S status: 🚀 automerge armed This PR is in ClawSweeper's automerge lane.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: macOS node mode can silently self-reconnect in a healthy direct gateway session

2 participants