fix(gateway): keep newer node session on stale disconnect#79491

Closed
davelutztx wants to merge 78 commits into openclaw:main from davelutztx:fix-node-registry-stale-unregister

@davelutztx

Summary

Fix a node registry race where a stale/disconnecting node WebSocket can unregister a newer live session for the same node id.

Root cause

NodeRegistry.unregister(connId) looked up nodeId from nodesByConn, then unconditionally deleted nodesById[nodeId].

If an older connection closes after a newer connection for the same Android node has already registered, the old close removes the newer live registry entry. The node socket can still send node.event, but node.invoke and nodes status see the node as disconnected.

This matches the symptom in locked issue #30137: Android UI/chat appears connected while node commands fail with node not connected after gateway restart/reconnect timing.

Fix

Only delete nodesById[nodeId] during unregister when the current registered session still belongs to the closing connId.
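
For illustration, here is a minimal sketch of the guarded unregister described above. It is not the actual gateway code: the NodeSession shape, the use of Map, and the register/get helpers are assumptions made to keep the example self-contained. Only the names nodesById, nodesByConn, and unregister(connId) come from this PR.

```typescript
// Hypothetical, simplified model of the registry. Only nodesById,
// nodesByConn, and unregister(connId) are named in the PR description;
// everything else here is an assumption for illustration.
type NodeSession = { nodeId: string; connId: string };

class NodeRegistry {
  private nodesById = new Map<string, NodeSession>();
  private nodesByConn = new Map<string, string>(); // connId -> nodeId

  register(nodeId: string, connId: string): NodeSession {
    const session: NodeSession = { nodeId, connId };
    this.nodesById.set(nodeId, session); // the newest connection owns the node
    this.nodesByConn.set(connId, nodeId);
    return session;
  }

  unregister(connId: string): string | null {
    const nodeId = this.nodesByConn.get(connId);
    if (nodeId === undefined) return null;
    this.nodesByConn.delete(connId);
    // Guard: drop the node entry only if it still belongs to the closing
    // connection. Per the root cause, this delete used to be unconditional.
    if (this.nodesById.get(nodeId)?.connId === connId) {
      this.nodesById.delete(nodeId);
    }
    return nodeId;
  }

  get(nodeId: string): NodeSession | undefined {
    return this.nodesById.get(nodeId);
  }
}
```

With this guard, a late close from an old connection still cleans up its own nodesByConn entry but leaves the newer session's nodesById entry in place.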

Verification

Added a regression test:

  • register old-conn for node N
  • register new-conn for the same node N
  • unregister old-conn
  • assert new-conn remains connected/registered
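
The four steps can be sketched roughly as follows. This is a self-contained approximation, not the test added in this PR: makeRegistry, its guarded flag, ownerOf, and staleDisconnectScenario are hypothetical names, and the flag exists only so the same scenario can be run against the old unconditional delete and the guarded one.

```typescript
// Hypothetical stand-in for the registry; `guarded` toggles the fix so the
// scenario can be compared against the pre-fix behavior.
function makeRegistry(guarded: boolean) {
  const nodesById = new Map<string, string>(); // nodeId -> owning connId
  const nodesByConn = new Map<string, string>(); // connId -> nodeId
  return {
    register(nodeId: string, connId: string): void {
      nodesById.set(nodeId, connId);
      nodesByConn.set(connId, nodeId);
    },
    unregister(connId: string): void {
      const nodeId = nodesByConn.get(connId);
      if (nodeId === undefined) return;
      nodesByConn.delete(connId);
      // Unguarded: always delete. Guarded: delete only if connId still owns the node.
      if (!guarded || nodesById.get(nodeId) === connId) {
        nodesById.delete(nodeId);
      }
    },
    ownerOf(nodeId: string): string | undefined {
      return nodesById.get(nodeId);
    },
  };
}

// Runs the four listed steps and returns which connection still owns node N.
function staleDisconnectScenario(guarded: boolean): string | undefined {
  const registry = makeRegistry(guarded);
  registry.register("N", "old-conn"); // register old-conn for node N
  registry.register("N", "new-conn"); // register new-conn for the same node N
  registry.unregister("old-conn"); // stale close arrives late
  return registry.ownerOf("N"); // should still be new-conn
}
```

In this model, staleDisconnectScenario(true) leaves new-conn as the owner, while staleDisconnectScenario(false) loses the entry, which matches the "node not connected" symptom described in the root cause.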

Ran targeted gateway tests:

node scripts/run-vitest.mjs run --config test/vitest/vitest.gateway.config.ts \
  src/gateway/node-registry.test.ts \
  src/gateway/node-catalog.test.ts \
  src/gateway/server-node-events.test.ts

Result: 42 tests passed.

Refs #30137.

steipete and others added 30 commits May 5, 2026 02:42
Normalize WhatsApp onboarding allowlist entries to digit-only WhatsApp IDs and reject invalid owner-phone inputs during prompt validation.

(cherry picked from commit 68a500c)

* fix(telegram): reuse preview for long text finals

* test(qa): cover long telegram finals

* fix(qa): satisfy extension lint

* fix(qa): keep telegram long final fixture to two chunks

* test(telegram): cover three chunk finals

* fix(telegram): force long final preview boundary

(cherry picked from commit e03fe1e)

Bind the default loopback gateway listener only to `127.0.0.1` on Windows so libuv dual-stack `::1` behavior cannot wedge localhost HTTP requests.

Also keeps non-Windows dual-loopback behavior covered, replaces the redundant Windows passthrough test with guard coverage, and adds the required changelog entry.

Fixes openclaw#69674.

Tests:
- pnpm exec oxfmt --check --threads=1 CHANGELOG.md src/gateway/net.ts src/gateway/net.test.ts
- pnpm test src/gateway/net.test.ts
- pnpm check:changed
- GitHub required checks: green

Thanks @SARAMALI15792.

Co-authored-by: saram ali <140950904+SARAMALI15792@users.noreply.github.com>
Co-authored-by: Brad Groux <3053586+BradGroux@users.noreply.github.com>
(cherry picked from commit 978bc53)

…isted] (openclaw#74161)

Summary:
- The PR updates agents skill prompt guidance to require exact `<location>` paths for single- and multi-skill selection, adds prompt assertions, and records the fix in the changelog.
- Reproducibility: yes. Static source reproduction is enough: current main lacks the exact-`<location>` guard  ... illsSection()`, while the PR diff adds it to both selection branches and asserts the resulting prompt text.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix: enforce exact skill paths for all skill matches

Validation:
- ClawSweeper review passed for head 743c984.
- Required merge gates passed before the squash merge.

Prepared head SHA: 743c984
Review: openclaw#74161 (comment)

Co-authored-by: tianguicheng <tianguicheng@xiaomi.com>
Co-authored-by: sallyom <somalley@redhat.com>
(cherry picked from commit c739088)

Accept drive-absolute Windows sandbox Docker bind sources in config and runtime validation while keeping blocked-path and allowed-root comparisons case-insensitive for Windows drive paths.

Also remove a stale WhatsApp setup import that blocked extension lint after the rebase.

Co-authored-by: 6607changchun <84566142+6607changchun@users.noreply.github.com>
Co-authored-by: Brad Groux <3053586+BradGroux@users.noreply.github.com>
(cherry picked from commit d02fbc6)

Adds cap_drop and no-new-privileges hardening for the bundled gateway Docker Compose services.

Thanks @VintageAyu.

(cherry picked from commit f9da484)

…penclaw#77280)

Merged via squash.

Prepared head SHA: f4188b4
Co-authored-by: openperf <80630709+openperf@users.noreply.github.com>
Co-authored-by: openperf <80630709+openperf@users.noreply.github.com>
Reviewed-by: @openperf

(cherry picked from commit 31da1fe)

@clawsweeper
Contributor

clawsweeper Bot commented May 8, 2026

ClawSweeper status: review started.

I am starting a fresh review of this pull request: fix(gateway): keep newer node session on stale disconnect. This is item 1/1 in the current shard. Shard 0/1.

This placeholder means the worker is alive and reading the current context. I will edit this same comment with the actual review when the claws are done clicking.

Crustacean status: shell secured, claws on keyboard, evidence pebbles being sorted.

@obviyus
Contributor

obviyus commented May 9, 2026

Thanks for the PR. This branch appears to include unrelated release/main replay around the gateway stale-disconnect fix, so it is not reviewable as a focused Telegram PR. Please reopen as a narrow PR with only the intended fix.

@davelutztx
Author

Thanks, agreed. This branch picked up unrelated release/main history and shouldn't be reviewed further. I checked current main and the intended node reconnect fix appears to have already landed via #78351.

Labels

agents (Agent runtime and tooling)
app: android, ios, macos, web-ui
channel: discord, feishu, googlechat, imessage, irc, line, matrix, mattermost, msteams, nextcloud-talk, nostr, qa-channel, qqbot, signal, slack, synology-chat, telegram, tlon, twitch, voice-call, whatsapp-web, zalo, zalouser
cli (CLI command changes)
commands (Command implementations)
docker (Docker and sandbox tooling)
docs (Improvements or additions to documentation)
extensions: acpx, anthropic, arcee, byteplus, cerebras, cloudflare-ai-gateway, codex, copilot-proxy, deepinfra, deepseek, diagnostics-otel, diagnostics-prometheus, duckduckgo, fal, gradium, huggingface, inworld, kilocode, kimi-coding, litellm, llm-task, lmstudio, lobster, memory-core, memory-lancedb, memory-wiki, minimax, moonshot, nvidia, open-prose, openai, qa-lab, qianfan, senseaudio, stepfun, synthetic, tavily, tencent, together, tokenjuice (Changes to the bundled tokenjuice extension), tts-local-cli, venice, vercel-ai-gateway, volcengine, webhooks, xiaomi
gateway (Gateway runtime)
plugin: azure-speech (Azure Speech plugin), bonjour, file-transfer, google-meet, migrate-claude, migrate-hermes
scripts (Repository scripts)
size: XL
triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.)
