Rate limit node pairing requests [AI] #90147
Conversation
Codex review: needs real behavior proof before merge. Reviewed June 4, 2026, 12:11 AM ET / 04:11 UTC.
Summary
PR surface: Source +72, Tests +186. Total +258 across 5 files.
Reproducibility: no. No high-confidence live reproduction is present. Source inspection shows current main enters requestNodePairing directly, and the PR adds harness coverage for the claimed burst and paired-reconnect paths, but no external non-loopback run was provided.
Review metrics: 1 noteworthy metric.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Proof guidance:
Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review details
Best possible solution: Land node-pairing throttling only after the auth-rate-limit contract is explicitly approved and documented or adjusted, paired-reconnect compatibility remains covered, and redacted non-loopback gateway proof is added.
Do we have a high-confidence way to reproduce the issue? No high-confidence live reproduction is present. Source inspection shows current main enters requestNodePairing directly, and the PR adds harness coverage for the claimed burst and paired-reconnect paths, but no external non-loopback run was provided.
Is this the best way to solve the issue? Unclear as merge-ready. Throttling before the node-pairing storage lock is the right layer for the pressure point, but reusing the documented failed-auth limiter for successful pairing requests needs owner approval, documentation, or a safer contract before it is the best fix.
Full review comments:
Overall correctness: patch is incorrect
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 7b02080fa179.
Label changes:
Label justifications:
Evidence reviewed
PR surface: Source +72, Tests +186. Total +258 across 5 files.
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
@clawsweeper re-review Updated the PR after the prior review:
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
Verification before merge:
Behavior addressed: First-time node-role WebSocket pairing requests are rate-limited before entering shared node-pairing storage, while already paired node upgrade reconnects keep their approved surface when fresh pairing is throttled.
Real environment tested: Local source checkout with gateway WebSocket integration tests; GitHub Actions PR CI on head f40e412. No external live non-loopback device environment was used.
Exact steps or command run after this patch:
Evidence after fix:
Observed result after fix: Repeated first-time node pairing attempts from the same rate-limit key are rejected before queueing node-pairing storage work, and paired upgrade reconnects continue on the already approved effective surface.
What was not tested: External live non-loopback node/device traffic outside the local gateway WebSocket test harness.
Regression Risk:
Best fix verdict: Best/appropriate. The gateway connect layer owns the client rate-limit key, so gating before
User behavior change:
* fix: rate limit node pairing requests * fix: preserve paired node reconnects
…26.6.5) (#963) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.1` → `2026.6.5` | --- ### Release Notes <details> <summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary> ### [`v2026.6.5`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202665) [Compare Source](openclaw/openclaw@v2026.6.1...v2026.6.5) ##### Highlights - QQBot now strips model reasoning/thinking scaffolding before native delivery, preventing raw `<thinking>` content from leaking into channel replies. ([#​89913](openclaw/openclaw#89913), [#​90132](openclaw/openclaw#90132)) Thanks [@​openperf](https://github.com/openperf). - MCP tool results now coerce `resource_link`, `resource`, `audio`, malformed image, and future non-text/image blocks at the materialize boundary, preventing Anthropic 400s and poisoned session history after a tool returns richer MCP content. ([#​90710](openclaw/openclaw#90710), [#​90728](openclaw/openclaw#90728)) Thanks [@​RanSHammer](https://github.com/RanSHammer) and [@​849261680](https://github.com/849261680). - Anthropic extended-thinking sessions recover after prompt-cache expiry or Gateway restart because stream start events wait for `message_start`, letting pre-generation signature errors trigger the existing recovery retry. ([#​90667](openclaw/openclaw#90667), [#​90697](openclaw/openclaw#90697)) Thanks [@​openperf](https://github.com/openperf). - Parallel is now a bundled `web_search` provider with `PARALLEL_API_KEY` discovery, guarded endpoint handling, cache-safe session ids, onboarding picker support, and docs. ([#​85158](openclaw/openclaw#85158)) Thanks [@​NormallyGaussian](https://github.com/NormallyGaussian). - Google Vertex ADC users get static catalog rows and runtime model resolution again, while single-provider cooldown recovery and memory adapter status checks are more reliable. 
([#​90506](openclaw/openclaw#90506), [#​90609](openclaw/openclaw#90609), [#​90717](openclaw/openclaw#90717), [#​90816](openclaw/openclaw#90816)) Thanks [@​849261680](https://github.com/849261680). - Matrix can preflight voice notes before mention gating, preserve thread reads/replies through Matrix relations pagination, and carry QA coverage for voice and thread flows. ([#​78016](openclaw/openclaw#78016), [#​90415](openclaw/openclaw#90415)) - Auth and plugin install state is more durable: auth profiles now live in SQLite, official npm plugin install records keep their trusted pins, and prerelease fallback integrity checks avoid carrying stale integrity forward. ([#​89102](openclaw/openclaw#89102), [#​88585](openclaw/openclaw#88585)) - macOS node mode no longer silently self-reconnects away from a healthy direct Gateway session, reducing unexpected companion app session churn. ([#​90668](openclaw/openclaw#90668), [#​90815](openclaw/openclaw#90815)) Thanks [@​vrurg](https://github.com/vrurg). - Upgrade and service paths are safer: cron legacy JSON stores migrate during doctor preflight, service env placeholders no longer mask state-dir secrets, WhatsApp startup waits are bounded, and disabled WhatsApp accounts tear down on config reload. ([#​90072](openclaw/openclaw#90072), [#​90208](openclaw/openclaw#90208), [#​90277](openclaw/openclaw#90277), [#​90488](openclaw/openclaw#90488), [#​90486](openclaw/openclaw#90486), [#​87951](openclaw/openclaw#87951), [#​87965](openclaw/openclaw#87965)) Thanks [@​MonkeyLeeT](https://github.com/MonkeyLeeT), [@​sallyom](https://github.com/sallyom), [@​mcaxtr](https://github.com/mcaxtr), and [@​MukundaKatta](https://github.com/MukundaKatta). ##### Changes - Search/providers: add the Parallel bundled web-search plugin, live provider tests, registration contracts, onboarding/docs wiring, and guarded `api.parallel.ai/v1/search` support. ([#​85158](openclaw/openclaw#85158)) Thanks [@​NormallyGaussian](https://github.com/NormallyGaussian). 
- Matrix/channels: add voice-message preflight and thread-aware read/reply behavior, including Matrix QA scenario wiring and docs for voice-message behavior. ([#​78016](openclaw/openclaw#78016), [#​90415](openclaw/openclaw#90415)) - Skills/ClawHub: install ClawHub skills backed by GitHub repositories through the resolved install API, download the pinned GitHub commit, keep install-policy checks, and report install telemetry after success. ([#​90478](openclaw/openclaw#90478)) Thanks [@​Patrick-Erichsen](https://github.com/Patrick-Erichsen). - Google Chat/channels: add native approval card actions and click handling so Google Chat approvals use platform-native cards instead of generic message flow. - Mobile: Android provider/model screens now surface expiring, unavailable, unresolved, and attention states more clearly, while iOS settings and Talk tabs keep diagnostics, gateway rows, attachment labels, and unavailable Talk controls reachable. - Memory: QMD search can use the new rerank toggle, and memory adapter status uses the resolved default model identity when checking plain status. ([#​61834](openclaw/openclaw#61834)) - Docs/tooling: add Parallel search docs, refresh weather-skill guidance toward `web_fetch`, clarify legacy `openai-codex` auth, document release/test helper scripts, and tighten changed-test routing docs for CI/debugging work. ([#​90028](openclaw/openclaw#90028), [#​90250](openclaw/openclaw#90250)) Thanks [@​fuller-stack-dev](https://github.com/fuller-stack-dev). - Release/process: switch release trains to `YYYY.M.PATCH` monthly patch numbering, keep pre-transition tags compatible, and pin the June 2026 floor at `2026.6.5` after the published beta. - Platform maintenance: refresh Android, Swift/macOS, Docker, CodeQL, Buildx, Docker build/push, and Codex Action dependencies for this release train. 
([#​74980](openclaw/openclaw#74980), [#​81757](openclaw/openclaw#81757), [#​86481](openclaw/openclaw#86481), [#​86483](openclaw/openclaw#86483), [#​90601](openclaw/openclaw#90601)) - QQBot: add `/bot-group-allways on|off` slash command (with named-account and default-account support) to toggle whether group messages require an `@mention` before the bot replies, and clear the runtime config snapshot after the write so the new account-level `defaultRequireMention` takes effect immediately without restart. ([#​91423](openclaw/openclaw#91423)) Thanks [@​cxyhhhhh](https://github.com/cxyhhhhh). ##### Fixes - Channel content boundaries: QQBot now strips reasoning/thinking tags before sending, preserving final answers while hiding internal model narration from users. ([#​89913](openclaw/openclaw#89913), [#​90132](openclaw/openclaw#90132)) Thanks [@​openperf](https://github.com/openperf). - Agents/MCP/providers: coerce non-text/image MCP tool-result blocks before they reach provider converters, preserving valid images and turning richer MCP content into text instead of malformed image blocks. ([#​90710](openclaw/openclaw#90710), [#​90728](openclaw/openclaw#90728)) Thanks [@​RanSHammer](https://github.com/RanSHammer) and [@​849261680](https://github.com/849261680). - Anthropic/Codex/ACP/agent recovery: defer Anthropic stream start events until `message_start`, strip stale compaction thinking signatures before Anthropic replay, detect unsigned thinking-only stalls, refresh prompt fences after compaction writes, reject empty completion handoffs, preserve parent streaming-off overrides/shared progress commentary, forward heartbeat metadata to context-engine hooks, and cover Codex session/thread migration edge cases. 
([#​90667](openclaw/openclaw#90667), [#​90697](openclaw/openclaw#90697), [#​90163](openclaw/openclaw#90163), [#​90108](openclaw/openclaw#90108), [#​89874](openclaw/openclaw#89874), [#​89505](openclaw/openclaw#89505), [#​90632](openclaw/openclaw#90632), [#​89302](openclaw/openclaw#89302), [#​90729](openclaw/openclaw#90729), [#​90317](openclaw/openclaw#90317), [#​90319](openclaw/openclaw#90319)) Thanks [@​openperf](https://github.com/openperf), [@​100yenadmin](https://github.com/100yenadmin), and [@​ooiuuii](https://github.com/ooiuuii). - Provider/model resolution: preserve Google Vertex ADC auth markers in generated catalogs, re-probe a single-provider primary after cooldown, share Codex model visibility, fail closed for unknown model auth, preserve Codex alias availability, keep unresolved profile refs unknown, and avoid resolving auth while listing models. ([#​90506](openclaw/openclaw#90506), [#​90609](openclaw/openclaw#90609), [#​90717](openclaw/openclaw#90717), [#​90702](openclaw/openclaw#90702)) Thanks [@​849261680](https://github.com/849261680). - Gateway/macOS/mobile: avoid duplicate Gateway probe warnings by identity, rate-limit node pairing requests while preserving paired-node reconnects, keep macOS node mode on a healthy direct Gateway session, keep iOS diagnostics and gateway rows reachable, and avoid Linux ARM Gradle resource tasks during Android builds. ([#​85791](openclaw/openclaw#85791), [#​90147](openclaw/openclaw#90147), [#​90668](openclaw/openclaw#90668), [#​90815](openclaw/openclaw#90815)) Thanks [@​giodl73-repo](https://github.com/giodl73-repo) and [@​vrurg](https://github.com/vrurg). - TUI/chat/Workboard/auto-reply: optimistic user messages stay stable across stale history reloads, runId reassignment, and abort windows instead of disappearing, jumping, or lingering as ghost rows; Workboard stale lifecycle bulk updates no longer overwrite newer status/provenance; message-tool sends now count as delivery. 
([#​86205](openclaw/openclaw#86205), [#​89600](openclaw/openclaw#89600), [#​88592](openclaw/openclaw#88592), [#​90123](openclaw/openclaw#90123)) Thanks [@​RomneyDa](https://github.com/RomneyDa). - Cron/update/service env: doctor config preflight now migrates legacy cron JSON stores into SQLite before runtime reads, service env planning skips unresolved placeholders that would mask state-dir `.env` values, and session transcript rewrites keep registry markers/discriminants consistent. ([#​90072](openclaw/openclaw#90072), [#​90208](openclaw/openclaw#90208), [#​90277](openclaw/openclaw#90277), [#​90488](openclaw/openclaw#90488)) Thanks [@​MonkeyLeeT](https://github.com/MonkeyLeeT) and [@​sallyom](https://github.com/sallyom). - Security/config/tooling: guard MCP HTTP redirects, protect global agent config defaults, and keep release/test/tooling proof failures bounded and explicit. ([#​89732](openclaw/openclaw#89732), [#​90145](openclaw/openclaw#90145)) - Channels: WhatsApp restarts when per-account config changes, bounds background startup waits, closes failed sockets, and preserves reconnect behavior; Mattermost slash commands keep their state on `globalThis`; Feishu streaming cards preserve full merged content; voice-call tracks Twilio streams after connect; ClickClack reply tools respect `toolsAllow`. ([#​87951](openclaw/openclaw#87951), [#​87965](openclaw/openclaw#87965), [#​90486](openclaw/openclaw#90486), [#​68113](openclaw/openclaw#68113), [#​90534](openclaw/openclaw#90534), [#​90181](openclaw/openclaw#90181), [#​90607](openclaw/openclaw#90607), [#​89500](openclaw/openclaw#89500)) Thanks [@​MukundaKatta](https://github.com/MukundaKatta), [@​mcaxtr](https://github.com/mcaxtr), [@​infoanton](https://github.com/infoanton), [@​mushuiyu886](https://github.com/mushuiyu886), and [@​sahibzada-allahyar](https://github.com/sahibzada-allahyar). 
- Feishu: retry transient send rate-limit errors (HTTP 429, per-chat code 230020, tenant-level code 11232) with linear backoff, including SDK responses that fulfill with rate-limit bodies instead of throwing, and route streaming-card sends through the retry wrapper. ([#​89659](openclaw/openclaw#89659)) Thanks [@​ladygege](https://github.com/ladygege). - Release/CI/E2E: main CI guard drift, PR merge diff scoping, live Docker credential staging, base-image qualification, installer Docker classification, Playwright dependency install recovery, API-key auth for Codex live Docker lanes, Parallels option terminators, and JSON-mode progress handling are tighter so release proof fails cleaner. ([#​90532](openclaw/openclaw#90532), [#​90287](openclaw/openclaw#90287), [#​90058](openclaw/openclaw#90058)) Thanks [@​RomneyDa](https://github.com/RomneyDa), [@​hxy91819](https://github.com/hxy91819), and [@​mrunalp](https://github.com/mrunalp). - Release/CI/E2E: Docker E2E and live Docker harness runs now apply default memory, CPU, and process ceilings while preserving explicit per-lane overrides. - Release/CI/E2E: plugin lifecycle matrix resource sampling now fails phases that exceed RSS, wall-clock, or CPU ceilings instead of only logging the measurements. - Release/CI/E2E: Codex npm plugin live assertions now cap transcript discovery and diagnostic log reads so failure proof stays bounded. - Tests/state isolation: QA Lab valid-tool-call metrics now require runtime tool-call evidence when runtime parity data is available instead of counting tool-backed scenario pass status alone. - Tests/state isolation: QA Lab runtime parity now fails planned-only tool-call rows without matching tool results instead of treating matching mock plans as real tool evidence. 
- Tests/state isolation: provider, media, auth, cron, task, session, sandbox, Gateway, and Codex timeout fixtures now scope more home/state/env data per test, reducing cross-test leakage and making release validation failures less noisy. ([#​90027](openclaw/openclaw#90027), [#​89974](openclaw/openclaw#89974)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19--> Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/963
Summary
AI-assisted: yes.
Linked context
No public issue is linked in this PR body.
Real behavior proof (required for external PRs)
pnpm format -- src/gateway/node-connect-reconcile.ts src/gateway/node-connect-reconcile.test.ts src/gateway/server/ws-connection/message-handler.ts src/gateway/server.node-pairing-rate-limit.test.ts
node scripts/run-vitest.mjs src/gateway/node-connect-reconcile.test.ts src/gateway/server.node-pairing-rate-limit.test.ts src/gateway/server.preauth-bootstrap-token-rate-limit.test.ts src/gateway/auth-rate-limit.test.ts
node scripts/run-oxlint.mjs src/gateway/auth-rate-limit.ts src/gateway/node-connect-reconcile.ts src/gateway/node-connect-reconcile.test.ts src/gateway/server/ws-connection/message-handler.ts src/gateway/server.node-pairing-rate-limit.test.ts
git diff --check HEAD~1..HEAD
maxAttempts: 3, eight concurrent first-time node connects create three pending node-pairing requests and return five structured auth rate-limit responses; a paired node reconnecting with an upgraded capability declaration remains connected and does not create an extra pending request while the bucket is exhausted.
requestNodePairing directly without a node-pairing rate-limit scope.
Tests and validation
Commands run:
pnpm format -- src/gateway/auth-rate-limit.ts src/gateway/server/ws-connection/message-handler.ts src/gateway/server.node-pairing-rate-limit.test.ts
node scripts/run-vitest.mjs src/gateway/server.node-pairing-rate-limit.test.ts src/gateway/server.preauth-bootstrap-token-rate-limit.test.ts src/gateway/auth-rate-limit.test.ts
node scripts/run-oxlint.mjs src/gateway/auth-rate-limit.ts src/gateway/server/ws-connection/message-handler.ts src/gateway/server.node-pairing-rate-limit.test.ts
git diff --check
pnpm format -- src/gateway/node-connect-reconcile.ts src/gateway/node-connect-reconcile.test.ts src/gateway/server/ws-connection/message-handler.ts src/gateway/server.node-pairing-rate-limit.test.ts
node scripts/run-vitest.mjs src/gateway/node-connect-reconcile.test.ts src/gateway/server.node-pairing-rate-limit.test.ts src/gateway/server.preauth-bootstrap-token-rate-limit.test.ts src/gateway/auth-rate-limit.test.ts
node scripts/run-oxlint.mjs src/gateway/auth-rate-limit.ts src/gateway/node-connect-reconcile.ts src/gateway/node-connect-reconcile.test.ts src/gateway/server/ws-connection/message-handler.ts src/gateway/server.node-pairing-rate-limit.test.ts
git diff --check HEAD~1..HEAD
scripts/pr review-tests 90147 src/gateway/node-connect-reconcile.test.ts src/gateway/server.node-pairing-rate-limit.test.ts src/gateway/server.preauth-bootstrap-token-rate-limit.test.ts src/gateway/auth-rate-limit.test.ts
timeout 1800 .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
Regression coverage added:
src/gateway/server.node-pairing-rate-limit.test.ts covers concurrent first-time node-pairing attempts and paired reconnect upgrades through the real WebSocket connect path; src/gateway/node-connect-reconcile.test.ts covers preserving the approved effective surface when a paired upgrade request is throttled.
What failed before this fix: the node connect path had no node-pairing limiter before requestNodePairing entered the shared storage lock.
Risk checklist
Did user-visible behavior change? (Yes/No)
Yes. Repeated first-time node pairing attempts from the same rate-limit identity can now receive the existing structured rate-limit connect error when the configured bucket is exhausted. Paired reconnects keep their approved surface when only the fresh upgrade request is throttled.
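The throttling behavior described above can be pictured with a minimal fixed-window sketch. Everything here is an assumption for illustration: `FixedWindowLimiter`, `handleFirstTimeNodeConnect`, the in-memory `pendingRequests` array, and the 60-second window are hypothetical names and values, not the gateway's actual limiter or storage API. The sketch also runs the eight attempts sequentially, whereas the PR's test drives them concurrently through the WebSocket connect path.

```typescript
// Illustrative only: every name below is hypothetical, not the gateway's real API.
type AttemptResult =
  | { allowed: true }
  | { allowed: false; retryAfterMs: number };

class FixedWindowLimiter {
  private buckets = new Map<string, { count: number; windowStart: number }>();
  private maxAttempts: number;
  private windowMs: number;

  constructor(maxAttempts: number, windowMs: number) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Record one attempt for `key` and report whether it fits in the current window.
  attempt(key: string, now: number): AttemptResult {
    const bucket = this.buckets.get(key);
    if (!bucket || now - bucket.windowStart >= this.windowMs) {
      // No bucket yet, or the old window expired: start a fresh window.
      this.buckets.set(key, { count: 1, windowStart: now });
      return { allowed: true };
    }
    if (bucket.count < this.maxAttempts) {
      bucket.count += 1;
      return { allowed: true };
    }
    return {
      allowed: false,
      retryAfterMs: bucket.windowStart + this.windowMs - now,
    };
  }
}

// Stand-in for the shared node-pairing storage.
const pendingRequests: string[] = [];

function handleFirstTimeNodeConnect(
  limiter: FixedWindowLimiter,
  rateLimitKey: string,
  nodeId: string,
  now: number,
): "pending" | "rate_limited" {
  const result = limiter.attempt(rateLimitKey, now);
  if (!result.allowed) {
    // Rejected before any storage work is queued.
    return "rate_limited";
  }
  pendingRequests.push(nodeId);
  return "pending";
}

// Eight first-time connects from the same key with maxAttempts = 3.
const limiter = new FixedWindowLimiter(3, 60000);
const outcomes = Array.from({ length: 8 }, (_, i) =>
  handleFirstTimeNodeConnect(limiter, "203.0.113.7", `node-${i}`, 1000),
);
const pendingCount = outcomes.filter((o) => o === "pending").length;
const limitedCount = outcomes.filter((o) => o === "rate_limited").length;
console.log({ pendingCount, limitedCount });
```

The point of the ordering is that the limiter check happens before the storage write, so rejected attempts never add work to the shared store.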
Did config, environment, or migration behavior change? (Yes/No)
No. This reuses the existing gateway auth rate-limit configuration and defaults.
Did security, auth, secrets, network, or tool execution behavior change? (Yes/No)
Yes. This changes the gateway security/auth boundary for node-role pairing requests by adding per-IP throttling before node-pairing storage writes.
What is the highest-risk area?
Unusually low gateway.auth.rateLimit settings can throttle rapid first-time node onboarding from the same rate-limit identity.
How is that risk mitigated?
The bucket uses the existing gateway rate-limit configuration, loopback exemption behavior, browser-origin limiter selection, and serialized attempt helper already used by auth-sensitive connect paths. Existing paired reconnects keep their approved surface if an upgrade pairing request is throttled.
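The paired-reconnect guarantee in the mitigation above can be sketched as a small decision function. This is a hedged illustration only: `Surface`, `reconcileNodeConnect`, and the `caps` field are hypothetical and are not taken from src/gateway/node-connect-reconcile.ts. The assumption is that an already paired node stays connected on its approved surface, and only the fresh upgrade pairing request is skipped when the bucket is exhausted.

```typescript
// Illustrative only: types and names are hypothetical, not the gateway's real API.
type Surface = { caps: string[] };

type ConnectInput = {
  approved: Surface | null; // null means the node has never been paired
  declared: Surface; // what the node declares on this connect
  pairingThrottled: boolean; // true when the pairing bucket is exhausted
};

type ConnectOutcome =
  | { kind: "connected"; effective: Surface; upgradeRequested: boolean }
  | { kind: "pairing_requested" }
  | { kind: "rate_limited" };

function reconcileNodeConnect(input: ConnectInput): ConnectOutcome {
  const { approved, declared, pairingThrottled } = input;

  if (approved === null) {
    // First-time node: a throttled bucket rejects the connect outright.
    return pairingThrottled
      ? { kind: "rate_limited" }
      : { kind: "pairing_requested" };
  }

  const approvedCaps = approved.caps;
  const wantsUpgrade = declared.caps.some((cap) => !approvedCaps.includes(cap));

  // Paired node: always stay on the approved surface. An upgrade pairing
  // request is only created when one is needed and the bucket has room.
  return {
    kind: "connected",
    effective: approved,
    upgradeRequested: wantsUpgrade && !pairingThrottled,
  };
}

const throttledReconnect = reconcileNodeConnect({
  approved: { caps: ["camera"] },
  declared: { caps: ["camera", "screen"] },
  pairingThrottled: true,
});
console.log(throttledReconnect);
```

In this sketch the throttled upgrade reconnect still resolves to `connected` with the previously approved capabilities and no new pending request, which mirrors the behavior the regression tests are described as covering.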
Current review state
What is the next action?
Wait for GitHub Actions CI and fresh bot review on the latest head.
What is still waiting on author, maintainer, CI, or external proof?
GitHub Actions CI is pending on the latest pushed head; external live non-loopback proof was not run.
Which bot or reviewer comments were addressed?
Addressed ClawSweeper feedback to preserve paired reconnects on their approved surface when node-pairing request throttling trips.