fix(gateway): increase local loopback probe budget to prevent false negative timeouts#45984
fix(gateway): increase local loopback probe budget to prevent false negative timeouts#45984caicongyang wants to merge 1 commit intoopenclaw:mainfrom
Conversation
…egative timeouts The gateway probe was giving local loopback connections only 800ms to connect and authenticate, but the gateway client has a default connectChallengeTimeoutMs of 4000ms. This caused false negative timeouts on Windows where the WebSocket handshake can take longer. Changed the local loopback probe budget from 'Math.min(800, overallMs)' to 'Math.max(4000, Math.min(overallMs, 8000))' to ensure enough time for the client's internal authentication to complete. Fixes: openclaw#45940
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 455329d8e4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // too short and caused false negative timeouts on Windows. | ||
| if (kind === "localLoopback") { | ||
| return Math.min(800, overallMs); | ||
| return Math.max(4000, Math.min(overallMs, 8000)); |
There was a problem hiding this comment.
Keep local probe within requested overall timeout
This change makes localLoopback probes ignore small --timeout values by forcing at least 4000ms (Math.max(4000, ...)), so openclaw gateway probe --timeout 1000 can still wait ~4s when localhost is unreachable or slow. Since the CLI advertises this flag as an "Overall probe budget in ms" (src/cli/gateway-cli/register.ts), this is a behavioral regression that can break automation expecting the command to honor the requested upper bound.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR increases the local loopback probe budget from Changes:
Issues found:
Confidence Score: 2/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/commands/gateway-status/helpers.ts
Line: 124
Comment:
**Probe budget can exceed `overallMs`, ignoring the command timeout**
The new formula `Math.max(4000, Math.min(overallMs, 8000))` allows the probe budget to **exceed** the `overallMs` argument whenever `overallMs < 4000`. The default `overallTimeoutMs` in `gateway-status.ts` is **3000ms** (line 42: `parseTimeoutMs(opts.timeout, 3000)`), so in the typical invocation this returns 4000ms — a probe budget 33% larger than the overall command timeout.
Since all targets are probed in parallel via `Promise.all` (line 155 of `gateway-status.ts`), this means:
- The command will routinely take ~4s to complete instead of the expected ~3s default.
- Users who explicitly pass `--timeout=2000` (or any value < 4000) will see their setting silently overridden and have to wait at least 4 seconds.
The old formula `Math.min(800, overallMs)` always respected `overallMs`. The fix should keep that invariant. One approach is to also raise the default `overallTimeoutMs` alongside the probe budget increase — e.g., change the default from 3000 to 5000ms in `gateway-status.ts` — so there is headroom for the 4000ms probe without violating the overall budget:
```typescript
// In gateway-status.ts, line 42:
const overallTimeoutMs = parseTimeoutMs(opts.timeout, 5000);
// In helpers.ts, keep the probe bounded by overall:
return Math.min(Math.max(4000, overallMs), 8000);
```
Alternatively, clamp the probe budget to `overallMs` while preserving the intent:
```suggestion
return Math.min(Math.max(4000, overallMs), 8000);
```
(This still gives 4000ms when `overallMs >= 4000`, and for `overallMs < 4000` it returns exactly `overallMs` — which preserves the user timeout while the root issue is better addressed by raising the default overall timeout.)
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 455329d |
| // too short and caused false negative timeouts on Windows. | ||
| if (kind === "localLoopback") { | ||
| return Math.min(800, overallMs); | ||
| return Math.max(4000, Math.min(overallMs, 8000)); |
There was a problem hiding this comment.
Probe budget can exceed overallMs, ignoring the command timeout
The new formula Math.max(4000, Math.min(overallMs, 8000)) allows the probe budget to exceed the overallMs argument whenever overallMs < 4000. The default overallTimeoutMs in gateway-status.ts is 3000ms (line 42: parseTimeoutMs(opts.timeout, 3000)), so in the typical invocation this returns 4000ms — a probe budget 33% larger than the overall command timeout.
Since all targets are probed in parallel via Promise.all (line 155 of gateway-status.ts), this means:
- The command will routinely take ~4s to complete instead of the expected ~3s default.
- Users who explicitly pass
--timeout=2000(or any value < 4000) will see their setting silently overridden and have to wait at least 4 seconds.
The old formula Math.min(800, overallMs) always respected overallMs. The fix should keep that invariant. One approach is to also raise the default overallTimeoutMs alongside the probe budget increase — e.g., change the default from 3000 to 5000ms in gateway-status.ts — so there is headroom for the 4000ms probe without violating the overall budget:
// In gateway-status.ts, line 42:
const overallTimeoutMs = parseTimeoutMs(opts.timeout, 5000);
// In helpers.ts, keep the probe bounded by overall:
return Math.min(Math.max(4000, overallMs), 8000);Alternatively, clamp the probe budget to overallMs while preserving the intent:
| return Math.max(4000, Math.min(overallMs, 8000)); | |
| return Math.min(Math.max(4000, overallMs), 8000); |
(This still gives 4000ms when overallMs >= 4000, and for overallMs < 4000 it returns exactly overallMs — which preserves the user timeout while the root issue is better addressed by raising the default overall timeout.)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/gateway-status/helpers.ts
Line: 124
Comment:
**Probe budget can exceed `overallMs`, ignoring the command timeout**
The new formula `Math.max(4000, Math.min(overallMs, 8000))` allows the probe budget to **exceed** the `overallMs` argument whenever `overallMs < 4000`. The default `overallTimeoutMs` in `gateway-status.ts` is **3000ms** (line 42: `parseTimeoutMs(opts.timeout, 3000)`), so in the typical invocation this returns 4000ms — a probe budget 33% larger than the overall command timeout.
Since all targets are probed in parallel via `Promise.all` (line 155 of `gateway-status.ts`), this means:
- The command will routinely take ~4s to complete instead of the expected ~3s default.
- Users who explicitly pass `--timeout=2000` (or any value < 4000) will see their setting silently overridden and have to wait at least 4 seconds.
The old formula `Math.min(800, overallMs)` always respected `overallMs`. The fix should keep that invariant. One approach is to also raise the default `overallTimeoutMs` alongside the probe budget increase — e.g., change the default from 3000 to 5000ms in `gateway-status.ts` — so there is headroom for the 4000ms probe without violating the overall budget:
```typescript
// In gateway-status.ts, line 42:
const overallTimeoutMs = parseTimeoutMs(opts.timeout, 5000);
// In helpers.ts, keep the probe bounded by overall:
return Math.min(Math.max(4000, overallMs), 8000);
```
Alternatively, clamp the probe budget to `overallMs` while preserving the intent:
```suggestion
return Math.min(Math.max(4000, overallMs), 8000);
```
(This still gives 4000ms when `overallMs >= 4000`, and for `overallMs < 4000` it returns exactly `overallMs` — which preserves the user timeout while the root issue is better addressed by raising the default overall timeout.)
How can I resolve this? If you propose a fix, please make it concise.|
Closing this as already fixed/superseded by #47533. Current Maintainer verification on current
|
Description
The gateway probe was giving local loopback connections only 800ms to connect and authenticate, but the gateway client has a default
connectChallengeTimeoutMsof 4000ms. This caused false negative timeouts on Windows where the WebSocket handshake can take longer than expected.Changes
Changed the local loopback probe budget from
Math.min(800, overallMs)toMath.max(4000, Math.min(overallMs, 8000))to ensure enough time for the client's internal authentication to complete.Related Issue
Fixes: #45940
Testing
This fix ensures that local loopback probe connections have at least 4 seconds to complete the WebSocket handshake, matching the gateway client's internal timeout.