fix(onboard): widen inference verification timeouts on WSL2 (Fixes #987)#1998

Merged
ericksoa merged 2 commits into main from fix/987-wsl2-inference-verification
Apr 17, 2026

@ericksoa ericksoa commented Apr 17, 2026

Contributor

Summary

  • WSL2 timeout fix: Widens getValidationProbeCurlArgs() from 10s/15s to 20s/30s when isWsl() detects WSL2, preventing false "failed to connect" errors caused by slower DNS resolution and TLS handshakes through the virtualized network stack.
  • Single retry with backoff: When all probes fail with curl exit code 28 (operation timed out), 6 (could not resolve host), or 7 (failed to connect), retries once with doubled timeouts against /chat/completions before reporting failure.
  • Actionable WSL2 error message: After retry exhaustion on WSL2, appends a hint suggesting --skip-verify so users aren't stuck on a false negative.
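A minimal sketch of the platform-aware timeout selection described in the first bullet. The `ProbeCurlOpts` interface is a hypothetical name; only the `isWsl` option and the returned values come from this PR, and the real implementation in src/lib/onboard.ts may be structured differently.

```typescript
// Hypothetical option shape: the PR only shows an `isWsl` key being passed.
interface ProbeCurlOpts {
  isWsl?: boolean;
}

// Returns curl timeout flags: 20s connect / 30s total on WSL2,
// 10s connect / 15s total everywhere else.
export function getValidationProbeCurlArgs(opts: ProbeCurlOpts = {}): string[] {
  const connectTimeout = opts.isWsl ? 20 : 10;
  const maxTime = opts.isWsl ? 30 : 15;
  return [
    "--connect-timeout", String(connectTimeout),
    "--max-time", String(maxTime),
  ];
}
```

Passing the platform flag in as an option, rather than calling isWsl() inside the function, keeps the function pure and lets a test exercise both branches on any host.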

Closes #987

Test plan

  • getValidationProbeCurlArgs({ isWsl: true }) returns ["--connect-timeout", "20", "--max-time", "30"]
  • getValidationProbeCurlArgs({ isWsl: false }) returns ["--connect-timeout", "10", "--max-time", "15"]
  • Retry guard matches exactly curl exit codes 28, 6, 7 (not 0 or 22)
  • Retry doubles timeout values
  • WSL2 hint with --skip-verify present in failure path
  • All existing tests pass (1387 tests, 0 regressions)
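The "retry doubles timeout values" item can be illustrated with a small pure function. This is a sketch under assumptions: `doubleTimeoutArgs` is a hypothetical helper name, and the PR does not say whether the real code doubles the values this way or rebuilds the argument list from scratch.

```typescript
// Hypothetical helper: doubles the numeric value that follows each
// curl timeout flag and leaves every other argument untouched.
function doubleTimeoutArgs(args: string[]): string[] {
  const timeoutFlags = new Set(["--connect-timeout", "--max-time"]);
  return args.map((arg, i) =>
    i > 0 && timeoutFlags.has(args[i - 1] ?? "")
      ? String(Number(arg) * 2)
      : arg,
  );
}

// Non-WSL defaults (10s/15s) become 20s/30s on the retry;
// WSL2 defaults (20s/30s) become 40s/60s.
const retryArgs = doubleTimeoutArgs(["--connect-timeout", "10", "--max-time", "15"]);
```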

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Improvements

    • Optimized API endpoint validation with platform-aware timeout values, providing enhanced support for Windows Subsystem for Linux (WSL) environments
    • Implemented automatic retry mechanism for Chat Completions endpoint checks to improve reliability
    • Enhanced error reporting with environment-specific diagnostic information
  • Tests

    • Added comprehensive test coverage for endpoint validation in WSL environments

WSL2's virtualized network stack can cause DNS resolution and TCP
connections to take significantly longer, making the 10s connect timeout
expire before TLS completes. This widens probe timeouts to 20s/30s on
WSL2, adds a single retry with doubled timeouts on curl exit codes 28
(timeout), 6, and 7 (connection failure), and surfaces an actionable
hint suggesting --skip-verify when the retry also fails.
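The overall flow in this commit message (probe, retry once with doubled timeouts on exit codes 28, 6, or 7, then append the WSL2 hint) might look roughly like the sketch below. `ProbeResult`, `Probe`, `probeWithSingleRetry`, and the message wording are all hypothetical; the probe is injected as a function so the sketch runs without invoking curl.

```typescript
// Hypothetical shapes for illustration; the real probe result type in
// src/lib/onboard.ts is not shown in this PR.
interface ProbeResult {
  curlStatus: number; // curl's process exit code, 0 on success
}
type Probe = (timeoutArgs: string[]) => ProbeResult;

const RETRYABLE_STATUSES = new Set([28, 6, 7]);

function probeWithSingleRetry(
  probe: Probe,
  connectTimeout: number,
  maxTime: number,
  isWsl: boolean,
): { ok: boolean; attempts: number; message: string } {
  const toArgs = (c: number, m: number): string[] => [
    "--connect-timeout", String(c), "--max-time", String(m),
  ];

  let attempts = 1;
  let result = probe(toArgs(connectTimeout, maxTime));

  // Retry exactly once, with doubled timeouts, and only for transient failures.
  if (RETRYABLE_STATUSES.has(result.curlStatus)) {
    attempts = 2;
    result = probe(toArgs(connectTimeout * 2, maxTime * 2));
  }

  if (result.curlStatus === 0) {
    return { ok: true, attempts, message: "" };
  }

  let message = `Endpoint verification failed (curl exit code ${result.curlStatus}).`;
  // The hint is appended only when the retry has also failed on WSL2.
  if (isWsl && attempts === 2) {
    message += " On WSL2 this can be a false negative; consider re-running with --skip-verify.";
  }
  return { ok: false, attempts, message };
}
```

A non-retryable status such as 22 (an HTTP error reported by curl) skips the retry entirely, so a genuine endpoint error is reported without the extra wait.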

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 17, 2026

Contributor
Walkthrough

This PR addresses a WSL2 endpoint verification regression by introducing configurable curl timeouts and automatic retry logic. The getValidationProbeCurlArgs function now accepts options to differentiate timeout values between WSL (20s/30s) and non-WSL (10s/15s) environments, while probeOpenAiLikeEndpoint implements a single retry attempt with doubled timeouts for transient connection failures.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Timeout Configuration & Retry Logic (src/lib/onboard.ts) | Modified getValidationProbeCurlArgs to accept opts parameter and return environment-specific timeout values; extended probeOpenAiLikeEndpoint with single-attempt retry logic for specific curl status codes (28, 6, 7) using doubled timeout arguments; added WSL2-specific error hint in failure messages; exported getValidationProbeCurlArgs. |
| WSL2 Verification Tests (test/wsl2-probe-timeout.test.ts) | New test suite validating timeout argument generation for WSL and non-WSL environments, verifying retry guard conditions via compiled output inspection, and confirming WSL2-specific error messaging and flag behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A hop through timeouts, WSL's plight,
Retries bloom when connections take flight,
Twenty seconds now, for Windows folk true,
NVIDIA's embrace shines bright anew!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: widening inference verification timeouts on WSL2 and referencing the resolved issue #987. |
| Linked Issues check | ✅ Passed | The PR successfully implements all coding objectives from issue #987: WSL2 timeout detection, wider timeout values (20s/30s vs 10s/15s), single retry with doubled timeouts for specific curl exit codes, WSL2-specific error hints, and comprehensive test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to addressing issue #987: timeout handling in getValidationProbeCurlArgs, retry logic in probeOpenAiLikeEndpoint, WSL2 detection and hints, and corresponding tests. No unrelated or out-of-scope modifications detected. |


@coderabbitai coderabbitai Bot left a comment

Contributor


🧹 Nitpick comments (1)
test/wsl2-probe-timeout.test.ts (1)

65-68: Source-scanning regex is fragile.

This regex requires exact formatting of the guard function. Minification, code formatting changes, or even variable renaming would break this test without any actual behavioral change.

Consider adding a comment acknowledging this fragility, or extract the guard logic into a separate, exported helper that can be tested directly (similar to how getValidationProbeCurlArgs was exported).
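A sketch of the second option, with the guard extracted into an exported helper that a test can call directly instead of scanning source text. The name isTimeoutOrConnFailure comes from this review; its signature and the `RETRYABLE_CURL_STATUSES` constant are assumptions, not the code in src/lib/onboard.ts.

```typescript
// curl exit codes treated as transient:
// 28 = operation timed out, 6 = could not resolve host, 7 = failed to connect.
const RETRYABLE_CURL_STATUSES: ReadonlySet<number> = new Set([28, 6, 7]);

// Assumed signature: takes curl's exit code, returns whether a retry is warranted.
export function isTimeoutOrConnFailure(curlStatus: number): boolean {
  return RETRYABLE_CURL_STATUSES.has(curlStatus);
}

// Table-driven behavioral check: it keeps passing through reformatting,
// renaming of internals, or minification, unlike a regex over the source.
const guardCases: Array<[number, boolean]> = [
  [28, true],
  [6, true],
  [7, true],
  [0, false],  // success must not retry
  [22, false], // HTTP error reported by curl must not retry
];
for (const [status, expected] of guardCases) {
  if (isTimeoutOrConnFailure(status) !== expected) {
    throw new Error(`unexpected guard result for curl exit code ${status}`);
  }
}
```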

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/wsl2-probe-timeout.test.ts` around lines 65 - 68, The test is brittle
because it checks the exact source formatting of the guard using a fragile
regex; update the code by extracting the guard logic into a named, exported
helper (e.g., export a function isTimeoutOrConnFailure) and update the test to
import and assert that helper directly (similar to getValidationProbeCurlArgs),
or at minimum add an inline comment in test/wsl2-probe-timeout.test.ts
explaining the regex fragility and why it’s used; locate references to
isTimeoutOrConnFailure and onboardSrc to implement the extraction and update the
test to call the exported helper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 17a5e2c7-ff4d-44a0-a8ea-8751150c563c

📥 Commits

Reviewing files that changed from the base of the PR and between c133bab and 805a24d.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/wsl2-probe-timeout.test.ts

@ericksoa ericksoa merged commit 56ee83f into main Apr 17, 2026
14 checks passed
@wscurran wscurran added the "bug-fix" label (PR fixes a bug or regression) Jun 8, 2026

Development

Successfully merging this pull request may close these issues.

[Window x86 WSL2][Regression] NemoClaw/OpenShell inference verification reports false connectivity failure for NVIDIA Endpoints during onboarding

3 participants