fix(onboard): increase verification timeout and reduce max_tokens for custom provider probes by Sid-Qin · Pull Request #27380 · openclaw/openclaw

Sid-Qin · 2026-02-26T09:27:49Z

Summary

Problem: Onboard wizard fails to verify local llama.cpp servers (Custom Provider) — the verification request times out after 10 seconds because the server needs to load a large model and generate up to 1024 tokens.
Why it matters: Users with local LLM servers (llama.cpp, vLLM, etc.) cannot complete the onboard wizard, even though their API is fully functional.
What changed: src/commands/onboard-custom.ts — raised VERIFY_TIMEOUT_MS from 10s to 30s, reduced max_tokens from 1024 to 1 (verification only needs a single token), added explicit stream: false to both OpenAI and Anthropic probes.
What did NOT change: The verification flow logic, error handling, retry prompts, or any other onboard wizard behavior.

Linked Issue/PR

Closes [Bug] Onboard wizard fails to verify local llama.cpp server (Custom Provider) — but API works fine from container #27346

User-visible / Behavior Changes

Custom provider verification during onboard now tolerates slower local LLM servers (up to 30s instead of 10s).
Verification requests generate only 1 token instead of 1024, making the probe much faster.
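As a rough illustration of the lean probe described above, the request for each provider type might be assembled as follows. This is a sketch, not the code in `src/commands/onboard-custom.ts`: the helper name `buildProbeRequest`, the `"Hi"` prompt, the URL joining, and the header set are assumptions. Only `max_tokens: 1` and `stream: false` come from this PR.

```typescript
// Hypothetical helper: builds a one-token, non-streaming verification probe.
type ProviderKind = "openai" | "anthropic";

interface ProbeRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    max_tokens: number;
    stream: boolean;
    messages: { role: "user"; content: string }[];
  };
}

function buildProbeRequest(
  kind: ProviderKind,
  baseUrl: string,
  modelId: string,
  apiKey: string,
): ProbeRequest {
  // Drop trailing slashes so the joined path has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  const body = {
    model: modelId,
    max_tokens: 1, // a single token is enough to prove the endpoint and model ID work
    stream: false, // ask for one plain JSON response instead of an event stream
    messages: [{ role: "user" as const, content: "Hi" }], // assumed prompt
  };
  if (kind === "anthropic") {
    return {
      url: `${root}/messages`, // assumes the base URL already ends in /v1
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
      },
      body,
    };
  }
  return {
    url: `${root}/chat/completions`,
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${apiKey}`,
    },
    body,
  };
}
```

Both probes share the same body shape here, so the only per-provider differences in this sketch are the endpoint path and the authentication headers.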

Security Impact (required)

New permissions/capabilities? `No`
Secrets/tokens handling changed? `No`
New/changed network calls? `No` (same endpoints, just different timeout and body params)
Command/tool execution surface changed? `No`
Data access scope changed? `No`

Repro + Verification

Environment

OS: Ubuntu 24.04 (Docker)
Runtime: Node 22+
Integration/channel: CLI onboard wizard

Steps

Run llama-server with a large model (e.g. Qwen3.5-27B) on host
Run `openclaw` onboard wizard in Docker
Select Custom Provider → Base URL: `http://172.17.0.1:8080/v1` → Model ID: `Qwen3.5-27B-UD-Q8_K_XL.gguf`

Expected

Verification succeeds (model responds with 1 token within 30s)

Actual

Before fix: "Verification failed: This operation was aborted" after 10s
After fix: Verification succeeds because max_tokens=1 completes quickly and timeout is 30s

Evidence

Root cause:

`VERIFY_TIMEOUT_MS` was 10s — too short for large local models that need to load into GPU memory
`max_tokens: 1024` forced the server to generate a full response, adding unnecessary latency
The verification only needs to confirm the API is reachable and the model ID is valid — 1 token suffices

// Before
const VERIFY_TIMEOUT_MS = 10000;
// body: { max_tokens: 1024 }

// After
const VERIFY_TIMEOUT_MS = 30_000;
// body: { max_tokens: 1, stream: false }

All 14 tests pass after updating test expectations.
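To make the interaction between the timeout and the probe body concrete, here is a minimal sketch of a timeout-guarded verification call. It is not the actual implementation: the function name `verifyCustomProvider`, the `FetchLike` type, the result shape, and the `"Hi"` prompt are hypothetical, and only `VERIFY_TIMEOUT_MS = 30_000`, `max_tokens: 1`, and `stream: false` are taken from this PR. The fetch function is injected so the timeout path can be exercised with a stub.

```typescript
const VERIFY_TIMEOUT_MS = 30_000; // value from this PR

// Minimal slice of the fetch API used below, so tests can pass a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<{ ok: boolean; status: number }>;

type VerifyResult = { ok: true } | { ok: false; reason: string };

async function verifyCustomProvider(
  fetchImpl: FetchLike,
  baseUrl: string,
  modelId: string,
  timeoutMs: number = VERIFY_TIMEOUT_MS,
): Promise<VerifyResult> {
  const controller = new AbortController();
  // Abort the in-flight request once the deadline passes.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl.replace(/\/+$/, "")}/chat/completions`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        model: modelId,
        max_tokens: 1,
        stream: false,
        messages: [{ role: "user", content: "Hi" }], // assumed prompt
      }),
      signal: controller.signal,
    });
    return res.ok ? { ok: true } : { ok: false, reason: `HTTP ${res.status}` };
  } catch (err) {
    // An aborted request rejects; surface its message as the failure reason.
    return { ok: false, reason: err instanceof Error ? err.message : String(err) };
  } finally {
    clearTimeout(timer); // avoid a stray timer keeping the process alive
  }
}
```

In real use the global `fetch` could be passed as `fetchImpl`, for example `verifyCustomProvider(fetch, "http://172.17.0.1:8080/v1", "Qwen3.5-27B-UD-Q8_K_XL.gguf")`. When the deadline fires, the rejection message from the aborted request is presumably what appeared as "This operation was aborted" in the failure reported above.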

Human Verification (required)

Verified scenarios: All 14 unit tests pass, including timeout test updated to 30s
Edge cases checked: OpenAI probe, Anthropic probe, timeout abort, retry flow
What I did not verify: Live testing with actual llama.cpp server (no GPU available)

Compatibility / Migration

Backward compatible? `Yes`
Config/env changes? `No`
Migration needed? `No`

Failure Recovery (if this breaks)

How to disable/revert: Revert changes in `onboard-custom.ts`
Files/config to restore: `src/commands/onboard-custom.ts`, `src/commands/onboard-custom.test.ts`
Known bad symptoms: If reverted, verification of large local models will time out again after 10s

Risks and Mitigations

Risk: 30s timeout means users may wait up to 30s instead of 10s when the server never responds. Partly offset by max_tokens=1, which makes successful probes much faster than before.

greptile-apps · 2026-02-26T09:30:02Z

Greptile Summary

Fixed custom provider verification timeouts for local LLM servers by increasing timeout from 10s to 30s and reducing verification token generation from 1024 to 1.

Increased VERIFY_TIMEOUT_MS from 10s to 30s to accommodate large model loading times on local llama.cpp/vLLM servers
Reduced max_tokens from 1024 to 1 in verification probes — since verification only needs to confirm API reachability and valid model ID, generating a single token is sufficient and much faster
Added explicit stream: false to both OpenAI and Anthropic verification requests for clarity
Updated all test expectations to match new timeout and token values

The changes are minimal, well-targeted, and properly address the reported issue where users with functional local LLM servers couldn't complete onboarding due to premature timeout. The trade-off (users wait longer for truly unreachable servers) is acknowledged and reasonable.

Confidence Score: 5/5

This PR is safe to merge with minimal risk
Changes are narrowly scoped to verification timeout/token configuration with proper test coverage. No breaking changes, security issues, or logical errors. The timeout increase and token reduction are well-justified optimizations that directly address the reported issue.
No files require special attention

Last reviewed commit: 83b6ef0

… custom provider probes

The onboard wizard sends a chat-completion request to verify custom providers. With max_tokens: 1024 and a 10 s timeout, large local models (e.g. Qwen3.5-27B on llama.cpp) routinely time out because the server needs to load the model and generate up to 1024 tokens before responding.

Changes:
- Raise VERIFY_TIMEOUT_MS from 10 s to 30 s
- Lower max_tokens from 1024 to 1 (verification only needs a single token to confirm the API is reachable and the model ID is valid)
- Add explicit stream: false to both OpenAI and Anthropic probes

Closes openclaw#27346

Made-with: Cursor

philipp-spiess · 2026-02-27T21:51:52Z

Thanks everyone for the iteration here. This PR now consolidates the onboarding verification reliability fix for custom providers.

What’s included:

onboarding verification timeout increase in the custom-provider verification flow
lean verification probe behavior for custom-provider checks
updated tests covering verification behavior
changelog note for this user-facing fix

Validation:

CI is green on required checks and remaining workflows are in progress
local smoke passed against Ollama (nate/instinct:latest)

If this looks good, this should be the one we merge. Thank you all for the prior PRs and review cycles.

philipp-spiess · 2026-02-27T21:52:57Z

Merged, thank you everyone.

Post-merge notes:

Local smoke validation with Ollama passed (nate/instinct:latest).
Superseded PRs were closed with thanks: #29043 ("fix: increase onboard verify timeout to 120s for local LLMs"), #29062 ("fix: increase VERIFY_TIMEOUT_MS to 2 minutes for local LLMs"), and #29102 ("fix: increase VERIFY_TIMEOUT_MS from 10s to 120s for local LLMs").
Follow-up tracking issue created for endpoint-aware timeout policy and host coverage: #29155 ("Onboarding custom-provider verification: endpoint-aware timeout policy and host coverage").
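One possible reading of "endpoint-aware timeout policy" is to pick the verification timeout from the host in the base URL. The sketch below is purely illustrative and does not describe what the tracking issue settled on: the function names and the local/remote split are assumptions, with 30s taken from this PR and 120s borrowed from the superseded PR titles.

```typescript
const REMOTE_TIMEOUT_MS = 30_000; // value merged in this PR
const LOCAL_TIMEOUT_MS = 120_000; // assumption: value proposed in the superseded PRs

// Hypothetical check: loopback names plus private and loopback IPv4 ranges.
function isLocalOrPrivateHost(hostname: string): boolean {
  if (hostname === "localhost" || hostname === "[::1]") return true;
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(hostname);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 || // loopback
    a === 10 || // 10.0.0.0/8
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 172 && b >= 16 && b <= 31) // 172.16.0.0/12, includes the Docker bridge 172.17.0.1
  );
}

// Hypothetical policy: allow more time for hosts that likely load a model on demand.
function pickVerifyTimeoutMs(baseUrl: string): number {
  const { hostname } = new URL(baseUrl);
  return isLocalOrPrivateHost(hostname) ? LOCAL_TIMEOUT_MS : REMOTE_TIMEOUT_MS;
}
```

Under these assumptions, the Docker bridge address from the repro steps (`http://172.17.0.1:8080/v1`) would get the longer timeout, while a hosted API would keep the 30s limit.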

Appreciate all the iterations and review help here.

… custom provider probes (openclaw#27380)

* fix(onboard): increase verification timeout and reduce max_tokens for custom provider probes

The onboard wizard sends a chat-completion request to verify custom providers. With max_tokens: 1024 and a 10 s timeout, large local models (e.g. Qwen3.5-27B on llama.cpp) routinely time out because the server needs to load the model and generate up to 1024 tokens before responding.

Changes:
- Raise VERIFY_TIMEOUT_MS from 10 s to 30 s
- Lower max_tokens from 1024 to 1 (verification only needs a single token to confirm the API is reachable and the model ID is valid)
- Add explicit stream: false to both OpenAI and Anthropic probes

Closes openclaw#27346

Made-with: Cursor

* Changelog: note custom-provider onboarding verification fix

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>

… custom provider probes (openclaw#27380) (cherry picked from commit c6cd434)

openclaw-barnacle bot added the commands (Command implementations), size: XS, and experienced-contributor labels on Feb 26, 2026

philipp-spiess force-pushed the fix/27346-onboard-verify-timeout branch from 22beb88 to 967d51f on February 27, 2026, 21:16

openclaw-barnacle bot added the agents (Agent runtime and tooling) label on Feb 27, 2026

philipp-spiess force-pushed the fix/27346-onboard-verify-timeout branch from df90a05 to 967d51f on February 27, 2026, 21:26

openclaw-barnacle bot removed the agents (Agent runtime and tooling) label on Feb 27, 2026

philipp-spiess force-pushed the fix/27346-onboard-verify-timeout branch from 967d51f to 606f9d8 on February 27, 2026, 21:31

Changelog: note custom-provider onboarding verification fix

61365f9

philipp-spiess merged commit ee2eadd into openclaw:main Feb 27, 2026
21 checks passed

github-actions bot mentioned this pull request Feb 27, 2026

📡 Upstream Digest — 2026-02-27 22:16 UTC curtismercier/openclaw-mods#139

Open

gemini-code-assist bot mentioned this pull request Mar 1, 2026

chore(sync): replay rollup fixes on upstream/main MillionthOdin16/openclaw#111

Merged

alexey-pelykh mentioned this pull request Mar 10, 2026

Cherry-pick: Config, onboarding, and provider setup remoteclaw/remoteclaw#683

Open

philipp-spiess merged 2 commits into openclaw:main from Sid-Qin:fix/27346-onboard-verify-timeout
