fix(onboard): increase verification timeout and reduce max_tokens for custom provider probes#27380

Merged
philipp-spiess merged 2 commits into openclaw:main from
Sid-Qin:fix/27346-onboard-verify-timeout
Feb 27, 2026

Conversation

@Sid-Qin
Contributor

@Sid-Qin Sid-Qin commented Feb 26, 2026

Summary

  • Problem: Onboard wizard fails to verify local llama.cpp servers (Custom Provider) — the verification request times out after 10 seconds because the server needs to load a large model and generate up to 1024 tokens.
  • Why it matters: Users with local LLM servers (llama.cpp, vLLM, etc.) cannot complete the onboard wizard, even though their API is fully functional.
  • What changed: src/commands/onboard-custom.ts — raised VERIFY_TIMEOUT_MS from 10s to 30s, reduced max_tokens from 1024 to 1 (verification only needs a single token), added explicit stream: false to both OpenAI and Anthropic probes.
  • What did NOT change: The verification flow logic, error handling, retry prompts, or any other onboard wizard behavior.
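
A minimal sketch of the probe request body described above, assuming both probe styles accept the same `model`, `messages`, `max_tokens`, and `stream` fields. `buildProbeBody`, `ProbeBody`, and the `"Hi"` prompt are hypothetical names used for illustration; only `VERIFY_TIMEOUT_MS`, `max_tokens: 1`, and `stream: false` come from this PR.

```typescript
// Value taken from this PR (previously 10 s).
const VERIFY_TIMEOUT_MS = 30_000;

// Hypothetical shape of the verification request body.
interface ProbeBody {
  model: string;
  messages: { role: "user"; content: string }[];
  max_tokens: number;
  stream: boolean;
}

// Build the smallest request that still shows the endpoint is reachable
// and the model ID is accepted: one output token, no streaming.
function buildProbeBody(modelId: string): ProbeBody {
  return {
    model: modelId,
    messages: [{ role: "user", content: "Hi" }], // placeholder prompt (assumption)
    max_tokens: 1,
    stream: false,
  };
}

const exampleBody = buildProbeBody("Qwen3.5-27B-UD-Q8_K_XL.gguf");
console.log(JSON.stringify(exampleBody), `timeout=${VERIFY_TIMEOUT_MS}ms`);
```

Keeping the body in one small builder makes it easy for tests to assert on `max_tokens` and `stream` without touching the network.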

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes openclaw#27346

User-visible / Behavior Changes

  • Custom provider verification during onboard now tolerates slower local LLM servers (up to 30s instead of 10s).
  • Verification requests generate only 1 token instead of 1024, making the probe much faster.

Security Impact (required)

  • New permissions/capabilities? `No`
  • Secrets/tokens handling changed? `No`
  • New/changed network calls? `No` (same endpoints, just different timeout and body params)
  • Command/tool execution surface changed? `No`
  • Data access scope changed? `No`

Repro + Verification

Environment

  • OS: Ubuntu 24.04 (Docker)
  • Runtime: Node 22+
  • Integration/channel: CLI onboard wizard

Steps

  1. Run llama-server with a large model (e.g. Qwen3.5-27B) on host
  2. Run `openclaw` onboard wizard in Docker
  3. Select Custom Provider → Base URL: `http://172.17.0.1:8080/v1` → Model ID: `Qwen3.5-27B-UD-Q8_K_XL.gguf`

Expected

  • Verification succeeds (model responds with 1 token within 30s)

Actual

  • Before fix: "Verification failed: This operation was aborted" after 10s
  • After fix: Verification succeeds because max_tokens=1 completes quickly and timeout is 30s
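
The "This operation was aborted" message is consistent with a timer aborting the in-flight request. The sketch below shows that general pattern with `AbortController`; it is not the code in `onboard-custom.ts`. `withTimeout` and `fakeServer` are hypothetical helpers, and the fake server simply stands in for a slow model endpoint.

```typescript
// Run `work` and abort its signal if it has not settled within `timeoutMs`.
async function withTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await work(controller.signal);
  } finally {
    clearTimeout(timer); // always stop the timer, on success or failure
  }
}

// Stand-in for a slow server: resolves after `delayMs` unless aborted first.
function fakeServer(delayMs: number, signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("ok"), delayMs);
    signal.addEventListener("abort", () => {
      clearTimeout(timer);
      reject(new Error("This operation was aborted"));
    });
  });
}

// A response that arrives inside the budget succeeds; a slower one is aborted.
withTimeout((signal) => fakeServer(5, signal), 50).then((r) => console.log("fast:", r));
withTimeout((signal) => fakeServer(200, signal), 20).catch((e: Error) =>
  console.log("slow:", e.message),
);
```

With this pattern the timeout covers the whole request, so model load time and generation time both count against the same budget.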

Evidence

Root cause:

  • `VERIFY_TIMEOUT_MS` was 10s — too short for large local models that need to load into GPU memory
  • `max_tokens: 1024` forced the server to generate a full response, adding unnecessary latency
  • The verification only needs to confirm the API is reachable and the model ID is valid — 1 token suffices

// Before
const VERIFY_TIMEOUT_MS = 10000;
// body: { max_tokens: 1024 }

// After
const VERIFY_TIMEOUT_MS = 30_000;
// body: { max_tokens: 1, stream: false }

All 14 tests pass after updating test expectations.
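
To make the root cause concrete, here is a back-of-the-envelope estimate of probe latency. The load time (8 s) and generation speed (20 tokens/s) are illustrative assumptions, not measurements from this PR, and `estimateProbeSeconds` and `fitsTimeout` are hypothetical helpers.

```typescript
// Hypothetical inputs for a rough latency estimate.
interface ProbeEstimate {
  loadSeconds: number; // time to load the model before the first token (assumed)
  tokensPerSecond: number; // generation speed (assumed)
  maxTokens: number; // worst case: the model generates every allowed token
}

// Worst-case latency = model load time + time to generate maxTokens.
function estimateProbeSeconds(p: ProbeEstimate): number {
  return p.loadSeconds + p.maxTokens / p.tokensPerSecond;
}

// True when the worst-case estimate fits inside the timeout.
function fitsTimeout(p: ProbeEstimate, timeoutMs: number): boolean {
  return estimateProbeSeconds(p) * 1000 <= timeoutMs;
}

const before: ProbeEstimate = { loadSeconds: 8, tokensPerSecond: 20, maxTokens: 1024 };
const after: ProbeEstimate = { loadSeconds: 8, tokensPerSecond: 20, maxTokens: 1 };

console.log("before fits 10 s:", fitsTimeout(before, 10_000));
console.log("after fits 30 s:", fitsTimeout(after, 30_000));
```

Under these assumed numbers, generation time dominates the old probe, while the new probe is bounded almost entirely by model load time.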

Human Verification (required)

  • Verified scenarios: All 14 unit tests pass, including timeout test updated to 30s
  • Edge cases checked: OpenAI probe, Anthropic probe, timeout abort, retry flow
  • What I did not verify: Live testing with actual llama.cpp server (no GPU available)

Compatibility / Migration

  • Backward compatible? `Yes`
  • Config/env changes? `No`
  • Migration needed? `No`

Failure Recovery (if this breaks)

  • How to disable/revert: Revert changes in `onboard-custom.ts`
  • Files/config to restore: `src/commands/onboard-custom.ts`, `src/commands/onboard-custom.test.ts`
  • Known bad symptoms: If reverted, verification will again time out for large local models

Risks and Mitigations

  • Risk: 30s timeout means users wait longer if the server is truly unreachable. Mitigated by max_tokens=1 making successful probes much faster than before.

@greptile-apps
Contributor

greptile-apps bot commented Feb 26, 2026

Greptile Summary

Fixed custom provider verification timeouts for local LLM servers by increasing timeout from 10s to 30s and reducing verification token generation from 1024 to 1.

  • Increased VERIFY_TIMEOUT_MS from 10s to 30s to accommodate large model loading times on local llama.cpp/vLLM servers
  • Reduced max_tokens from 1024 to 1 in verification probes — since verification only needs to confirm API reachability and valid model ID, generating a single token is sufficient and much faster
  • Added explicit stream: false to both OpenAI and Anthropic verification requests for clarity
  • Updated all test expectations to match new timeout and token values

The changes are minimal, well-targeted, and properly address the reported issue where users with functional local LLM servers couldn't complete onboarding due to premature timeout. The trade-off (users wait longer for truly unreachable servers) is acknowledged and reasonable.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Changes are narrowly scoped to verification timeout/token configuration with proper test coverage. No breaking changes, security issues, or logical errors. The timeout increase and token reduction are well-justified optimizations that directly address the reported issue.
  • No files require special attention

Last reviewed commit: 83b6ef0

@philipp-spiess force-pushed the fix/27346-onboard-verify-timeout branch from 22beb88 to 967d51f on February 27, 2026 21:16
@openclaw-barnacle bot added the `agents` (Agent runtime and tooling) label Feb 27, 2026
@philipp-spiess force-pushed the fix/27346-onboard-verify-timeout branch from df90a05 to 967d51f on February 27, 2026 21:26
@openclaw-barnacle bot removed the `agents` (Agent runtime and tooling) label Feb 27, 2026
… custom provider probes

The onboard wizard sends a chat-completion request to verify custom
providers.  With max_tokens: 1024 and a 10 s timeout, large local
models (e.g. Qwen3.5-27B on llama.cpp) routinely time out because
the server needs to load the model and generate up to 1024 tokens
before responding.

Changes:
- Raise VERIFY_TIMEOUT_MS from 10 s to 30 s
- Lower max_tokens from 1024 to 1 (verification only needs a single
  token to confirm the API is reachable and the model ID is valid)
- Add explicit stream: false to both OpenAI and Anthropic probes

Closes openclaw#27346

Made-with: Cursor
@philipp-spiess force-pushed the fix/27346-onboard-verify-timeout branch from 967d51f to 606f9d8 on February 27, 2026 21:31
@philipp-spiess
Member

Thanks everyone for the iteration here. This PR now consolidates the onboarding verification reliability fix for custom providers.

What’s included:

  • onboarding verification timeout increase in the custom-provider verification flow
  • lean verification probe behavior for custom-provider checks
  • updated tests covering verification behavior
  • changelog note for this user-facing fix

Validation:

  • CI is green on required checks and remaining workflows are in progress
  • local smoke passed against Ollama (nate/instinct:latest)

If this looks good, this should be the one we merge. Thank you all for the prior PRs and review cycles.

@philipp-spiess
Member

Merged, thank you everyone.

Post-merge notes:

Appreciate all the iterations and review help here.

jalehman pushed a commit to rodrigouroz/openclaw that referenced this pull request Feb 27, 2026
… custom provider probes (openclaw#27380)

* fix(onboard): increase verification timeout and reduce max_tokens for custom provider probes

The onboard wizard sends a chat-completion request to verify custom
providers.  With max_tokens: 1024 and a 10 s timeout, large local
models (e.g. Qwen3.5-27B on llama.cpp) routinely time out because
the server needs to load the model and generate up to 1024 tokens
before responding.

Changes:
- Raise VERIFY_TIMEOUT_MS from 10 s to 30 s
- Lower max_tokens from 1024 to 1 (verification only needs a single
  token to confirm the API is reachable and the model ID is valid)
- Add explicit stream: false to both OpenAI and Anthropic probes

Closes openclaw#27346

Made-with: Cursor

* Changelog: note custom-provider onboarding verification fix

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>
velvet-shark pushed a commit to lailoo/openclaw that referenced this pull request Feb 27, 2026
r4jiv007 pushed a commit to r4jiv007/openclaw that referenced this pull request Feb 28, 2026
xiexikang pushed a commit to cclawd007/cclawd that referenced this pull request Feb 28, 2026
mylukin pushed a commit to mylukin/openclaw that referenced this pull request Feb 28, 2026
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026
(cherry picked from commit c6cd434)
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026
(cherry picked from commit c6cd434)
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Feb 28, 2026
(cherry picked from commit c6cd434)
vincentkoc pushed a commit to Sid-Qin/openclaw that referenced this pull request Feb 28, 2026
vincentkoc pushed a commit to rylena/rylen-openclaw that referenced this pull request Feb 28, 2026
newtontech pushed a commit to newtontech/openclaw-fork that referenced this pull request Feb 28, 2026
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Mar 1, 2026
wanjizheng pushed a commit to wanjizheng/openclaw that referenced this pull request Mar 1, 2026
steipete pushed a commit to Sid-Qin/openclaw that referenced this pull request Mar 2, 2026
safzanpirani pushed a commit to safzanpirani/clawdbot that referenced this pull request Mar 2, 2026
steipete pushed a commit to Sid-Qin/openclaw that referenced this pull request Mar 2, 2026
venjiang pushed a commit to venjiang/openclaw that referenced this pull request Mar 2, 2026
robertchang-ga pushed a commit to robertchang-ga/openclaw that referenced this pull request Mar 2, 2026
execute008 pushed a commit to execute008/openclaw that referenced this pull request Mar 2, 2026
dorgonman pushed a commit to kanohorizonia/openclaw that referenced this pull request Mar 3, 2026
sachinkundu pushed a commit to sachinkundu/openclaw that referenced this pull request Mar 6, 2026
zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026

Labels

commands Command implementations size: XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Onboard wizard fails to verify local llama.cpp server (Custom Provider) — but API works fine from container

2 participants