Description
NemoClaw is incompatible with Qwen3.6 because the token limit on its validation request is too small.
Using:
- NemoClaw/OpenClaw 0.38
- vLLM 0.20.2
- Qwen/Qwen3.6-27B
- OpenAI-compatible endpoint
Onboarding fails during inference validation with:
OPENCLAW_CONFIG_OK inference.local response did not contain choices[0].message.content
Qwen3.6 runs in thinking mode by default and returns:
{
"id": "chatcmpl-9d8a2524b5f62ead",
"object": "chat.completion",
"created": 1778497346,
"model": "Qwen/Qwen3.6-27B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"refusal": null,
"annotations": null,
"audio": null,
"function_call": null,
"tool_calls": [],
"reasoning": "Here's a thinking process:\n\n1. **Analyze User Input:**\n - User says: \"Reply with exactly: PONG\"\n "
},
"logprobs": null,
"finish_reason": "length",
"stop_reason": null,
"token_ids": null
}
],
"service_tier": null,
"system_fingerprint": null,
"usage": {
"prompt_tokens": 16,
"total_tokens": 48,
"completion_tokens": 32,
"prompt_tokens_details": null
},
"prompt_logprobs": null,
"prompt_token_ids": null,
"kv_transfer_params": null
}
NemoClaw assumes:
choices[0].message.content != null
and rejects the response even though the endpoint is functioning correctly.
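A more tolerant check could tell "the endpoint answered but the reply was cut off mid-reasoning" apart from a genuinely broken endpoint. The following is only a sketch of that idea, not NemoClaw's actual validation code: `check_smoke_response` is a hypothetical helper, and the fallback to a `reasoning_content` field is an assumption (the response above only shows `reasoning`). Whether a truncated reasoning reply should count as a pass is a design decision for the maintainers.

```python
from typing import Any, Dict, Tuple


def check_smoke_response(response: Dict[str, Any]) -> Tuple[bool, str]:
    """Classify a chat.completion response for a smoke check (hypothetical helper)."""
    choices = response.get("choices") or []
    if not choices:
        return False, "no choices returned"

    choice = choices[0]
    message = choice.get("message") or {}

    # Normal case: the model produced a final answer.
    if message.get("content"):
        return True, "content present"

    # The response in this report carries thinking tokens in "reasoning".
    # "reasoning_content" is assumed here as an alternative field name.
    reasoning = message.get("reasoning") or message.get("reasoning_content")
    if reasoning and choice.get("finish_reason") == "length":
        return True, "endpoint reachable; reply truncated while the model was still reasoning"

    return False, "no content and no reasoning output"
```

With the response shown above, this returns a pass with a message explaining the truncation, instead of rejecting the endpoint outright.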
This is a compatibility issue between:
- NemoClaw/OpenClaw response validation
- Qwen3.6 thinking-mode behavior (can only be switched off at request time, not setup time)
- vLLM OpenAI-compatible output format
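Since thinking mode can only be switched off per request, the validation request itself would have to carry the switch. A minimal sketch of such a request body follows, assuming that vLLM forwards `chat_template_kwargs` to the model's chat template and that Qwen3.6 honors the `enable_thinking` flag the same way earlier Qwen3 models do. `build_probe_payload` is a hypothetical helper; the prompt is taken from the reasoning text in the response above, and the default of 32 tokens is inferred from `completion_tokens: 32`, not from NemoClaw's source.

```python
import json
from typing import Any, Dict


def build_probe_payload(
    model: str,
    enable_thinking: bool = False,
    max_tokens: int = 32,
) -> Dict[str, Any]:
    """Build a chat.completions request body for a smoke check (hypothetical helper)."""
    payload: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": "Reply with exactly: PONG"}],
        "max_tokens": max_tokens,
    }
    if not enable_thinking:
        # Assumption: the serving stack passes this through to the chat template.
        payload["chat_template_kwargs"] = {"enable_thinking": False}
    return payload


print(json.dumps(build_probe_payload("Qwen/Qwen3.6-27B"), indent=2))
```

The alternative is to leave thinking enabled and raise `max_tokens` far enough for the model to finish reasoning, for example `build_probe_payload("Qwen/Qwen3.6-27B", enable_thinking=True, max_tokens=2048)`; the right budget is not known from this report.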
Reproduction Steps
- Run in one terminal:
docker run --rm -it --gpus all \
-e HF_TOKEN=$HF_TOKEN \
-v ~/.cache/huggingface:/root/.cache/huggingface \
-p 8000:8000 \
vllm/vllm-openai \
Qwen/Qwen3.6-27B \
--host 0.0.0.0 \
--port 8000 \
--tensor-parallel-size 1 \
--max-model-len 262144 \
--trust-remote-code \
--enable-auto-tool-choice \
--tool-call-parser qwen3_coder \
--reasoning-parser qwen3 \
--speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}'
- Run in another terminal on the same machine:
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
- Choose option 3) Other OpenAI-compatible endpoint, then enter http://host.docker.internal:8000/v1 as the endpoint URL, dummy as the API key, and Qwen/Qwen3.6-27B as the model. Continue setup as required.
- At step 7 of the NemoClaw setup, the following failure appears:
[7/8] Setting up OpenClaw inside sandbox
──────────────────────────────────────────────────
✓ OpenClaw gateway launched inside sandbox
Verifying compatible endpoint through the messaging sandbox...
⚠ Gateway provider 'compatible-endpoint' did not report the selected endpoint URL.
Continuing to the sandbox-side inference.local smoke check.
Compatible endpoint sandbox smoke check failed.
Telegram provider startup is not the root cause; inference.local failed.
OPENCLAW_CONFIG_OK inference.local response did not contain choices[0].message.content: {"id": "chatcmpl-ac1e7493d920f2b2", "object": "chat.completion", "created": 1778494319, "model": "Qwen/Qwen3.6-27B", "choices": [{"index": 0, "message": {"role": "assistant", "content": null, "refusal": null, "annotations": null, "audio": null, "function_call": null, "tool_calls": [], "reasoning": "Thinking Process:\n\n1. **Analyze the Request:** The user wants a reply with the exact string \"PONG\".\n2. **Constraint"}, "logprobs": null, "finish_reason": "length", "stop_reason": null, "token_ids": null}], "service_tier": null, "system_fingerprint": null, "usage": {"prompt_tokens": 16, "total_tokens": 48, "completion_tokens": 32, "prompt_tokens_details": null}, "prompt_logprobs": null, "prompt_token_ids": null, "kv_transfer_params": null}
The endpoint otherwise works perfectly. choices[0].message.content is never populated because the model does not finish thinking within the request's token budget (32 completion tokens in the response above), so generation stops with "finish_reason": "length".
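To confirm this outside the installer, the same kind of request can be sent to the vLLM container directly with different token budgets. The sketch below uses only the Python standard library; `make_chat_request` and `probe` are hypothetical helpers, the `http://localhost:8000/v1` base URL assumes the script runs on the host that published port 8000, and the budget of 2048 is an arbitrary illustrative value.

```python
import json
import urllib.request
from typing import Any, Dict


def make_chat_request(
    base_url: str, payload: Dict[str, Any], api_key: str = "dummy"
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions route."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def probe(base_url: str, max_tokens: int) -> Dict[str, Any]:
    """Send the PONG prompt and report how generation ended."""
    payload = {
        "model": "Qwen/Qwen3.6-27B",
        "messages": [{"role": "user", "content": "Reply with exactly: PONG"}],
        "max_tokens": max_tokens,
    }
    with urllib.request.urlopen(make_chat_request(base_url, payload), timeout=120) as resp:
        body = json.load(resp)
    choice = body["choices"][0]
    return {
        "finish_reason": choice["finish_reason"],
        "content": choice["message"]["content"],
        "completion_tokens": body["usage"]["completion_tokens"],
    }


# Example (requires the vLLM container from the steps above to be running):
# for budget in (32, 2048):
#     print(budget, probe("http://localhost:8000/v1", budget))
```

If the diagnosis here is right, the small budget should reproduce the null content with a `length` finish reason, while a sufficiently large budget should let the model finish thinking and return content.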
Environment
- Linux spark-fce0 6.17.0-1014-nvidia # 14-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 17 19:01:40 UTC 2026 aarch64 aarch64 aarch64 GNU/Linux
- v22.22.2
- Docker version 29.2.1, build a5c7197
- nemoclaw v0.0.38
Debug Output
Logs
Checklist