Description
When configuring a custom provider (non-Azure) via `openclaw onboard --custom`, the generated model config in `openclaw.json` always sets `input: ["text"]`, with no option to include `"image"`. This silently disables vision/image support for all models under that provider, even when the underlying models (Claude, GPT, Gemini, Qwen, etc.) fully support image input.
Root Cause
File: `src/commands/onboard-custom.ts`, line ~666:
```ts
// Non-Azure path (all custom providers)
{
  id: modelId,
  name: `${modelId} (Custom Provider)`,
  contextWindow: DEFAULT_CONTEXT_WINDOW,
  maxTokens: DEFAULT_MAX_TOKENS,
  input: ["text"] as ["text"], // ← hardcoded, no image option
  ...
}
```
Compare with the Azure path, which at least detects reasoning models and adds `"image"`:

```ts
// Azure path
input: isLikelyReasoningModel
  ? (["text", "image"] as Array<"text" | "image">)
  : (["text"] as ["text"]),
```
Impact
- Users who set up custom providers (OpenAI-compatible endpoints like `apiyi`, `one-api`, etc.) with vision-capable models (Claude, GPT-4o, Gemini, Qwen-VL) cannot send images to agents
- The gateway receives image attachments but silently drops them, because the model config declares `input: ["text"]` only
- No error or warning is shown; the agent just never receives the image
- Users have no way to fix this through the onboard wizard; manual JSON editing is required
Steps to Reproduce
- Run `openclaw onboard --custom` with any non-Azure provider
- Add a vision-capable model (e.g., `claude-sonnet-4-6`)
- Check `~/.openclaw/openclaw.json`; the model will have `input: ["text"]`
- Try sending an image via any channel; the agent receives only text
Expected Behavior
- The onboard wizard should detect common vision-capable models by name pattern (e.g., `claude`, `gpt-4o`, `gpt-5`, `gemini`, `qwen-vl`) and default to `["text", "image"]`
- Or: prompt the user to confirm whether the model supports image input
- Or: default all models to `["text", "image"]`, since the gateway already validates attachments
Workaround
Manually edit `~/.openclaw/openclaw.json`, change `input: ["text"]` to `input: ["text", "image"]` for each affected model, then restart the gateway.
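After the edit, an affected model entry would look roughly like this (other fields such as `contextWindow` and `maxTokens` are omitted here for brevity; keep them as generated):

```json
{
  "id": "claude-sonnet-4-6",
  "name": "claude-sonnet-4-6 (Custom Provider)",
  "input": ["text", "image"]
}
```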
Environment
- OpenClaw version: 2026.3.13
- OS: Linux
- Provider: apiyi (OpenAI-compatible)