Description
The built-in vision_analyze tool fails when using the GLM (ZhipuAI) vision provider (glm-4v-flash) with error:
Error code: 400 - {'error': {'code': '1210', 'message': 'API 调用参数有误,请检查文档。'}}
Root Cause
Request format mismatch between Hermes' vision_analyze tool and the GLM API. The GLM vision API accepts image data in OpenAI-compatible format with content as an array of {"type": "text/..."} and {"type": "image_url", "image_url": {"url": "data:image/...;base64,..."}} objects.
Manual curl calls with exactly this format work correctly (tested with glm-4v-flash at https://open.bigmodel.cn/api/paas/v4/).
Config Used
auxiliary:
vision:
provider: glm
model: glm-4v-flash
base_url: "https://open.bigmodel.cn/api/paas/v4/"
api_key: "<valid-key>"
Expected vs Actual
- Expected:
vision_analyze tool sends a correctly formatted OpenAI-compatible request to GLM, returns image analysis
- Actual: Returns 400 error 1210 immediately
Workaround
Manual curl + base64 encoding works perfectly (documented in references/glm-vision-provider.md). The GLM API key, endpoint, and model are all correct.
Environment
- Hermes Agent latest (2026-05)
- GLM provider: glm-4v-flash
- Platform: CLI and WeChat gateway (same result, across multiple sessions)
Description
The built-in
vision_analyzetool fails when using the GLM (ZhipuAI) vision provider (glm-4v-flash) with error:Root Cause
Request format mismatch between Hermes'
vision_analyzetool and the GLM API. The GLM vision API accepts image data in OpenAI-compatible format withcontentas an array of{"type": "text/..."}and{"type": "image_url", "image_url": {"url": "data:image/...;base64,..."}}objects.Manual curl calls with exactly this format work correctly (tested with
glm-4v-flashathttps://open.bigmodel.cn/api/paas/v4/).Config Used
Expected vs Actual
vision_analyzetool sends a correctly formatted OpenAI-compatible request to GLM, returns image analysisWorkaround
Manual curl + base64 encoding works perfectly (documented in
references/glm-vision-provider.md). The GLM API key, endpoint, and model are all correct.Environment