feat: add GLM-4.5V model support #739
Conversation
Walkthrough
Adds GLM-4.5V (multimodal) model entries to the default settings, the Zhipu provider-specific settings, and the Zhipu provider's fetchOpenAIModels list. No changes to control flow, signatures, or error handling.
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
src/main/presenter/configPresenter/modelDefaultSettings.ts (1 hunks)
src/main/presenter/configPresenter/providerModelSettings.ts (1 hunks)
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{ts,tsx,js,jsx,vue}
📄 CodeRabbit Inference Engine (CLAUDE.md)
Use English for logs and comments
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (CLAUDE.md)
Strict type checking enabled for TypeScript
**/*.{ts,tsx}: Always use try-catch to handle potential errors
Provide meaningful error messages
Log detailed error information
Degrade gracefully
Logs should include timestamp, log level, error code, error description, stack trace (if applicable), and relevant context
Log levels should include ERROR, WARN, INFO, DEBUG
Do not swallow errors
Provide user-friendly error messages
Implement an error retry mechanism
Avoid logging sensitive information
Use structured logging
Set appropriate log levels
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
src/main/**/*.ts
📄 CodeRabbit Inference Engine (CLAUDE.md)
Main to Renderer: Use EventBus to broadcast events via mainWindow.webContents.send()
Use Electron's built-in APIs for file system and native dialogs
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
src/main/presenter/**/*.ts
📄 CodeRabbit Inference Engine (CLAUDE.md)
One presenter per functional domain
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
src/main/presenter/configPresenter/**/*.ts
📄 CodeRabbit Inference Engine (CLAUDE.md)
Centralize configuration in configPresenter/
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/configPresenter/providerModelSettings.ts
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit Inference Engine (.cursor/rules/development-setup.mdc)
**/*.{js,jsx,ts,tsx}: Use OxLint for code linting
Write logs and comments in English
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
src/{main,renderer}/**/*.ts
📄 CodeRabbit Inference Engine (.cursor/rules/electron-best-practices.mdc)
src/{main,renderer}/**/*.ts: Use context isolation for improved security
Implement proper inter-process communication (IPC) patterns
Optimize application startup time with lazy loading
Implement proper error handling and logging for debugging
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
src/main/**/*.{ts,js,tsx,jsx}
📄 CodeRabbit Inference Engine (.cursor/rules/project-structure.mdc)
Main process code goes in src/main
Files:
src/main/presenter/configPresenter/modelDefaultSettings.ts
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
src/main/presenter/llmProviderPresenter/providers/*.ts
📄 CodeRabbit Inference Engine (CLAUDE.md)
src/main/presenter/llmProviderPresenter/providers/*.ts: Create provider file in src/main/presenter/llmProviderPresenter/providers/ when adding a new LLM provider
Implement coreStream method following standardized event interface in LLM provider files
src/main/presenter/llmProviderPresenter/providers/*.ts: Each file in src/main/presenter/llmProviderPresenter/providers/*.ts should handle interaction with a specific LLM API, including request/response formatting, tool definition conversion, native/non-native tool call management, and standardizing output streams to a common event format.
Provider implementations must use a coreStream method that yields standardized stream events to decouple the main loop from provider-specific details.
The coreStream method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.
Provider files should implement helper methods such as formatMessages, convertToProviderTools, parseFunctionCalls, and prepareFunctionCallPrompt as needed for provider-specific logic.
All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.
When a provider does not support native function calling, it must prepare messages using prompt wrapping (e.g., prepareFunctionCallPrompt) before making the API call.
When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using convertToProviderTools) and included in the API request.
Provider implementations should aggregate and yield usage events as part of the standardized stream.
Provider implementations should yield image data events in the standardized format when applicable.
Provider implementations should yield reasoning events in the standardized format when applicable.
Provider implementations should yield tool call events (`tool_call_star...
Files:
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
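As a rough illustration of the coreStream contract described in the guidelines above, a provider's generator might look like the following. This is a minimal sketch only; the event shapes, type names, and class name here are hypothetical and not the repo's actual definitions.

// Hypothetical standardized event union; the real types live in the repo's llmProviderPresenter code.
type StandardStreamEvent =
  | { type: 'text'; content: string }
  | { type: 'reasoning'; content: string }
  | { type: 'tool_call_start'; id: string; name: string }
  | { type: 'usage'; promptTokens: number; completionTokens: number }
  | { type: 'stop'; reason: string }

// Hypothetical provider class: real providers wrap their specific SDK or HTTP client.
class ExampleProvider {
  // One streaming API request per conversation round; no multi-round tool-call loop here.
  async *coreStream(_messages: unknown[], _modelId: string): AsyncGenerator<StandardStreamEvent> {
    // ...send the provider-specific streaming request and parse chunks here...
    yield { type: 'text', content: 'partial text chunk' }
    yield { type: 'usage', promptTokens: 120, completionTokens: 40 }
    yield { type: 'stop', reason: 'end_turn' }
  }
}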
🧠 Learnings (1)
📚 Learning: 2025-07-21T01:45:33.790Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-21T01:45:33.790Z
Learning: Applies to src/main/presenter/configPresenter/providers.ts : Add provider configuration in configPresenter/providers.ts when adding a new LLM provider
Applied to files:
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts
src/main/presenter/configPresenter/providerModelSettings.ts
🔇 Additional comments (4)
src/main/presenter/configPresenter/modelDefaultSettings.ts (1)
1648-1658: GLM-4.5V default settings added — consistent and well-scoped
The new entry aligns with the Zhipu provider settings and the provider's model list (contextLength 65536, maxTokens 8192, vision true). No control flow or typing issues here.
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts (1)
142-150: Added GLM-4.5V to Zhipu catalog — matches defaults/provider config
The catalog entry is consistent with default and provider settings (contextLength 65536, maxTokens 8192). Placement under the multimodal section is correct.
src/main/presenter/configPresenter/providerModelSettings.ts (2)
310-320: Zhipu provider model settings for GLM-4.5V — consistent with defaults
The provider-specific entry mirrors defaults (temperature 0.7, maxTokens 8192, contextLength 65536, vision true, functionCall true, reasoning true). Looks good.
310-320: Confirmed: keep functionCall: true for GLM-4.5V
GLM-4.5V supports native OpenAI-compatible function/tool calling (docs/examples show the OpenAI-style "tools" parameter and tool_call responses). No change required.
- File: src/main/presenter/configPresenter/providerModelSettings.ts — glm-4.5v entry (lines ~310–320): keep functionCall: true. Caveat: ensure your chosen deployment/SDK honors the OpenAI-style tools schema; if it doesn't, use your prompt-wrapping path instead.
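To make the caveat concrete: "honoring the OpenAI-style tools schema" means the request carries a tools array of function declarations and the assistant message comes back with tool_calls. The sketch below shows one way to probe that; the endpoint URL, the ZHIPU_API_KEY environment variable, and the get_weather tool are illustrative assumptions, not values taken from this repo or Zhipu's docs.

// Sketch only: probes whether an OpenAI-compatible deployment of glm-4.5v honors the "tools" schema.
async function probeNativeToolCalling(): Promise<void> {
  const body = {
    model: 'glm-4.5v',
    messages: [{ role: 'user', content: 'What is the weather in Beijing right now?' }],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather', // hypothetical tool, for illustration only
          description: 'Look up current weather for a city',
          parameters: {
            type: 'object',
            properties: { city: { type: 'string' } },
            required: ['city']
          }
        }
      }
    ]
  }

  // Assumed placeholder endpoint; substitute your deployment's actual OpenAI-compatible base URL.
  const res = await fetch('https://example-openai-compatible-endpoint/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.ZHIPU_API_KEY}`
    },
    body: JSON.stringify(body)
  })
  const data = await res.json()

  // A deployment with native tool calling should return a tool_calls array on the assistant message;
  // if it does not, route glm-4.5v through the prompt-wrapping fallback instead.
  console.log(data?.choices?.[0]?.message?.tool_calls ?? 'no tool_calls returned')
}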
      {
        id: 'glm-4.5v',
        name: 'GLM-4.5V',
        temperature: 0.7,
        maxTokens: 8192,
        contextLength: 65536,
        match: ['glm-4.5v'],
        vision: true,
        functionCall: true,
        reasoning: true
      },
💡 Verification agent
🧩 Analysis chain
Confirm GLM-4.5V capabilities (function calling, context window) and naming
Please double-check with Zhipu’s latest docs that:
- Official model ID is exactly glm-4.5v
- It supports native function/tool calling
- Context window should be 65,536 (vs 128k like text-only 4.5 variants)
If any of these differ, update this block and provider-specific settings accordingly.
Optionally, verify codebase references to ensure no other places need updates:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -nC2 -i '\bglm[- ]?4\.5v\b|GLM-4\.5V'

Length of output: 2336
🌐 Web query:
What is the official model identifier for Zhipu's GLM-4.5V, its context window size, and does it support native function (tool) calling compatible with OpenAI "tools"?
💡 Result:
The official model identifier for Zhipu's GLM-4.5V is glm-4.5v[7]. Its context window size is up to 64K tokens for multimodal (i.e., both text and visual) inputs[2][6][9]. Current documentation and developer guides do not confirm official support for direct native function/tool calling in the way that OpenAI's "tools" (previously called "functions") works, nor do they state explicit compatibility with the OpenAI "tools" API schema.
Details and Supporting Information
- Official model identifier:
  - As shown in Zhipu's Python SDK API example, the invocation looks like this:
    response = client.chat.completions.create(model="glm-4.5v", ...)
  - This indicates the canonical model name for API access is glm-4.5v[7].
- Context window size:
  - Zhipu GLM-4.5V supports a 64K token multimodal context window (i.e., it can process up to 64,000 tokens of combined text, image, or video per request)[2][6][9]. This is confirmed in technical summaries and product announcements.
- Function (Tool) Calling Support:
  - There is no explicit mention in the available official documentation, technical reports, or SDK examples of a native implementation for function calling (i.e., OpenAI "tools" schema compatibility for structured API endpoints as actions, also called "functions")[7][3].
  - While GLM-4.5V is designed for strong multimodal and agent scenarios and can be integrated into GUI agent prompts and desktop tools, the APIs provided show only classic chat/inference patterns (prompt completion and image/video inputs)[2][3][7]. No evidence is given of a dedicated function calling interface or adherence to the OpenAI "tools" JSON schema for function calling.
  - Available developer demos show classic multimodal inference, visual grounding, and GUI agent tasks using prompt engineering, not a schema-augmented or structured function call format as standardized by OpenAI's "tools"[3].
Summary Table
| Aspect | GLM-4.5V Specification |
|---|---|
| Official Model Name | glm-4.5v |
| Context Window | 64,000 tokens (64K), multimodal |
| OpenAI "tools" Support | Not officially documented as supported |
If your use case requires structured function/tool calling as defined by OpenAI's "tools" schema (functions with input/output schemas handled natively by the model API), there is no direct evidence of support for this in GLM-4.5V as of August 2025. Integration would likely require custom prompt handling and downstream orchestration code on your part.
Citations:
- 1: https://news.aibase.com/news/20407
- 2: https://news.aibase.com/news/20404
- 3: https://github.com/zai-org/GLM-V
- 4: https://news.smol.ai/issues/25-08-12-not-much
- 5: https://eu.36kr.com/en/p/3418614402207366
- 6: https://www.youtube.com/watch?v=vTuXQkQWgqA
- 7: https://www.kdjingpai.com/en/glm-45v/
- 8: https://news.smol.ai/issues/25-08-11-ioi-gold
- 9: https://hyper.ai/en/headlines/532a7a0f1b1768a229a19fa2b061a0b7
- 10: https://www.barchart.com/story/news/34130142/zai-launches-glm45v-opensource-visionlanguage-model-sets-new-bar-for-multimodal-reasoning
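The "custom prompt handling" the result above alludes to is essentially the prompt-wrapping fallback described in the provider guidelines. A minimal sketch of that idea follows; the helper names are hypothetical and this is not the repo's actual prepareFunctionCallPrompt implementation.

interface ToolSpec {
  name: string
  description: string
  parameters: Record<string, unknown>
}

// Wrap tool definitions into the prompt so a model without native tool calling
// can still emit a parseable call, e.g. a single JSON object.
function wrapToolsIntoPrompt(userText: string, tools: ToolSpec[]): string {
  const toolList = tools
    .map((t) => `- ${t.name}: ${t.description} (args schema: ${JSON.stringify(t.parameters)})`)
    .join('\n')
  return [
    'You can call the following tools:',
    toolList,
    'If a tool is needed, reply with ONLY a JSON object like {"tool": "<name>", "arguments": {...}}.',
    `User request: ${userText}`
  ].join('\n\n')
}

// Parse the model reply; returns null when the reply is a normal answer rather than a tool call.
function parseWrappedToolCall(reply: string): { tool: string; arguments: unknown } | null {
  try {
    const parsed = JSON.parse(reply.trim())
    return typeof parsed?.tool === 'string' ? parsed : null
  } catch {
    return null
  }
}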
Action: Set GLM-4.5V functionCall to false (ID & context verified)
Short summary: Verified the official model ID is glm-4.5v and the multimodal context window is 64K tokens (65536). There is no official documentation showing native function/tool-calling support compatible with OpenAI "tools", so the code should not mark this model as supporting function calls.
Files/locations to update:
- src/main/presenter/configPresenter/modelDefaultSettings.ts — entry at ~lines 1649–1656: change functionCall: true
- src/main/presenter/configPresenter/providerModelSettings.ts — entry at ~lines 311–318: change functionCall: true
- src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts — model entry around line ~143: verify and update if it marks function calling
Suggested change (replace the single line in each block):
- from: functionCall: true,
- to: functionCall: false, // no official native function/tool-calling documented for glm-4.5v
Keep vision: true and contextLength: 65536 as-is.
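Applied to the provider-settings block quoted above, the corrected entry would read roughly as follows (same fields, only the functionCall flag flipped); a sketch of the suggested end state, not a verified excerpt of the repo.

      {
        id: 'glm-4.5v',
        name: 'GLM-4.5V',
        temperature: 0.7,
        maxTokens: 8192,
        contextLength: 65536,
        match: ['glm-4.5v'],
        vision: true,
        functionCall: false, // no official native function/tool-calling documented for glm-4.5v
        reasoning: true
      },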
🤖 Prompt for AI Agents
In src/main/presenter/configPresenter/modelDefaultSettings.ts around lines
1648–1658, change the glm-4.5v model entry to set functionCall to false (it
currently is true) because there's no official native function/tool-calling
support; likewise update
src/main/presenter/configPresenter/providerModelSettings.ts around lines 311–318
to set functionCall: false for the glm-4.5v entry, and inspect
src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts near line
~143 to verify any glm-4.5v model entry does not mark functionCall true and
change it to false if present; keep vision: true and contextLength: 65536
unchanged and add a short inline comment like "// no official native
function/tool-calling documented for glm-4.5v" where you flip the flag.
      {
        id: 'glm-4.5v',
        name: 'GLM-4.5V',
        group: 'zhipu',
        providerId: this.provider.id,
        isCustom: false,
        contextLength: 65536,
        maxTokens: 8192
      },
💡 Verification agent
🧩 Analysis chain
Sanity check: vision handling path works for GLM-4.5V
Since this model is multimodal, ensure the OpenAI-compatible code path properly formats image inputs for Zhipu (as it already does for glm-4v*). No code changes required here if the existing vision path is reused; just verify end-to-end.
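For reference, the OpenAI-compatible content-parts shape the reused vision path would be expected to produce for an image turn looks roughly like this. This is a sketch of the standard OpenAI-style format; the repo's actual formatting helper and field handling may differ.

// OpenAI-style multimodal user message: a text part plus an image_url part.
// Base64 data URLs or https URLs are both commonly accepted by OpenAI-compatible endpoints.
const visionMessage = {
  role: 'user' as const,
  content: [
    { type: 'text', text: 'Describe this screenshot.' },
    {
      type: 'image_url',
      image_url: { url: 'data:image/png;base64,iVBORw0KGgo...' } // truncated sample payload
    }
  ]
}

// The upper layer only builds this shape when the model entry is flagged vision: true,
// which is why the GLM-4.5V catalog entry needs that flag.
console.log(JSON.stringify(visionMessage, null, 2))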
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check for existing vision handling in Zhipu/OpenAI-compatible paths
rg -n -C3 -i 'glm-4v|image|vision' src/main/presenter/llmProviderPresenter/providers

Length of output: 121472
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts"
echo "=== File: $file ==="
nl -ba "$file" | sed -n '1,420p'

Length of output: 449
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts"
if [[ ! -f "$file" ]]; then
echo "FILE_NOT_FOUND: $file"
exit 0
fi
echo "=== Showing $file (first 400 lines) ==="
sed -n '1,400p' "$file"
echo "=== Searching for inheritance and vision/image handling within the file ==="
rg -n -C2 "class |extends|handleImg|image_url|image:|image_url|image" "$file" || true

Length of output: 6476
Action: mark GLM-4.5V as vision-enabled in zhipuProvider
OpenAICompatibleProvider already implements the image/vision handling path and ZhipuProvider inherits it, but the GLM-4.5V model entry in src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts is not flagged as vision:true — the upper layer uses that flag to decide whether to insert image_url. Add the vision flag so the multimodal path is used end-to-end (and do the same for other glm-4v* variants if they are multimodal).
Files to update:
- src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts — add vision: true to the GLM-4.5V model object (and optionally glm-4v*, glm-4v-plus-0111, glm-4v-flash if they support vision).
Suggested diff:
@@
       {
         id: 'glm-4.5v',
         name: 'GLM-4.5V',
         group: 'zhipu',
         providerId: this.provider.id,
         isCustom: false,
+        vision: true,
         contextLength: 65536,
         maxTokens: 8192
       },
🤖 Prompt for AI Agents
In src/main/presenter/llmProviderPresenter/providers/zhipuProvider.ts around
lines 142 to 150, the GLM-4.5V model entry is missing the vision flag so the
multimodal image path is not used; add vision: true to that model object (and
optionally add vision: true to other glm-4v* entries such as glm-4v,
glm-4v-plus-0111, glm-4v-flash if those models support vision) so the upper
layers will include image_url and route requests through the provider's vision
handling.
add GLM-4.5V model support
Summary by CodeRabbit