feat: add GLM-4.6 model support for z.ai provider #8408
Conversation
- Add GLM-4.6 configuration to internationalZAiModels with 200k context window
- Add GLM-4.6 configuration to mainlandZAiModels with tiered pricing
- Add test coverage for GLM-4.6 model in both international and China API lines

Fixes #8406
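For reference, a minimal sketch of the shape such an entry takes, built from the values visible in this diff (maxTokens, contextWindow, cache prices) and mirroring the neighboring GLM-4.5 entries; the input/output prices and the supportsPromptCache flag here are placeholders, not confirmed values:

```typescript
// Sketch only: shape inferred from the diff and the adjacent GLM-4.5
// entries; inputPrice/outputPrice are placeholders, not the real values.
const glm46 = {
	maxTokens: 98_304,
	contextWindow: 204_800,
	supportsPromptCache: true, // assumed, mirroring GLM-4.5
	inputPrice: 0.6, // placeholder
	outputPrice: 2.2, // placeholder
	cacheWritesPrice: 0,
	cacheReadsPrice: 0.11,
	description:
		"GLM-4.6 is Zhipu's newest model with an extended context window of up to 200k tokens.",
}
```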
Self-review protocol engaged: arguing with myself about GLM-4.6 so future-me doesn't bill at 200k twice.
    cacheReadsPrice: 0.057,
},
{
    contextWindow: 200_000,
[P2] The 200_000 tier duplicates the Infinity tier pricing. Unless the provider bills differently at exactly 200k, this tier appears redundant and can be removed to reduce noise.
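To illustrate (the tier boundaries and prices below are made up; only the shape matters): when the 200_000 tier and the open-ended Infinity tier carry identical pricing, a lookup resolves to the same numbers either way, so the explicit 200_000 entry adds nothing.

```typescript
// Illustrative only: values are invented, not taken from the diff.
const tiers = [
	{ contextWindow: 32_000, inputPrice: 0.29, outputPrice: 1.14, cacheReadsPrice: 0.057 },
	// Redundant: same pricing as the open-ended tier below, so a prompt
	// at or near 200k tokens resolves to identical prices either way.
	{ contextWindow: 200_000, inputPrice: 0.6, outputPrice: 2.2, cacheReadsPrice: 0.11 },
	{ contextWindow: Infinity, inputPrice: 0.6, outputPrice: 2.2, cacheReadsPrice: 0.11 },
]
```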
| "GLM-4.5-Air is the lightweight version of GLM-4.5. It balances performance and cost-effectiveness, and can flexibly switch to hybrid thinking models.", | ||
| }, | ||
| "glm-4.6": { | ||
| maxTokens: 98_304, |
[P3] Please confirm maxTokens for GLM-4.6. With a 204,800 context window, 98,304 mirrors the 4.5 configuration, but it may differ for 4.6 per provider docs.
const model = handlerWithModel.getModel()
expect(model.id).toBe(testModelId)
expect(model.info).toEqual(internationalZAiModels[testModelId])
expect(model.info.contextWindow).toBe(204_800)
[P3] Consider asserting pricing and supportsPromptCache for GLM-4.6 (both intl and mainland) to catch future config regressions.
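For example, something along these lines could sit next to the existing assertions (assuming the Vitest/Jest-style expect already used in zai.spec.ts; expected values should come from the config, and the tiers field name is an assumption):

```typescript
// Hypothetical extra assertions to pin down GLM-4.6 pricing and cache flags.
const intl = internationalZAiModels["glm-4.6"]
expect(intl.supportsPromptCache).toBe(true) // assumed flag name
expect(intl.cacheWritesPrice).toBe(0)
expect(intl.cacheReadsPrice).toBe(0.11)

const mainland = mainlandZAiModels["glm-4.6"]
expect(mainland.contextWindow).toBe(204_800)
expect(mainland.tiers?.length).toBeGreaterThan(0) // tiered pricing present
```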
cacheWritesPrice: 0,
cacheReadsPrice: 0.11,
description:
    "GLM-4.6 is Zhipu's newest model with an extended context window of up to 200k tokens, providing enhanced capabilities for processing longer documents and conversations.",
[P3] Optional: add a link to the GLM-4.6 docs/announcement in this description for maintainers to verify context window and pricing quickly.
I am not sure this is fixed. The context size is OK, but the model never goes over 104k tokens, which is more like a 131k context window than 200k.
Description
This PR attempts to address Issue #8406 by adding support for the GLM-4.6 model from z.ai (Zhipu AI) with its 200k context window.
Changes
- Added GLM-4.6 to internationalZAiModels with a 200k (204,800 tokens) context window
- Added GLM-4.6 to mainlandZAiModels with the appropriate tiered pricing structure

Testing

- Added test coverage for GLM-4.6 in both the international and China API lines
Related Issue
Fixes #8406
Feedback and guidance are welcome!
Important
Adds support for GLM-4.6 model from Zhipu AI with 200k context window and tests for international and China API lines.
- Adds GLM-4.6 to internationalZAiModels and mainlandZAiModels with a 200k context window.
- Adds tiered pricing for GLM-4.6 to mainlandZAiModels.
- Adds tests in zai.spec.ts covering both international and China API lines.