[GLM-4.7] GLM Model support for GLM-Lite #31386
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Code Review
This pull request adds support for the GLM-Lite model and its MTP (Multi-Token Prediction) variant for speculative decoding. The changes include new model implementation files, updates to model registries, and modifications to benchmark configurations. The implementation appears to leverage existing patterns from models like DeepseekV2. My review has identified a critical issue in the test configuration that could lead to incorrect testing, and a high-severity type hint error in the new model implementation.
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Yuxuan Zhang <2448370773@qq.com>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
like this?
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Yuxuan Zhang <2448370773@qq.com>
For use with transformers 5.0.0 and the GLM-Lite model; the corresponding transformers PR is here.
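For anyone trying this locally, a minimal sketch of a version guard, assuming 5.0.0 is a lower bound rather than an exact pin (the comment above does not say which). The helper names `parse_release`, `meets_minimum`, and `installed_transformers_version` are hypothetical and not part of this PR or of transformers.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '5.0.0rc1' -> (5, 0, 0).

    Pre-release and dev suffixes are ignored, so '5.0.0rc1' compares equal
    to '5.0.0'. That is looser than PEP 440 ordering and is a simplification.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "5.0.0") -> bool:
    """Check installed >= minimum on the numeric release segment only."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '5.1' and '5.1.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if it is absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None
```

A caller could combine the two, for example `v = installed_transformers_version()` followed by `v is not None and meets_minimum(v)`, and skip GLM-Lite tests when the check fails.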