[Bug] glm5 mxfp4 mtp is broken #23142

@functionstackx

Checklist

  • I searched related issues but found no solution.
  • The bug persists in the latest version.
  • Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
  • If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
  • Please use English. Otherwise, it will be closed.

Describe the bug

hi @hubertlu-tw @HaiShaw @chunfangamd

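glm5 mxfp4 mtp is broken: with EAGLE speculative decoding enabled, the server crashes at startup while the draft (NextN/MTP) worker loads its weights. The log shows a `model.eh_proj.weight_scale not found in params_dict` warning at the same timestamp as the scheduler exception.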

[2026-04-18 22:34:20 TP0] model.eh_proj.weight_scale not found in params_dict.
[2026-04-18 22:34:20 TP7] Scheduler hit an exception: Traceback (most recent call last):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 3771, in run_scheduler_process
    scheduler = Scheduler(
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 425, in __init__
    self.init_model_worker()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 683, in init_model_worker
    self.maybe_init_draft_worker()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 667, in maybe_init_draft_worker
    self.draft_worker = DraftWorkerClass(**draft_worker_kwargs)
  File "/sgl-workspace/sglang/python/sglang/srt/speculative/eagle_worker_v2.py", line 658, in __init__
    self._draft_worker = EagleDraftWorker(
  File "/sgl-workspace/sglang/python/sglang/srt/speculative/eagle_worker_v2.py", line 138, in __init__
    self.draft_worker = TpModelWorker(
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tp_worker.py", line 260, in __init__
    self._init_model_runner()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tp_worker.py", line 343, in _init_model_runner
    self._model_runner = ModelRunner(
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/model_runner.py", line 480, in __init__
    self.initialize(pre_model_load_memory)
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/model_runner.py", line 570, in initialize
    self.load_model()
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/model_runner.py", line 1264, in load_model
    self.model = self.loader.load_model(
  File "/sgl-workspace/sglang/python/sglang/srt/model_loader/loader.py", line 699, in load_model
    self.load_weights_and_postprocess(
  File "/sgl-workspace/sglang/python/sglang/srt/model_loader/loader.py", line 708, in load_weights_and_postprocess
    model.load_weights(weights)
  File "/sgl-workspace/sglang/python/sglang/srt/models/deepseek_nextn.py", line 284, in load_weights
    super().load_weights(weights, is_nextn=True)
  File "/sgl-workspace/sglang/python/sglang/srt/models/deepseek_v2.py", line 2323, in load_weights
    self.do_load_weights(weights, is_nextn)
  File "/sgl-workspace/sglang/python/sglang/srt/models/deepseek_common/deepseek_weight_loader.py", line 361, in do_load_weights
    future.result()
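
To narrow this down, it may help to check which `eh_proj` tensors the checkpoint actually ships. My reading of the warning (not confirmed) is that the checkpoint carries a `weight_scale` tensor for `eh_proj` that the draft model has no matching parameter for. Below is a minimal sketch, assuming the checkpoint is sharded and has the usual Hugging Face `model.safetensors.index.json` with a `weight_map`. The helper name `find_tensor_names` is hypothetical and not part of sglang.

```python
import json


def find_tensor_names(index_path, needle):
    """Return sorted tensor names from a safetensors index that contain `needle`.

    Assumes the Hugging Face sharded-checkpoint layout, where the index JSON
    has a "weight_map" dict mapping tensor name -> shard file name.
    """
    with open(index_path) as f:
        index = json.load(f)
    return sorted(name for name in index.get("weight_map", {}) if needle in name)


# Example usage (the path is an assumption, point it at your local copy):
# for name in find_tensor_names("/path/to/GLM-5-MXFP4/model.safetensors.index.json", "eh_proj"):
#     print(name)
```

If the output lists an `eh_proj` entry ending in `weight_scale`, that would be consistent with the warning above, though it does not by itself show where the loader goes wrong.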

Reproduction

https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24615280642/job/71976386638?pr=1091

https://github.com/SemiAnalysisAI/InferenceX/pull/1091/changes#diff-802d3a7be2d0d2932c889be8616a6c220b90cf93b440be7b06cc645414d889bf

python3 -m sglang.launch_server \
    --model-path $MODEL \
    --host=0.0.0.0 \
    --port $PORT \
    --trust-remote-code \
    --tp $TP \
    --chunked-prefill-size 131072 \
    --disable-radix-cache \
    --mem-fraction-static 0.85 \
    --model-loader-extra-config '{"enable_multithread_load": true}' \
    --watchdog-timeout 1200 \
    --reasoning-parser glm45 \
    --tool-call-parser glm47 \
    --speculative-algorithm EAGLE \
    --speculative-num-steps 3 \
    --speculative-eagle-topk 1 \
    --speculative-num-draft-tokens 4 \
    $EVAL_CONTEXT_ARGS > $SERVER_LOG 2>&1 &

Environment

Docker image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm700-mi35x-20260417
Model: amd/GLM-5-MXFP4
