[Bug] fix pp for qwen3_5 (KeyError when reading params) #21184

@jia-wei-tang

Checklist

  • I searched related issues but found no solution.
  • The bug persists in the latest version.
  • Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
  • If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
  • Please use English. Otherwise, it will be closed.

Describe the bug

With pp=8 and tp=1, the scheduler fails with a KeyError during weight loading (log from PP6):

[2026-03-23 10:59:23 PP6] Scheduler hit an exception: Traceback (most recent call last):
  File "/disk1/tjw/sglang/python/sglang/srt/managers/scheduler.py", line 3413, in run_scheduler_process
    scheduler = Scheduler(
                ^^^^^^^^^^
  File "/disk1/tjw/sglang/python/sglang/srt/managers/scheduler.py", line 376, in __init__
    self.init_model_worker()
  File "/disk1/tjw/sglang/python/sglang/srt/managers/scheduler.py", line 593, in init_model_worker
    self.init_tp_model_worker()
  File "/disk1/tjw/sglang/python/sglang/srt/managers/scheduler.py", line 551, in init_tp_model_worker
    self.tp_worker = TpModelWorker(
                     ^^^^^^^^^^^^^^
  File "/disk1/tjw/sglang/python/sglang/srt/managers/tp_worker.py", line 261, in __init__
    self._init_model_runner()
  File "/disk1/tjw/sglang/python/sglang/srt/managers/tp_worker.py", line 344, in _init_model_runner
    self._model_runner = ModelRunner(
                         ^^^^^^^^^^^^
  File "/disk1/tjw/sglang/python/sglang/srt/model_executor/model_runner.py", line 422, in __init__
    self.initialize(pre_model_load_memory)
  File "/disk1/tjw/sglang/python/sglang/srt/model_executor/model_runner.py", line 502, in initialize
    self.load_model()
  File "/disk1/tjw/sglang/python/sglang/srt/model_executor/model_runner.py", line 1079, in load_model
    self.model = self.loader.load_model(
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/disk1/tjw/sglang/python/sglang/srt/model_loader/loader.py", line 689, in load_model
    self.load_weights_and_postprocess(
  File "/disk1/tjw/sglang/python/sglang/srt/model_loader/loader.py", line 698, in load_weights_and_postprocess
    model.load_weights(weights)
  File "/disk1/tjw/sglang/python/sglang/srt/models/qwen3_5.py", line 1372, in load_weights
    param = params_dict[name_mapped]
            ~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'model.layers.4.mlp.experts.w13_weight'
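
One plausible reading of this KeyError, offered only as an assumption: under pipeline parallelism each stage builds parameters for its own slice of layers, so a checkpoint entry for layer 4 has no matching key in the params_dict of stage 6. The toy sketch below illustrates that failure mode and a skip-if-not-local guard. Every name in it (layers_for_stage, load_weights, the even layer split, the layer count of 48) is hypothetical and is not taken from sglang's qwen3_5.py.

```python
# Toy illustration only: hypothetical names, not sglang's actual loader code.

def layers_for_stage(num_layers: int, pp_size: int, pp_rank: int) -> range:
    """Evenly split layer indices across pipeline stages (assumed partitioning)."""
    per_stage = num_layers // pp_size
    start = pp_rank * per_stage
    # The last stage absorbs any remainder layers.
    end = num_layers if pp_rank == pp_size - 1 else start + per_stage
    return range(start, end)


def load_weights(params_dict: dict, weights) -> list:
    """Copy checkpoint entries into local params; return names that were skipped."""
    skipped = []
    for name, value in weights:
        if name not in params_dict:
            # The parameter belongs to another pipeline stage:
            # skip it instead of indexing params_dict and raising KeyError.
            skipped.append(name)
            continue
        params_dict[name] = value
    return skipped


# Stage 6 of 8, with an assumed (not verified) total of 48 layers.
local_layers = layers_for_stage(num_layers=48, pp_size=8, pp_rank=6)
params = {f"model.layers.{i}.mlp.experts.w13_weight": None for i in local_layers}
checkpoint = [(f"model.layers.{i}.mlp.experts.w13_weight", i) for i in range(48)]
skipped = load_weights(params, checkpoint)
```

With the guard in place, the layer 4 entry ends up in skipped on this stage rather than raising. Whether the actual fix belongs at this point in load_weights or in the name-mapping step before it would need to be checked against the real code.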

Reproduction

Run command:

python3 -m sglang.launch_server --model-path /disk1/models/Qwen3.5-122B-A10B-FP8 --disable-radix-cache --mem-fraction-static 0.6 --tp-size 1 --pipeline-parallel-size 8 --trust-remote-code

Environment

8 GPUs (NVIDIA GeForce RTX 4090)
