fix AttributeError: 'LazyValue' object has no attribute 'keys' in eplb_manager.py for qwen3 moe#21822
…b_manager.py for qwen3 moe
Code Review
This pull request removes the LazyValue utility and its usage within the Qwen3 MoE model implementation. Specifically, the routed_experts_weights_of_layer attribute is now initialized directly via a dictionary comprehension in the load_weights method rather than being wrapped in a lazy-loading function. I have no feedback to provide as there were no review comments to evaluate.
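The eager initialization described in the review summary can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeMoELayer`, its `is_moe` flag, and `get_moe_weights` are hypothetical stand-ins, not SGLang classes or methods, and strings are used in place of weight tensors.

```python
from typing import Dict, List


class FakeMoELayer:
    """Hypothetical stand-in for a decoder layer; not an SGLang class."""

    def __init__(self, layer_id: int, is_moe: bool):
        self.layer_id = layer_id
        self.is_moe = is_moe

    def get_moe_weights(self) -> List[str]:
        # Real code would return expert weight tensors; strings keep the sketch light.
        return [f"w13_layer{self.layer_id}", f"w2_layer{self.layer_id}"]


layers = [FakeMoELayer(0, False), FakeMoELayer(1, True), FakeMoELayer(2, True)]

# Eager construction: a plain dict keyed by layer id, built once after loading.
routed_experts_weights_of_layer: Dict[int, List[str]] = {
    layer.layer_id: layer.get_moe_weights() for layer in layers if layer.is_moe
}

# A plain dict supports keys() and indexing without any wrapper.
print(sorted(routed_experts_weights_of_layer.keys()))  # [1, 2]
print(routed_experts_weights_of_layer[1])
```

Because the attribute is an ordinary dict, any consumer that calls `.keys()` or indexes it works unchanged; the trade-off is that the mapping is built even if it is never read.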
Created a matching issue #21833
I've checked that this bug was introduced in v0.5.9, with the intention of speeding up weight loading by creating this attribute only when necessary. Alternatively, we may patch `LazyValue` from utils as follows:

    class LazyValue:
        def __init__(self, creator: Callable):
            self._creator = creator
            self._value = None

        def __getattr__(self, name):  # fix qwen3 coder 480b eplb
            return getattr(self.value, name)

        def __getitem__(self, key):
            return self.value[key]

        def __setitem__(self, key, value):
            self.value[key] = value

        @property
        def value(self):
            if self._creator is not None:
                self._value = self._creator()
                self._creator = None
            return self._value
| ) | ||
| } | ||
| ) | ||
| self.routed_experts_weights_of_layer = { |
Shouldn't the correct fix be to implement a `__getitem__` method on the `LazyValue` class?
| def __getattr__(self, name): | ||
| return getattr(self.value, name) | ||
|
|
||
| def __getitem__(self, key): |
We need to check whether `self.value` itself supports `__getitem__`; `self.value` might be something other than a dict.
If the value is something other than a dict or list, `LazyValue.__getitem__` raises exactly the same exception that the value itself would raise.
I think this is OK.
Or am I missing something?

    from typing import Callable

    class LazyValue:
        def __init__(self, creator: Callable):
            self._creator = creator
            self._value = None

        def __getattr__(self, name):
            return getattr(self.value, name)

        def __getitem__(self, key):
            return self.value[key]

        def __setitem__(self, key, value):
            self.value[key] = value

        @property
        def value(self):
            if self._creator is not None:
                self._value = self._creator()
                self._creator = None
            return self._value

    d = {10: 'ten'}
    lvd = LazyValue(lambda: d)
    print(lvd.keys())  # dict_keys([10])

    d = 1
    lvd = LazyValue(lambda: d)
    try:
        print(lvd.keys())
    except Exception as e:
        print(f'{e}')  # 'int' object has no attribute 'keys'
    try:
        print(lvd[0])
    except Exception as e:
        print(f'{e}')  # 'int' object is not subscriptable
    try:
        print(d[0])
    except Exception as e:
        print(f'{e}')  # 'int' object is not subscriptable
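One detail relevant to the `__getattr__` versus `__getitem__` discussion: Python looks up implicit special methods on the type rather than through the instance's `__getattr__`, which is why subscription needs an explicit `__getitem__` on the wrapper. The sketch below reuses the patched class from this thread (minus `__setitem__`) to illustrate that; it is an illustration only, not a statement about which operations SGLang actually performs on the attribute.

```python
from typing import Callable


class LazyValue:
    def __init__(self, creator: Callable):
        self._creator = creator
        self._value = None

    def __getattr__(self, name):
        # Only reached for names not found through normal attribute lookup.
        return getattr(self.value, name)

    def __getitem__(self, key):
        return self.value[key]

    @property
    def value(self):
        if self._creator is not None:
            self._value = self._creator()
            self._creator = None
        return self._value


lv = LazyValue(lambda: {10: "ten"})

print(list(lv.keys()))  # named method: forwarded by __getattr__
print(lv[10])           # subscription: handled by the explicit __getitem__

try:
    len(lv)             # implicit __len__ lookup goes to the type, not __getattr__
    len_error = None
except TypeError as exc:
    len_error = str(exc)
print(len_error)
```

If callers ever need `len()`, iteration, or `in` on the wrapped value, each of those would require its own forwarding method on the wrapper, or the caller would have to unwrap `.value` first.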
/tag-and-rerun-ci
**Upstream status** as of 2026-04-06:
- Qwen3.5: fixed via [PR #19767](sgl-project/sglang#19767) (merged 2026-03-09, included in v0.5.10)
- Qwen3: [PR #21461](sgl-project/sglang#21461), closed without merge 2026-03-30 (CI failure), superseded by #21822
- Qwen3: [PR #21822](sgl-project/sglang#21822), new fix opened 2026-03-26, addresses `AttributeError: 'LazyValue' object has no attribute 'keys'` in `eplb_manager.py` for Qwen3 MoE. Code review 2026-04-04 by `Fridge003` and `Evgueni-Petrov-aka-espetrov`. Alternative `LazyValue.__getattr__` approach proposed (avoids modifying the model class). **Approved** by `Fridge003` on 2026-04-06, CI rerun triggered, awaiting merge. (Duplicate [PR #21820](sgl-project/sglang#21820) was closed same day in favour of #21822.) Not in v0.5.10

When `--enable-eplb` is active with EP, the `EPLBManager` crashes after its first rebalance interval (default: 1000 forward passes):

- SGLang PR #17137: non-Marlin WNA16MoE port (does not fix EP bug)
- SGLang #14158: update_weights_from_tensor for WNA16MoE (unrelated)
- SGLang [PR #13715](sgl-project/sglang#13715): fix EPLB + FP4 weight tensor filtering (merged, different issue)
- SGLang [PR #20963](sgl-project/sglang#20963): Nvidia modelopt refactoring (1/N). Under active review: reviewer `Edwardf0t1` asked for end-to-end verification 2026-03-31, author `wenscarl` responded 2026-04-01 and posted 3 further inline review responses 2026-04-06. Not stalled but awaiting approval. Migrates the NVFP4 code as-is; expected vehicle for EP-awareness fixes (#20869, #21630). Watch this PR for resolution of the NVFP4 input_scale and CutlassMoEParams bugs
- SGLang [PR #21822](sgl-project/sglang#21822): new EPLB/Qwen3 fix (opened 2026-03-26). Addresses `LazyValue.keys()` AttributeError. Code review 2026-04-04 by `Fridge003` and `Evgueni-Petrov-aka-espetrov`. Alternative `LazyValue.__getattr__` approach proposed. **Approved** by `Fridge003` on 2026-04-06, CI rerun triggered, awaiting merge

"Good code is like humor: when you have to explain it, it’s bad." - Cory House

P.S.: Code reviews and approvals are crucial for maintaining high-quality software.
Looks like all failures are unrelated to this PR?
stage-b-test-1-gpu-large-amd (linux-mi325-1gpu-sglang, 1)
stage-c-test-large-8-gpu-amd-mi35x (linux-mi35x-gpu-8, 0)
stage-c-test-large-8-gpu-amd-mi35x (linux-mi35x-gpu-8, 1)
…b_manager.py for qwen3 moe (sgl-project#21822)
Motivation
Found this typo while tuning SGLang for Qwen3 Coder 480B in a disaggregated prefill-decode setup.
The exception is thrown by `eplb_manager.py` during the first attempt to rebalance the experts.
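A minimal reproduction of this failure mode might look like the sketch below. It assumes the unpatched wrapper exposed only a `value` property (inferred from the patch proposed in the review thread, not confirmed against the source tree); `UnpatchedLazyValue` and `collect_layer_ids` are hypothetical names standing in for the wrapper and for whatever code in the rebalancer treats the attribute as a dict.

```python
from typing import Callable


class UnpatchedLazyValue:
    """Assumed shape of the wrapper before any fix: only a `value` property."""

    def __init__(self, creator: Callable):
        self._creator = creator
        self._value = None

    @property
    def value(self):
        if self._creator is not None:
            self._value = self._creator()
            self._creator = None
        return self._value


def collect_layer_ids(weights_of_layer):
    # Hypothetical consumer: treats its argument as a dict, as a rebalancer might.
    return sorted(weights_of_layer.keys())


wrapped = UnpatchedLazyValue(lambda: {0: "experts_0", 1: "experts_1"})

try:
    collect_layer_ids(wrapped)
    message = None
except AttributeError as exc:
    message = str(exc)

print(message)                            # the wrapper itself has no keys()
print(collect_layer_ids(wrapped.value))   # unwrapping first works
```

The error only surfaces when the consumer first touches the attribute, which is consistent with the crash appearing at the first rebalance rather than at load time.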
Modifications
Get rid of the unnecessary `LazyValue` wrapper.
Accuracy Tests
n/a
Speed Tests and Profiling
n/a