Conversation
@zhyncs I wonder if you could wait a bit; I have a PR coming.
@HaiShaw Could you build on top of this PR? This is an urgent fix for the main branch.
Sounds good, I will just need to shrink down my diff. |
FYI: All 2-GPU unit tests, performance tests, and accuracy tests failed because the runner machine itself ran out of memory. I ran them manually in my local development environment without any issues. Please ignore.
"""

def __init__(self, quant_config: Fp8Config):
def __new__(cls, *args, **kwargs):
The class needs to inherit from FusedMoEMethodBase, but importing it directly at module level causes a circular dependency. For now, the base class is attached dynamically to avoid this issue.
Most other changes are what I spotted too, just __new__ doesn't seem to be necessary?
__new__ is used here because we need to modify the class inheritance before instance creation. It's the only method that runs before __init__ and allows us to control how the instance is created, letting us break the circular import by setting up inheritance at runtime rather than import time.
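The runtime-inheritance idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `MethodBase`, `load_base`, and `Fp8Method` are hypothetical names, and `load_base` stands in for an import statement placed inside `__new__` so that it runs at instantiation time rather than at module import time.

```python
class MethodBase:
    """Hypothetical base class; imagine it lives in a module that imports ours."""

    def describe(self):
        return f"{type(self).__name__} (base: MethodBase)"


def load_base():
    # Stand-in (assumption) for a deferred import executed inside __new__,
    # i.e. only after both modules have finished loading.
    return MethodBase


class Fp8Method:
    _runtime_cls = None  # cache so the derived class is built only once

    def __new__(cls, *args, **kwargs):
        if cls._runtime_cls is None:
            base = load_base()
            # Copy the class body, skipping slots that type() creates itself
            # and the hook/cache that must not be inherited.
            skip = ("__dict__", "__weakref__", "__new__", "_runtime_cls")
            namespace = {k: v for k, v in cls.__dict__.items() if k not in skip}
            cls._runtime_cls = type(cls.__name__, (base,), namespace)
        obj = object.__new__(cls._runtime_cls)
        # Python skips the automatic __init__ call because obj is not an
        # instance of cls, so it has to be invoked explicitly.
        obj.__init__(*args, **kwargs)
        return obj

    def __init__(self, quant_config):
        self.quant_config = quant_config


method = Fp8Method("fp8-config")
print(isinstance(method, MethodBase))
print(method.describe())
```

One trade-off of this pattern is that the returned object is an instance of the dynamically built class, so `isinstance(method, Fp8Method)` is false. Deferring the import into the method that needs it is a simpler alternative when inheritance itself is not required.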
If we use apply in fp8.py and remove the apply setting in __init__.py, would that simply be OK?
Thanks, let me take a look. The ROCm tests on my side raised no complaints, so it is worth a check.
python3 -c "from sglang.srt.layers.fused_moe_triton.fused_moe import fused_moe"
I see it too. It is only used in benchmark scripts, so we will fix it; let me continue tomorrow.
By "ignore", do you mean the failed cases?
Yes.
The nightly gsm8k test and the following gpu-2 tests have now been verified locally. It seems to be an issue with the GPU runner. @merrymercy Please help fix the GPU runner issue. Thanks. https://github.com/sgl-project/sglang/actions/runs/12212009601/job/34069970746
bash test.sh ✅
#!/bin/bash
set -ex
python3 test_data_parallelism.py
python3 test_mla.py
python3 test_mla_fp8.py
python3 test_dp_attention.py
python3 test_update_weights_from_distributed.py
python3 test_moe_ep.py
python3 -m unittest test_bench_one_batch.TestBenchOneBatch.test_moe_default
python3 -m unittest test_bench_serving.TestBenchServing.test_moe_offline_throughput_default
python3 -m unittest test_bench_serving.TestBenchServing.test_moe_offline_throughput_without_radix_cache
Let me further explain the fix and design intention of this PR.
Motivation
fix #2386 #2370 #2366
cc @BBuf @HaiShaw
Modifications
Checklist