refine sgl_moe_align_block_size_benchmark #4327

Merged: zhyncs merged 2 commits into main from refine_sgl_moe_align_block_size_benchmark on Mar 12, 2025
@BBuf (Collaborator) commented Mar 12, 2025

Part of #2965

Benchmark output on H200:

INFO 03-12 04:02:20 __init__.py:190] Automatically detected platform cuda.
✅ SGL and Triton implementations match
✅ SGL and VLLM implementations match
moe-align-block-size-performance:
     num_tokens  num_experts  topk        SGL      Triton        VLLM
0          16.0         32.0   2.0  16.736001   25.312001   16.832000
1          16.0         32.0   4.0  16.767999   25.760001   16.992001
2          16.0         32.0   8.0  16.928000   26.528001   17.152000
3          16.0         64.0   2.0  17.664000   27.488001   20.128001
4          16.0         64.0   4.0  17.632000   27.584000   20.191999
5          16.0         64.0   8.0  17.696001   28.480001   20.288000
6          16.0        128.0   2.0  19.584000   31.968001   31.392001
7          16.0        128.0   4.0  19.743999   32.095999   31.583998
8          16.0        128.0   8.0  19.936001   32.127999   31.615999
9          16.0        256.0   2.0  22.655999   42.911999   65.151997
10         16.0        256.0   4.0  22.816001   43.168001   65.183997
11         16.0        256.0   8.0  22.816001   43.104000   65.375999
12         32.0         32.0   2.0  16.896000   25.728000   16.896000
13         32.0         32.0   4.0  16.767999   26.335999   17.055999
14         32.0         32.0   8.0  17.120000   27.232001   17.696001
15         32.0         64.0   2.0  17.664000   27.584000   20.128001
16         32.0         64.0   4.0  17.664000   28.448001   20.288000
17         32.0         64.0   8.0  17.632000   29.376000   20.288000
18         32.0        128.0   2.0  19.616000   32.063998   31.488001
19         32.0        128.0   4.0  19.808000   32.159999   31.615999
20         32.0        128.0   8.0  19.840000   32.703999   31.776000
21         32.0        256.0   2.0  22.816001   43.136001   65.215997
22         32.0        256.0   4.0  22.848001   43.264002   65.343998
23         32.0        256.0   8.0  23.008000   43.680001   65.536000
24         64.0         32.0   2.0  16.896000   26.303999   17.088000
25         64.0         32.0   4.0  17.088000   27.200000   17.759999
26         64.0         32.0   8.0  16.672000   28.224001   18.784000
27         64.0         64.0   2.0  17.664000   28.480001   20.320000
28         64.0         64.0   4.0  17.503999   29.536000   20.463999
29         64.0         64.0   8.0  18.400000   31.104000   21.632001
30         64.0        128.0   2.0  19.776000   32.288000   31.647999
31         64.0        128.0   4.0  19.872000   32.639999   31.808000
32         64.0        128.0   8.0  19.776000   33.696000   32.000002
33         64.0        256.0   2.0  22.816001   43.296002   65.343998
34         64.0        256.0   4.0  23.040000   43.648001   65.504000
35         64.0        256.0   8.0  23.167999   44.863999   66.111997
36        128.0         32.0   2.0  16.896000   27.248001   17.728001
37        128.0         32.0   4.0  16.672000   28.255999   18.784000
38        128.0         32.0   8.0  16.896000   31.168001   22.816001
39        128.0         64.0   2.0  17.503999   29.600000   20.479999
40        128.0         64.0   4.0  18.432001   30.975999   21.600001
41        128.0         64.0   8.0  18.239999   32.448001   22.832001
42        128.0        128.0   2.0  19.776000   32.703999   31.776000
43        128.0        128.0   4.0  19.776000   33.824001   31.808000
44        128.0        128.0   8.0  19.776000   35.647999   32.575998
45        128.0        256.0   2.0  23.008000   43.744002   65.536000
46        128.0        256.0   4.0  23.200000   44.927999   66.032000
47        128.0        256.0   8.0  23.584001   44.319998   66.704005
48        256.0         32.0   2.0  16.608000   28.255999   18.784000
49        256.0         32.0   4.0  16.928000   31.199999   22.784000
50        256.0         32.0   8.0  17.344000   37.087999   29.664000
51        256.0         64.0   2.0  18.271999   31.168001   21.600001
52        256.0         64.0   4.0  18.239999   32.448001   22.976000
53        256.0         64.0   8.0  18.784000   35.360001   27.200000
54        256.0        128.0   2.0  19.776000   33.728000   31.968001
55        256.0        128.0   4.0  19.776000   35.840001   32.575998
56        256.0        128.0   8.0  20.416001   38.656000   36.352001
57        256.0        256.0   2.0  23.104001   44.831999   66.079997
58        256.0        256.0   4.0  23.584001   44.383999   66.671997
59        256.0        256.0   8.0  23.584001   46.080001   67.648001
60        512.0         32.0   2.0  16.896000   31.168001   22.816001
61        512.0         32.0   4.0  17.312000   37.103999   29.696001
62        512.0         32.0   8.0  17.824000   48.735999   44.032000
63        512.0         64.0   2.0  18.224001   32.384001   22.927999
64        512.0         64.0   4.0  18.784000   35.360001   26.688000
65        512.0         64.0   8.0  19.552000   40.959999   41.407999
66        512.0        128.0   2.0  19.776000   35.744000   32.607999
67        512.0        128.0   4.0  20.512000   38.688000   36.192000
68        512.0        128.0   8.0  20.703999   42.528000   43.248001
69        512.0        256.0   2.0  23.584001   44.256002   66.688001
70        512.0        256.0   4.0  23.631999   46.016000   67.616001
71        512.0        256.0   8.0  24.192000   50.719999   71.039997
72       1024.0         32.0   2.0  17.312000   37.055999   29.664000
73       1024.0         32.0   4.0  17.824000   48.672002   44.256002
74       1024.0         32.0   8.0  19.967999   72.159998   79.712003
75       1024.0         64.0   2.0  18.751999   35.360001   26.815999
76       1024.0         64.0   4.0  19.392001   40.895998   42.112000
77       1024.0         64.0   8.0  21.344000   52.767999   65.407999
78       1024.0        128.0   2.0  20.384001   38.624000   35.840001
79       1024.0        128.0   4.0  20.784000   42.512000   43.264002
80       1024.0        128.0   8.0  22.208000   48.032001   55.039998
81       1024.0        256.0   2.0  23.776000   46.048000   67.616001
82       1024.0        256.0   4.0  24.351999   50.912000   71.071997
83       1024.0        256.0   8.0  26.176000   56.992002   81.440002
84       2048.0         32.0   2.0  17.728001   48.656002   44.224001
85       2048.0         32.0   4.0  20.160001   72.159998   79.664007
86       2048.0         32.0   8.0  24.992000  117.151998  148.512006
87       2048.0         64.0   2.0  19.376000   40.927999   41.919999
88       2048.0         64.0   4.0  21.312000   52.800000   65.375999
89       2048.0         64.0   8.0  27.295999   76.159999  112.095997
90       2048.0        128.0   2.0  20.880001   42.479999   43.264002
91       2048.0        128.0   4.0  22.240000   48.128001   55.039998
92       2048.0        128.0   8.0  27.456000   59.904002   80.480002
93       2048.0        256.0   2.0  24.192000   50.976001   71.039997
94       2048.0        256.0   4.0  26.176000   56.623999   81.440002
95       2048.0        256.0   8.0  32.032002   64.960003   98.112002
96       4096.0         32.0   2.0  19.967999   72.223999   78.592002
97       4096.0         32.0   4.0  24.992000  117.183998  148.479998
98       4096.0         32.0   8.0  42.016000  208.128005  295.231998
99       4096.0         64.0   2.0  21.312000   52.735999   64.928003
100      4096.0         64.0   4.0  27.295999   76.223999  112.000003
101      4096.0         64.0   8.0  44.831999  121.424004  203.999996
102      4096.0        128.0   2.0  22.240000   48.096001   55.135999
103      4096.0        128.0   4.0  27.456000   59.999999   80.256000
104      4096.0        128.0   8.0  44.064000   83.328001  132.287994
105      4096.0        256.0   2.0  26.079999   56.607999   81.376001
106      4096.0        256.0   4.0  32.032002   65.103993   98.304003
107      4096.0        256.0   8.0  45.343999   78.720003  131.968006
108      8192.0         32.0   2.0  25.040001  117.215998  146.975994
109      8192.0         32.0   4.0  41.887999  208.287999  294.079989
110      8192.0         32.0   8.0  66.175997  381.408006  614.271998
111      8192.0         64.0   2.0  27.424000   76.207995  112.032004
112      8192.0         64.0   4.0  44.831999  121.376000  204.352006
113      8192.0         64.0   8.0  71.712002  213.407993  389.023989
114      8192.0        128.0   2.0  27.456000   59.935998   80.288000
115      8192.0        128.0   4.0  44.176001   83.296001  132.159993
116      8192.0        128.0   8.0  68.127997  128.703997  233.119994
117      8192.0        256.0   2.0  32.000002   65.024003   98.144002
118      8192.0        256.0   4.0  45.440000   78.752004  131.712005
119      8192.0        256.0   8.0  68.143994  102.816001  453.696012
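
To make the gaps in the table easier to read, here is a minimal sketch that computes the ratio of the Triton and VLLM columns to the SGL column for a few rows copied from the output above. It assumes the three columns are latencies where lower is better; the log itself does not state the unit. The `rows` list and the `speedups` helper are illustrative names, not part of the benchmark script.

```python
# Rows copied verbatim from the benchmark output above:
# (num_tokens, num_experts, topk, SGL, Triton, VLLM)
rows = [
    (16, 32, 2, 16.736001, 25.312001, 16.832000),
    (1024, 256, 8, 26.176000, 56.992002, 81.440002),
    (8192, 32, 8, 66.175997, 381.408006, 614.271998),
    (8192, 256, 8, 68.143994, 102.816001, 453.696012),
]


def speedups(row):
    """Return (Triton / SGL, VLLM / SGL) for one table row.

    Assumption: the columns are latencies, so a ratio above 1.0
    means the SGL kernel was faster for that configuration.
    """
    _, _, _, sgl, triton, vllm = row
    return triton / sgl, vllm / sgl


for row in rows:
    vs_triton, vs_vllm = speedups(row)
    print(
        f"tokens={row[0]:>5} experts={row[1]:>3} topk={row[2]}: "
        f"{vs_triton:.2f}x vs Triton, {vs_vllm:.2f}x vs VLLM"
    )
```

Under that assumption, the ratios stay close to 1x against VLLM for the smallest configuration and grow for the largest token counts.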

@zhyncs merged commit 7130a7c into main on Mar 12, 2025
@zhyncs deleted the refine_sgl_moe_align_block_size_benchmark branch on March 12, 2025 05:48
hebiao064 pushed a commit to hebiao064/sglang that referenced this pull request on Mar 13, 2025