The current MFU estimator assumes all layers share the same structure. Models like Qwen3.5-397B-A17B have non-uniform layers, so the FLOPs and memory-bandwidth estimates it produces would be inaccurate for such architectures. Refer to the following comment:
> Other models with varying layer widths or mixed configurations may also not fit the current assumptions.
>
> Should we consider cases where the model's layer structures differ? For example, the latest Qwen3.5-397B-A17B: https://huggingface.co/Qwen/Qwen3.5-397B-A17B. Of course, I think this issue can be addressed later.

Originally posted by @sufeng-buaa in #19395 (comment)
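One way to handle this would be to accumulate a per-layer estimate instead of multiplying one layer's cost by the layer count. The sketch below is illustrative only: the `LayerSpec` fields, function names, and FLOPs formula (attention projections plus a gated MLP, 2 FLOPs per multiply-accumulate, ignoring attention-score and embedding FLOPs) are assumptions, not the estimator's actual API.

```python
# Minimal sketch of per-layer FLOPs accounting for models whose layers
# differ (e.g. mixed dense/MoE layers or varying widths).
# All names and the cost formula are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class LayerSpec:
    hidden_size: int              # layer input/output width
    intermediate_size: int        # MLP width (per expert for MoE layers)
    num_active_experts: int = 1   # 1 for a dense layer


def layer_flops_per_token(layer: LayerSpec) -> int:
    """Approximate forward FLOPs per token for one transformer layer.

    Counts the four attention projections (Q, K, V, O) and a gated MLP
    (up, gate, down), at 2 FLOPs per multiply-accumulate. Attention-score
    FLOPs are sequence-length dependent and omitted here for brevity.
    """
    h, m = layer.hidden_size, layer.intermediate_size
    attn = 2 * 4 * h * h                                  # Q, K, V, O projections
    mlp = 2 * 3 * h * m * layer.num_active_experts        # up/gate/down per active expert
    return attn + mlp


def model_flops_per_token(layers: list[LayerSpec]) -> int:
    # Sum per-layer estimates rather than assuming
    # flops(layer_0) * num_layers, which breaks when layers differ.
    return sum(layer_flops_per_token(layer) for layer in layers)


# Example: a model whose first layer is dense and whose remaining
# layers are MoE with narrower per-expert MLPs.
layers = [LayerSpec(4096, 12288)] + [LayerSpec(4096, 1536, num_active_experts=2)] * 3
print(model_flops_per_token(layers))
```

The same per-layer loop would apply to the memory-bandwidth estimate: sum each layer's parameter and activation bytes instead of scaling a single representative layer.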