I've seen that #42697 introduced the option of using grouped matrix multiplication for MoE layers. Is there a reason gpt-oss was left out? (I know that its MLP layer is decorated with MegaBlocksMoeMLP.)
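For context, here is a minimal sketch of what I mean by grouped matmul for MoE experts. This is not the code from #42697, just an illustration: the names (num_experts, tokens_per_expert, expert_weights) are made up, and I use torch.bmm with equal-sized groups as a stand-in, whereas a real grouped-GEMM kernel handles ragged per-expert token counts without padding.

```python
# Conceptual sketch only -- not the implementation from #42697.
# Contrasts a per-expert loop with a single batched call over all experts.
import torch

num_experts, tokens_per_expert, d_model, d_ff = 4, 8, 16, 32
# Tokens already routed and sorted by expert: (num_experts, tokens_per_expert, d_model)
routed_tokens = torch.randn(num_experts, tokens_per_expert, d_model)
# One weight matrix per expert: (num_experts, d_model, d_ff)
expert_weights = torch.randn(num_experts, d_model, d_ff)

# Baseline: one small matmul per expert (many separate kernel launches).
looped = torch.stack(
    [routed_tokens[e] @ expert_weights[e] for e in range(num_experts)]
)

# Grouped/batched alternative: a single call covering all experts at once.
grouped = torch.bmm(routed_tokens, expert_weights)

assert torch.allclose(looped, grouped, atol=1e-5)
```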