Fix EP post merge by ArthurZucker · Pull Request #43730 · huggingface/transformers

ArthurZucker · 2026-02-04T09:23:04Z

What does this PR do?

Fix GPT-OSS ???

EP sharding works, but the forward pass is broken: it fails in eager mode and outputs gibberish with grouped_mm.
I will work on a fix to include in the patch!
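
For readers unfamiliar with expert parallelism (EP), here is a minimal pure-Python sketch of the idea behind "EP sharding" plus the all-reduce this PR adds. It is an illustration under stated assumptions, not the transformers implementation: the contiguous expert split, the top-1 routing, and the names `local_expert_ids` and `ep_forward` are all hypothetical.

```python
def local_expert_ids(num_experts, rank, world_size):
    """Experts owned by `rank`, assuming a contiguous, evenly divisible split."""
    per_rank = num_experts // world_size
    return range(rank * per_rank, (rank + 1) * per_rank)


def ep_forward(tokens, routing, experts, world_size):
    """Each rank applies only its local experts; a sum all-reduce merges the results."""
    partials = []
    for rank in range(world_size):
        owned = set(local_expert_ids(len(experts), rank, world_size))
        # Tokens routed to an expert on another rank contribute 0 locally.
        partials.append(
            [experts[e](t) if e in owned else 0 for t, e in zip(tokens, routing)]
        )
    # Stand-in for the all-reduce: elementwise sum over ranks.
    return [sum(vals) for vals in zip(*partials)]


# Four toy "experts" (hypothetical): expert i multiplies its input by i + 1.
experts = [lambda t, k=i + 1: t * k for i in range(4)]
tokens = [10, 20, 30]
routing = [3, 0, 2]  # top-1 expert index per token
print(ep_forward(tokens, routing, experts, world_size=2))  # [40, 20, 90]
```

If the all-reduce step is skipped, each rank only sees the outputs of its own experts, which is one way a forward pass can run but still produce wrong results.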

Fix any TP model

fix get shard tensor
fix get packed shard weights
fix tensor_idx
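
To make the "packed shard weights" item concrete, the sketch below shows why a packed weight cannot be sharded as one contiguous chunk. It assumes a hypothetical layout in which the gate and up blocks are concatenated along dim 0; real checkpoints may interleave them instead, and `shard_range` and `shard_packed_rows` are illustrative names, not functions from the library.

```python
def shard_range(size, rank, world_size):
    """Return the [start, end) slice of `size` rows owned by `rank` (even split assumed)."""
    assert size % world_size == 0, "sketch assumes an evenly divisible dimension"
    block = size // world_size
    return rank * block, (rank + 1) * block


def shard_packed_rows(packed, num_parts, rank, world_size):
    """Shard a tensor that packs `num_parts` equal blocks along dim 0.

    Each block (e.g. gate, then up) is sliced separately so every rank
    receives its share of every block, not a contiguous chunk of the whole.
    """
    part_size = len(packed) // num_parts
    out = []
    for p in range(num_parts):
        start, end = shard_range(part_size, rank, world_size)
        out.extend(packed[p * part_size + start : p * part_size + end])
    return out


# Hypothetical packed weight: 4 "gate" rows followed by 4 "up" rows.
packed = ["g0", "g1", "g2", "g3", "u0", "u1", "u2", "u3"]
rank0 = shard_packed_rows(packed, num_parts=2, rank=0, world_size=2)
rank1 = shard_packed_rows(packed, num_parts=2, rank=1, world_size=2)
print(rank0)  # ['g0', 'g1', 'u0', 'u1']
print(rank1)  # ['g2', 'g3', 'u2', 'u3']
```

A naive contiguous split would hand rank 0 all of the gate rows and rank 1 all of the up rows, which is the kind of mismatch a packed-shard fix has to avoid.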

HuggingFaceDocBuilderDev · 2026-02-04T09:43:34Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


3outeille · 2026-02-04T13:56:10Z

        shape[dim] = end - start
        return tuple(shape)

+class AllReduce(TensorParallelLayer):


We should rename it; otherwise it becomes hard to understand, given that we already have:

    def all_reduce_forward(x, device_mesh):
        """All-reduce forward, identity backward. Use after rowwise layers."""
        return _AllReduceForward.apply(x, device_mesh)

We will use moe_tp_experts instead (it handles backward).
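
The docstring "Use after rowwise layers" can be illustrated without any distributed setup. The sketch below is a pure-Python stand-in, not the library code: it assumes a rowwise layer shards the input dimension of the weight across ranks, so each rank produces a partial output and a sum all-reduce is needed to recover the full result. The helpers `matvec`, `all_reduce_sum`, and `rowwise_forward` are hypothetical.

```python
def matvec(weight, x):
    """Compute weight @ x for a list-of-rows matrix and a list vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def all_reduce_sum(partials):
    """Stand-in for a sum all-reduce: every rank ends up with the elementwise sum."""
    total = [sum(vals) for vals in zip(*partials)]
    return [list(total) for _ in partials]


def rowwise_forward(weight, x, world_size):
    """Shard the input dimension across ranks, then all-reduce the partial outputs."""
    block = len(x) // world_size
    partials = []
    for rank in range(world_size):
        start, end = rank * block, (rank + 1) * block
        local_w = [row[start:end] for row in weight]  # this rank's weight columns
        local_x = x[start:end]                        # matching slice of the input
        partials.append(matvec(local_w, local_x))
    return all_reduce_sum(partials)


weight = [[1, 2, 3, 4], [5, 6, 7, 8]]
x = [1, 0, 2, 1]
outputs = rowwise_forward(weight, x, world_size=2)
print(outputs[0])         # [11, 27]
print(matvec(weight, x))  # [11, 27]
```

After the reduction every simulated rank holds the same output as the unsharded layer, which is the property the forward all-reduce is meant to restore.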

github-actions · 2026-02-05T14:29:34Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4_moe_lite, gpt_oss, minimax_m2, solar_open, fbgemm_fp8

3outeille · 2026-02-05T14:31:41Z

lgtm nice job !

* restore
* add all reduce for ep
* fix init and bias sharding
* fix finalize weight init
* add full stacktracing
* fix
* okay big improvement here
* the only case shard index should be used is when we are acctually collecting for mergeModuleList
* more fixes
* fix EP forward gpt oss
* revert some shit
* when you are stupid sometimes you really need a brain :) :) :) :)
* fix TP
* Ok GPT oss is fixed now
* try to fix perms
* attempt to fix
* am I a doomer and AI is not that bad?
* fix
* it "passes" but the output is shit
* style my man
* outputs are gonna be giberish but at least the forward pass "works"
* dtyle
* fix mixtral
* okay shape fixes
* tensor idx is only for groupped gemm / EP
* fix gate_up shard
* fix :)
* revert some EP changes that are breaking other stuff
* style
* use moe_tp_experts
* revert unrelated, last nits and style
* good?
* fix modlar

---------

Co-authored-by: 3outeille <ferdinand.mom@epita.fr>

ArthurZucker force-pushed the fix-ep branch from 629d303 to c084aa7 on February 4, 2026 09:25

restore

ec2ed1d

ArthurZucker force-pushed the fix-ep branch from 596897f to ec2ed1d on February 4, 2026 09:31

add all reduce for ep

33ca330

ArthurZucker added 3 commits February 4, 2026 10:32

fix init and bias sharding

e545ac1

fix finalize weight init

fa78068

add full stacktracing

6e4d234

ArthurZucker changed the title from "up" to "Fix EP post merge" on Feb 4, 2026

ArthurZucker added 3 commits February 4, 2026 12:55

fix

05fc1fa

okay big improvement here

819698c

the only case shard index should be used is when we are acctually collecting for mergeModuleList

d99f834

3outeille reviewed Feb 4, 2026


ArthurZucker added 17 commits February 4, 2026 14:06

more fixes

f0d0de1

fix EP forward gpt oss

c5cbdc8

revert some shit

94d676c

when you are stupid sometimes you really need a brain :) :) :) :)

959b46f

fix TP

01c5774

Ok GPT oss is fixed now

9dbb634

try to fix perms

8374298

attempt to fix

8e46655

am I a doomer and AI is not that bad?

104f80d

fix

14dca0c

it "passes" but the output is shit

3600fbe

Merge branch 'fix-ep' of github.com:huggingface/transformers into fix-ep

fabde8a

style my man

20dee9a

outputs are gonna be giberish but at least the forward pass "works"

0b95c64

Merge branch 'main' of github.com:huggingface/transformers into fix-ep

9137fa2

dtyle

646cbe3

fix mixtral

4b32a6b

ArthurZucker added 6 commits February 5, 2026 11:15

okay shape fixes

1460762

tensor idx is only for groupped gemm / EP

f789395

fix gate_up shard

8b4ed7b

fix :)

d8cd533

revert some EP changes that are breaking other stuff

76c904c

style

cfd92d7

3outeille added a commit that referenced this pull request Feb 5, 2026

Merge PR #43730 (fix-ep) into fix-moe-ep

73e77fb

3outeille mentioned this pull request Feb 5, 2026

🚨 fix + tests dense & MoE TP all reduce (decoder only) #43722

Merged

3outeille reviewed Feb 5, 2026


Comment thread src/transformers/models/solar_open/modular_solar_open.py Outdated

3outeille reviewed Feb 5, 2026


Comment thread src/transformers/models/minimax_m2/modular_minimax_m2.py Outdated

3outeille reviewed Feb 5, 2026


Comment thread src/transformers/quantizers/quantizer_fbgemm_fp8.py Outdated

3outeille reviewed Feb 5, 2026


Comment thread src/transformers/core_model_loading.py Outdated

use moe_tp_experts

7ad14a7

3outeille approved these changes Feb 5, 2026


ArthurZucker added 3 commits February 5, 2026 14:35

revert unrelated, last nits and style

c3ae670

good?

45dd124

fix modlar

46900e0

ArthurZucker merged commit e1f3766 into main Feb 5, 2026
24 of 26 checks passed

ArthurZucker deleted the fix-ep branch February 5, 2026 15:23

ArthurZucker mentioned this pull request Apr 20, 2026

Fix EP: RouterParallel shape, tp_plan property, grouped_mm sentinels #45473

Merged

2 tasks

ArthurZucker merged 35 commits into main from fix-ep (Fix EP post merge #43730)


3 participants
