
[MoE][PoC] Expert Parallel: tp and tp2ep #731

Draft
tianyu-l wants to merge 3 commits into gh/tianyu-l/25/base from gh/tianyu-l/25/head

Conversation

tianyu-l (Contributor) commented Dec 12, 2024

Stack from ghstack (oldest at bottom):

Issues (12/11/2024)

  • forward collectives look correct ("tp2ep": all-gather -> compute -> reduce-scatter; see the sketch after this list), but the backward pass needs to be understood better
  • torch.compile generates a full graph when applied per TransformerBlock, but inserts an additional A2A at the end of every two blocks (see the second sketch below)
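A minimal sketch of the forward pattern described in the first item, assuming a tp2ep-style expert layer whose token shards are all-gathered across the expert-parallel group, run through this rank's shard of the experts, and reduce-scattered back. `tp2ep_forward`, `local_experts`, and `ep_group` are illustrative names, not torchtitan's actual API:

```python
import torch
import torch.distributed as dist

def tp2ep_forward(x: torch.Tensor, local_experts: torch.nn.Module,
                  ep_group: dist.ProcessGroup) -> torch.Tensor:
    world = dist.get_world_size(ep_group)
    # AG: gather every rank's token shard so each rank sees all tokens
    gathered = [torch.empty_like(x) for _ in range(world)]
    dist.all_gather(gathered, x, group=ep_group)
    tokens = torch.cat(gathered, dim=0)
    # compute: apply this rank's shard of the experts to the gathered tokens,
    # producing partial outputs that still need a cross-rank reduction
    partial = local_experts(tokens)
    # RS: sum the partial outputs across ranks and scatter token shards back
    out = torch.empty_like(x)
    dist.reduce_scatter(out, list(partial.chunk(world, dim=0)), group=ep_group)
    return out
```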

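And a hedged sketch of applying torch.compile per TransformerBlock, as in the second item; `model.layers` is an assumed module layout rather than torchtitan's exact structure:

```python
import torch

def compile_per_block(model: torch.nn.Module) -> torch.nn.Module:
    # Compile each TransformerBlock as its own full graph; collectives that
    # cross block boundaries (e.g. the extra A2A observed above) fall outside
    # these compiled regions.
    for name, block in model.layers.named_children():
        model.layers.register_module(name, torch.compile(block, fullgraph=True))
    return model
```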
Not included

  • softmax scoring when Router Parallel is used (currently only sigmoid scoring; a router sketch follows below)
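For context on that last point, a hedged sketch of router scoring: my assumption (not stated in the PR) is that sigmoid scores each expert independently and so composes with sharded router logits, whereas softmax normalizes over the full expert dimension at once. Function and argument names here are illustrative, not torchtitan's router API:

```python
import torch

def route(logits: torch.Tensor, top_k: int, use_sigmoid: bool = True):
    # logits: [num_tokens, num_experts] raw router outputs
    if use_sigmoid:
        scores = torch.sigmoid(logits)  # per-expert; no cross-expert reduction
    else:
        scores = torch.softmax(logits, dim=-1)  # needs the full expert dimension
    top_scores, top_experts = scores.topk(top_k, dim=-1)
    return top_scores, top_experts
```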

tianyu-l mentioned this pull request Dec 12, 2024
tianyu-l added a commit that referenced this pull request Dec 12, 2024
ghstack-source-id: 5e173f3
Pull Request resolved: #731
facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) Dec 12, 2024
tianyu-l marked this pull request as draft December 12, 2024 04:09