feat: add trtllm all-reduce (non-MoE)#1096

Merged
yzh119 merged 40 commits into flashinfer-ai:main from yzh119:trtllm-comm on Jun 2, 2025

Conversation

@yyihuang (Collaborator) commented on May 28, 2025

📌 Description

We add the TensorRT-LLM custom all-reduce kernels to the FlashInfer comm module.
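As background for reviewers unfamiliar with the collective: the semantics an all-reduce must provide are that every rank ends up holding the same reduced tensor. The sketch below is a pure-Python, host-side illustration of those semantics only; it is not FlashInfer's API (the actual kernels run on GPU over IPC buffers), and the function name is hypothetical.

```python
# Illustrative sketch only: demonstrates all-reduce (sum) semantics on the
# host. The real trtllm kernels reduce GPU tensors across ranks via IPC
# buffers; `all_reduce_sum` here is a hypothetical name for illustration.

def all_reduce_sum(rank_tensors):
    """Simulate an all-reduce over len(rank_tensors) ranks.

    rank_tensors: one equal-length list of floats per rank.
    Returns the list each rank would hold after the collective.
    """
    # Element-wise sum across ranks.
    reduced = [sum(vals) for vals in zip(*rank_tensors)]
    # After the collective, every rank observes the same result.
    return [list(reduced) for _ in rank_tensors]

ranks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = all_reduce_sum(ranks)
# every rank now holds the element-wise sum [9.0, 12.0]
```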

🔍 Related Issues

This work is split across multiple PRs; see #1061.
The MoE kernels are also in progress.

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

@Edenzzzz (Contributor)

Thanks for supporting this! I wonder what the major performance benefits are compared to NCCL, and whether there are any benchmark numbers?

Comment thread flashinfer/comm.py Outdated

# NOTE(Yingyi): customAllReduce and allReduceFusion require different buffer sizes,
# since the allReduceFusion kernels are an improved implementation
# TODO(review): align the IPC buffer size with the tllm implementation
Collaborator
I can confirm the values are aligned.
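To make the buffer-size discussion above concrete, here is a hedged host-side sketch of how an IPC workspace size might be computed from a data region plus a barrier-flag region. The formula, parameter names, and the `ipc_buffer_bytes` helper are assumptions for illustration; the authoritative sizing lives in the TensorRT-LLM sources, and only the flag count 256 comes from this thread.

```python
# Hypothetical sketch: the exact sizing must be taken from the TensorRT-LLM
# implementation. Only BARRIER_FLAG_COUNT = 256 appears in this PR; the
# formula below is an assumption for illustration.

BARRIER_FLAG_COUNT = 256

def ipc_buffer_bytes(max_token_num, hidden_dim, world_size, dtype_bytes=2):
    # Data region: one staging slot per rank for the largest message
    # (max_token_num x hidden_dim elements of dtype_bytes each).
    data = max_token_num * hidden_dim * dtype_bytes * world_size
    # Flag region: per-rank barrier flags, assumed 4-byte ints here.
    flags = BARRIER_FLAG_COUNT * world_size * 4
    return data + flags
```

For example, 1024 tokens of hidden size 4096 in fp16 across 8 ranks would need 64 MiB of data plus 8 KiB of flags under these assumptions.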

Comment thread flashinfer/comm.py


# todo(review): align the ipc buffer size to tllm implementation
BarrierFlagCount = 256
Collaborator
Better to confirm with trtllm team.

@yzh119 (Collaborator) left a comment

Overall LGTM; let's move on to the following comm ops (MoE).

@yzh119 yzh119 merged commit 1600e3e into flashinfer-ai:main Jun 2, 2025
2 checks passed
@yzh119 mentioned this pull request on Jun 2, 2025
yyihuang added a commit that referenced this pull request Jun 4, 2025
…#1111)


## 📌 Description

The Lamport buffer should be initialized to neg_zero only once, right after
creating a new workspace. There is no need to re-initialize it between
all-reduces.
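The init-once invariant works because the buffer is filled with a sentinel that valid written data never takes. A common choice for Lamport-style buffers, and the one suggested by "neg_zero" here, is negative zero: it compares equal to 0.0, so it must be detected via the sign bit. The sketch below illustrates that detection trick on the host; treat the specifics of how the kernels use it as an assumption.

```python
# Illustrative sketch of the neg_zero sentinel trick. -0.0 == 0.0 in IEEE 754,
# so membership must be tested via the sign bit (math.copysign), not ==.
import math

NEG_ZERO = -0.0

def is_unwritten(x):
    # True only for -0.0: value compares equal to zero AND sign bit is set.
    return x == 0.0 and math.copysign(1.0, x) < 0

# Workspace creation: initialize the Lamport buffer once with the sentinel.
buf = [NEG_ZERO] * 4

buf[1] = 0.0   # a rank legitimately writes +0.0 ...
buf[2] = 1.5   # ... or any other value
# Slot 0 is still unwritten; slots 1 and 2 are written, even the +0.0 one.
```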

## 🔍 Related Issues

related PRs: #1096 

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

## Reviewer Notes

#define FLASHINFER_LOGGING_H_

#include <spdlog/sinks/stdout_color_sinks.h>
#include <spdlog/spdlog.h>
Collaborator
It seems you introduced a new dependency here? Is this documented anywhere? (This breaks the build on my machine, and `pip install spdlog` didn't fix it.)

Collaborator
It seems a submodule was added later, which may solve this.
