feat: port fast_decode_plan from sgl #1745

Merged
yzh119 merged 15 commits into flashinfer-ai:main from yyihuang:fix_fast_plan
Sep 22, 2025
@zihaoye
Contributor

@zihaoye zihaoye commented Sep 21, 2025

📌 Description

🔍 Related Issues

#1720

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Co-authored-by: Yingyi Huang <yingyihuang2000@outlook.com>
@yyihuang yyihuang self-requested a review September 21, 2025 04:54
@yyihuang yyihuang self-assigned this Sep 21, 2025
@yzh119
Collaborator

yzh119 commented Sep 21, 2025

Same here, cool to meet you :)

@yyihuang yyihuang marked this pull request as ready for review September 22, 2025 03:13
@yyihuang yyihuang requested a review from yzh119 September 22, 2025 03:14
Comment thread flashinfer/decode.py
disable_split_kv: bool = False,
) -> None:
"""
A faster version of BatchDecodeWithPagedKVCacheWrapper::plan used for FlashInferMultiStepDraftBackend.
Collaborator

It will no longer be required once we move to GPU-based planning.

@yzh119 yzh119 merged commit 175fc73 into flashinfer-ai:main Sep 22, 2025
2 checks passed
yzh119 pushed a commit that referenced this pull request Sep 24, 2025
… list (#1757)

## 📌 Description

fix #1745

---------

Co-authored-by: Zihao Ye <yezihhhao@gmail.com>
Co-authored-by: Zihao Ye <98052487+zihaoye@users.noreply.github.com>