Optimize schedule by hnyls2002 · Pull Request #1339 · sgl-project/sglang

hnyls2002 · 2024-09-05T18:16:47Z

Motivation

Modifications

Checklist

Format your code according to the Contributor Guide.
Add unit tests as outlined in the Contributor Guide.
Update documentation as needed, including docstrings or example tutorials.

hxer7963 · 2024-09-21T06:06:55Z

Hi @hnyls2002 @merrymercy,

I have been exploring the source code of the PrefillAdder class and the scheduler module within ModelTpServer::get_new_prefill_batch. It seems that the implementation reserves the maximum possible output token slots based on the estimated new_token_ratio before scheduling prefill requests.

However, I am curious about the motivation behind the scheduling strategy used by PrefillAdder and how it contributes to optimizing scheduling performance.

Could you provide some insights into these aspects?
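
For readers following the question above, here is a minimal sketch of the kind of admission check being described: before a waiting request is added to a prefill batch, slots are set aside for the output tokens that requests are expected to generate, scaled by new_token_ratio. This is an illustration under stated assumptions, not the code in sglang. The names Request, SimplePrefillAdder, reserve_running, and try_add are hypothetical, and it assumes a single flat pool of KV-cache token slots with no prefix sharing.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical request record; the field names are illustrative only."""
    rid: str
    prompt_tokens: int    # tokens that must be prefilled
    max_new_tokens: int   # upper bound on tokens the request may decode


@dataclass
class SimplePrefillAdder:
    """Toy admission controller; NOT sglang's PrefillAdder."""
    free_tokens: int          # token slots currently available (assumed flat pool)
    new_token_ratio: float    # estimated fraction of max_new_tokens actually used
    admitted: list = field(default_factory=list)

    def _estimated_output(self, req: Request) -> int:
        # Expected number of decode tokens, scaled by the ratio estimate.
        return int(req.max_new_tokens * self.new_token_ratio)

    def reserve_running(self, running: list) -> None:
        # Set aside slots that already-running requests are expected to need.
        for req in running:
            self.free_tokens -= self._estimated_output(req)

    def try_add(self, req: Request) -> bool:
        # A new request needs its prompt plus its estimated future output.
        need = req.prompt_tokens + self._estimated_output(req)
        if need > self.free_tokens:
            return False
        self.free_tokens -= need
        self.admitted.append(req)
        return True
```

In this sketch, a lower new_token_ratio admits more prefill requests per batch but raises the chance of running out of slots during decoding, while a ratio of 1.0 reserves the full max_new_tokens for every request. How the real scheduler estimates and adjusts the ratio is not described in this thread.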

hnyls2002 added 5 commits September 5, 2024 16:18

update

6ccea63

update

8a35b55

reduce overhead

a9c5f0a

reduce overhead

a3ce927

fix inflight

37bf108

merrymercy merged commit ab4a83b into main Sep 5, 2024

merrymercy deleted the optimize-schedule branch September 5, 2024 21:30

merrymercy mentioned this pull request Sep 13, 2024

Development Roadmap (2024 Q3) #634

Closed

29 tasks

timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025

Optimize schedule (sgl-project#1339)

568eea5

merrymercy merged 5 commits into main from optimize-schedule
