sglang: add fp8 8k1k and fp4 1k1k #274
Merged
cquil11
merged 4 commits into
InferenceMAX:multinode-integration
from
ishandhanani:ishan/morecfgs
Dec 4, 2025
Conversation
cquil11
approved these changes
Dec 4, 2025
cquil11
added a commit
that referenced
this pull request
Dec 4, 2025
Collaborator
@ishandhanani I think I merged prematurely. can you pls open another PR on
Contributor
Author
PR isn't ready yet. That dynamo branch will be created shortly.
cquil11
added a commit
that referenced
this pull request
Dec 5, 2025
* adding initial changes to master configs; adding initial updates to validation logic and config parser
* adding new gb200 script
* adding integration to gb200 runner script and workflow files
* revert and correct name of 1k1k scheduler workflow
* adding runners.yaml to workflow invocation
* toJson on conc since it is now a list
* correctly sending conc list to multnode
* hotfix
* correct env var to MAX batch size
* set -x
* debugging with dynmao fork
* debugging with dynmao fork pt 2
* experiment
* adding separate script for launching
* changing filenames
* ntasks per node
* making the spec-decoding output required
* updating ntasks per node gp
* test
* test
* conc list quoted
* get rid of debug code
* testing support for dsr1
* testing support for dsr1 test
* testing support for dsr1 test
* testing support for dsr1 test
* testing support for dsr1 test
* testing
* some changes to generate sweeps
* testing and debugging
* adding new file code for sglang
* adding new file code for sglang
* changing file path
* updating multinode fn hash
* updating multinode fn hash
* dynamo trtllm to dynamo trt
* changing process result
* add is multinode
* bug fix
* bug fix
* bug fix
* bug fix
* polishing
* polishing pt 2
* polishing pt 3
* polishing pt 4
* fixing summarize.py
* polishing
* testing
* testing
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding testing workflows
* adding tests
* adding tests
* adding tests
* adding tests
* adding tests
* adding tests
* adding tests
* add updates for newest gb200 merge
* add updates for newest gb200 merge pt 2
* move ntasks per node to framework level instead of runner level
* nexp hard coded to 1:
* add AMD configs to full sweep
* shut the line counter workflow up haha
* shut the line counter workflow up haha
* shut the line counter workflow up haha pt 2
* updating testing logic
* add model prefix to label validator
* add more descriptive name to tests
* update test for process results
* add script mode
* fix bug
* sglang: add fp8 8k1k and fp4 1k1k (#274)
* go
* typo
* typo...
* more
* Revert "sglang: add fp8 8k1k and fp4 1k1k (#274)" (#283) This reverts commit efcb4e4.
* get rid of ntasks per node required env var for sglang
* bug fix
* bug fix missing amd
* bug fix missing amd pt 2
* add served model name to summary
* add served model name to summary pt 2
* add served model name to summary pt 3
* fix max model len bug
* add readme
* add image to json result
---------
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
We now use sglang 0.5.5.post2 across the board.