Conversation

Collaborator

cquil11 commented Dec 4, 2025

Reverts #274

cquil11 requested a review from a team as a code owner December 4, 2025 16:25
cquil11 merged commit 21ec133 into multinode-integration Dec 4, 2025
cquil11 deleted the revert-274-ishan/morecfgs branch December 4, 2025 16:25
cquil11 added a commit that referenced this pull request Dec 5, 2025
* adding initial changes to master configs; adding initial updates to validation logic and config parser

* adding new gb200 script

* adding integration to gb200 runner script and workflow files

* revert and correct name of 1k1k scheduler workflow

* adding runners.yaml to workflow invocation

* toJson on conc since it is now a list

* correctly sending conc list to multinode

* hotfix

* correct env var to MAX batch size

* set -x

* debugging with dynamo fork

* debugging with dynamo fork pt 2

* experiment

* adding separate script for launching

* changing filenames

* ntasks per node

* making the spec-decoding output required

* updating ntasks per node

* test

* test

* conc list quoted

* get rid of debug code

* testing support for dsr1

* testing support for dsr1 test

* testing support for dsr1 test

* testing support for dsr1 test

* testing support for dsr1 test

* testing

* some changes to generate sweeps

* testing and debugging

* adding new file code for sglang

* adding new file code for sglang

* changing file path

* updating multinode fn hash

* updating multinode fn hash

* dynamo trtllm to dynamo trt

* changing process result

* add is multinode

* bug fix

* bug fix

* bug fix

* bug fix

* polishing

* polishing pt 2

* polishing pt 3

* polishing pt 4

* fixing summarize.py

* polishing

* testing

* testing

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding testing workflows

* adding tests

* adding tests

* adding tests

* adding tests

* adding tests

* adding tests

* adding tests

* add updates for newest gb200 merge

* add updates for newest gb200 merge pt 2

* move ntasks per node to framework level instead of runner level

* nexp hard coded to 1

* add AMD configs to full sweep

* shut the line counter workflow up haha

* shut the line counter workflow up haha

* shut the line counter workflow up haha pt 2

* updating testing logic

* add model prefix to label validator

* add more descriptive name to tests

* update test for process results

* add script mode

* fix bug

* sglang: add fp8 8k1k and fp4 1k1k (#274)

* go

* typo

* typo...

* more

* Revert "sglang: add fp8 8k1k and fp4 1k1k (#274)" (#283)

This reverts commit efcb4e4.

* get rid of ntasks per node required env var for sglang

* bug fix

* bug fix missing amd

* bug fix missing amd pt 2

* add served model name to summary

* add served model name to summary pt 2

* add served model name to summary pt 3

* fix max model len bug

* add readme

* add image to json result

---------

Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>

2 participants