Conversation

@ankursingh-nv
Collaborator

@ankursingh-nv ankursingh-nv commented Nov 26, 2025

@ankursingh-nv ankursingh-nv marked this pull request as ready for review November 27, 2025 01:34
@ankursingh-nv ankursingh-nv requested a review from a team as a code owner November 27, 2025 01:34
Collaborator

@cquil11 cquil11 left a comment

lgtm

@cquil11

cquil11 commented Dec 15, 2025

@ankursingh-nv

Reminder:

PR 267 has been merged. With this change, sweeps no longer run nightly; they run only when necessary, as indicated by the perf-changelog.yaml file at the root of the repo. Going forward, when developers change configs in a way that affects performance, they must record the change in perf-changelog.yaml with a brief description. Once their PR is ready for review, they can add the sweep-enabled label to trigger a test sweep on their own branch. Once everything looks good, they can merge to main, and an official sweep will run for the specified configs.

So for this PR, you will add something like the following entry to the bottom of perf-changelog.yaml:

- config-keys:
    - gptoss-fp4-b200-trt
  description: |
    - Add benchmark script for GPTOSS FP4 B200 TRT-LLM
    PR: https://github.com/InferenceMAX/InferenceMAX/pull/256

Then add the sweep-enabled label to the PR after marking it ready for review to run a test sweep. After the test sweep finishes, please link the run in your PR description.
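For readers unfamiliar with the changelog-driven flow, here is a minimal sketch of how an entry like the one above could be checked and turned into a list of configs to sweep. It is not the repo's `process_changelog.py`; the function names `validate_entry` and `configs_to_sweep` are hypothetical, and the entry is assumed to have already been loaded from YAML into a plain dict.

```python
from typing import Any

# Assumption: these are the only two fields an entry needs, based on the
# example entry in the comment above. The real schema may differ.
REQUIRED_FIELDS = ("config-keys", "description")


def validate_entry(entry: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one changelog entry (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in entry:
            problems.append(f"missing field: {field}")

    if "config-keys" in entry:
        keys = entry["config-keys"]
        if not isinstance(keys, list) or not keys:
            problems.append("config-keys must be a non-empty list")
        elif not all(isinstance(k, str) and k for k in keys):
            problems.append("config-keys must contain non-empty strings")

    if "description" in entry:
        desc = entry["description"]
        if not isinstance(desc, str) or not desc.strip():
            problems.append("description must be a non-empty string")

    return problems


def configs_to_sweep(entries: list[dict[str, Any]]) -> list[str]:
    """Collect unique config keys from valid entries, in first-seen order."""
    seen: dict[str, None] = {}
    for index, entry in enumerate(entries):
        problems = validate_entry(entry)
        if problems:
            raise ValueError(f"entry {index} is invalid: {'; '.join(problems)}")
        for key in entry["config-keys"]:
            seen.setdefault(key, None)
    return list(seen)


# The entry from the comment above, as a YAML loader would represent it.
example_entry = {
    "config-keys": ["gptoss-fp4-b200-trt"],
    "description": (
        "- Add benchmark script for GPTOSS FP4 B200 TRT-LLM\n"
        "PR: https://github.com/InferenceMAX/InferenceMAX/pull/256\n"
    ),
}

if __name__ == "__main__":
    print(configs_to_sweep([example_entry]))
```

Raising on the first invalid entry mirrors the intent of failing loudly when the changelog is malformed, rather than silently skipping a sweep.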


@cquil11 cquil11 left a comment

see comment about perf-changelog.yaml


@cquil11 cquil11 left a comment

I added perf changelog so lgtm

@cquil11

cquil11 commented Dec 17, 2025

@ankursingh-nv where are we on this? I added the perf changelog and kicked off a test run here:
https://github.com/InferenceMAX/InferenceMAX/actions/runs/20286968709

@cquil11 cquil11 merged commit 156fef3 into main Dec 17, 2025
70 of 71 checks passed
@cquil11 cquil11 deleted the gptoss-trt-docker branch December 17, 2025 15:55
@github-project-automation github-project-automation bot moved this from In Progress to Done in InferenceMAX Board Dec 17, 2025
Oseltamivir pushed a commit that referenced this pull request Dec 17, 2025
* Add benchmark script for GPTOSS FP4 B200 TRT-LLM

* make changes to perf changelog

---------

Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
cquil11 added a commit that referenced this pull request Dec 17, 2025
* Initial commit, for #304

* Allow testing on own PR

* condense workflow

* Rename Workflow

* Use environments

* Changed environment location

* Stricter activation

* Test replies

* Test replies

* Use token for comment perm

* Forgot validation

* feat: performance changelog triggered runs (as opposed to nightly) (#267) [skip-sweep]

* add logic for event driven runs

new single workflow that runs on merge to main, new perf-changelog.yaml to track performance changes, new logic to parse changelog, removed cron job in full sweep schedulers

* testing pt 1

* raise error if yaml diff in perf changelog is not valid

* remove unused imports in process_changelog.py

* config data key fix

* raise error if test-config subprocess fails to run

* backfill changelog

* backfill changelog pt 2

* backfill changelog pt 3

* backfill changelog pt 4

* backfill changelog pt 5

* backfill changelog pt 6

* add always() condition to upload changelog metadata

* backfill changelog pt 7 (test)

* backfill changelog pt 8 (revert test)

* backfill changelog pt 9

* backfill changelog pt 11

* change if condition for jobs in run sweep workflow

* debugging run sweep workflow

* debugging run sweep workflow pt 2

* debugging run sweep workflow pt 3 (revert)

* debugging run sweep workflow pt 4

* debugging run sweep workflow pt 5

* debugging run sweep workflow pt 6

* debugging run sweep workflow pt 7

* add always() condition to upload changelog metadata (add back, this got removed)

* add bmk prefix to results

* backfill changelog official

* for concurrency group, use more unique sha

* chore(deps): bump the github-actions group across 1 directory with 3 updates (#331)

Bumps the github-actions group with 3 updates in the / directory: [actions/checkout](https://github.com/actions/checkout), [actions/upload-artifact](https://github.com/actions/upload-artifact) and [actions/download-artifact](https://github.com/actions/download-artifact).


Updates `actions/checkout` from 6.0.0 to 6.0.1
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v6...8e8c483)

Updates `actions/upload-artifact` from 5.0.0 to 6.0.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@330a01c...b7c566a)

Updates `actions/download-artifact` from 6.0.0 to 7.0.0
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@018cc2c...37930b1)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: actions/upload-artifact
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/download-artifact
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: add final newline to original perf-changelog.yaml so that there won't be erroneous negative diff [skip-sweep] (#333)

* Update MI355x Deepseek-R1 FP4 SGLang Image to v0.5.6.post1 (#330)

* Update amd-master.yaml

* Update perf-changelog.yaml

* Update dsr1_fp4_mi355x_docker.sh

* Update dsr1_fp4_mi355x_docker.sh

---------

Co-authored-by: Cameron Quilici <cjquilici@gmail.com>

* TOCTOU

* Test new env

* Ready for merge

* Add benchmark script for GPTOSS FP4 B200 TRT-LLM (#256)

* Add benchmark script for GPTOSS FP4 B200 TRT-LLM

* make changes to perf changelog

---------

Co-authored-by: Cameron Quilici <cjquilici@gmail.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: ppalanga <ppalanga@amd.com>
Co-authored-by: Ankur Singh <ankusingh@nvidia.com>
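The commit list above includes a fix that adds a final newline to perf-changelog.yaml to avoid an "erroneous negative diff". The sketch below illustrates the likely mechanism with Python's standard `difflib`: when a file lacks a trailing newline, appending an entry also rewrites the old last line, so a line-based diff reports a removal even though nothing was deleted. The helper name `removed_lines` and the sample YAML text are hypothetical, and this is not the repo's actual diff logic.

```python
import difflib


def removed_lines(old_text: str, new_text: str) -> list[str]:
    """Return the '-' lines of a unified diff between two file contents."""
    diff = list(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            lineterm="",
        )
    )
    # The first two lines of a non-empty unified diff are the '---' and '+++'
    # file headers; skip them so they are not mistaken for removals.
    body = diff[2:]
    return [line.rstrip("\n") for line in body if line.startswith("-")]


# Hypothetical changelog content that does NOT end with a newline.
old_no_newline = (
    "- config-keys:\n"
    "    - existing-config\n"
    "  description: old entry"
)
# The same content with a proper final newline.
old_with_newline = old_no_newline + "\n"

appended_entry = (
    "- config-keys:\n"
    "    - new-config\n"
    "  description: new entry\n"
)

if __name__ == "__main__":
    # Appending to the file without a final newline touches its last line.
    print(removed_lines(old_no_newline, old_no_newline + "\n" + appended_entry))
    # Appending to the file with a final newline yields additions only.
    print(removed_lines(old_with_newline, old_with_newline + appended_entry))
```

A parser that treats any removed line as an invalid changelog edit would reject the first case, which is presumably why the final newline was added.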