Update MI355x Deepseek-R1 FP4 SGLang Image to v0.5.6.post1 #330
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Please put a link to a successful test run in your PR description.
Reminder:
So for this PR, you will add something like the following entry to the bottom of
- config-keys:
    - dsr1-fp4-mi355x-sglang
  description: |
    - Updating MI355x Deepseek-R1 FP4 SGLang Image to upstream v0.5.6.post1
  PR: https://github.com/InferenceMAX/InferenceMAX/pull/330
Then add the
cquil11
left a comment
lgtm
* Update amd-master.yaml
* Update perf-changelog.yaml
* Update dsr1_fp4_mi355x_docker.sh
* Update dsr1_fp4_mi355x_docker.sh
---------
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
* Initial commit, for #304
* Allow testing on own PR
* condense workflow
* Rename Workflow
* Use environments
* Changed environment location
* Stricter activation
* Test replies
* Test replies
* Use token for comment perm
* Forgot validation
* feat: performance changelog triggered runs (as opposed to nightly) (#267) [skip-sweep]
* add logic for event driven runs new single workflow that runs on merge to main, new perg-changelog.yaml to track performance changes, new logic to parse changelog, removed cron job in full sweep schedulers
* testing pt 1
* raise error if yaml diff in perf changelog is not valid
* remove unused imports in process_changelog.py
* config data key fix
* raise error if test-config subprocess fails to run
* backfill changelog
* backfill changelog pt 2
* backfill changelog pt 3
* backfill changelog pt 4
* backfill changelog pt 5
* backfill changelog pt 6
* add always() condition to upload changelog metadata
* backfill changelog pt 7 (test)
* backfill changelog pt 8 (revert test)
* backfill changelog pt 9
* backfill changelog pt 11
* change if condition for jobs in run sweep workflow
* debugging run sweep workflow
* debugging run sweep workflow pt 2
* debugging run sweep workflow pt 3 (revert)
* debugging run sweep workflow pt 4
* debugging run sweep workflow pt 5
* debugging run sweep workflow pt 6
* debugging run sweep workflow pt 7
* add always() condition to upload changelog metadata (add back, this got removed)
* add bmk prefix to results
* backfill changelog official
* for concurrency group, use more unique sha
* chore(deps): bump the github-actions group across 1 directory with 3 updates (#331)
Bumps the github-actions group with 3 updates in the / directory: [actions/checkout](https://github.com/actions/checkout), [actions/upload-artifact](https://github.com/actions/upload-artifact) and [actions/download-artifact](https://github.com/actions/download-artifact).
Updates `actions/checkout` from 6.0.0 to 6.0.1
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v6...8e8c483)
Updates `actions/upload-artifact` from 5.0.0 to 6.0.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@330a01c...b7c566a)
Updates `actions/download-artifact` from 6.0.0 to 7.0.0
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@018cc2c...37930b1)
---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: actions/upload-artifact
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/download-artifact
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix: add final newline to original perf-changelog.yaml so that there wont be erroneous negative diff [skip-sweep] (#333)
* Update MI355x Deepseek-R1 FP4 SGLang Image to v0.5.6.post1 (#330)
* Update amd-master.yaml
* Update perf-changelog.yaml
* Update dsr1_fp4_mi355x_docker.sh
* Update dsr1_fp4_mi355x_docker.sh
---------
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
* TOCTOU
* Test new env
* Ready for merge
* Add benchmark script for GPTOSS FP4 B200 TRT-LLM (#256)
* Add benchmark script for GPTOSS FP4 B200 TRT-LLM
* make changes to perf changelog
---------
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: ppalanga <ppalanga@amd.com>
Co-authored-by: Ankur Singh <ankusingh@nvidia.com>
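The "add final newline ... erroneous negative diff" fix mentioned in the commit list above can be illustrated with a small sketch. It uses Python's standard `difflib` rather than the repository's own changelog parser (which is not shown in this thread), and the helper name `added_and_removed` plus the sample entries are hypothetical. The point it tries to show: when a file lacks a trailing newline, appending a new entry also rewrites the previous last line, so a line-based diff reports one removed line even though nothing was deleted.

```python
import difflib
from itertools import islice


def added_and_removed(old_text: str, new_text: str):
    """Return (added, removed) lines from a unified diff of two texts."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        lineterm="",
    )
    added, removed = [], []
    # The first two yielded lines are the '---' / '+++' file headers; skip them.
    for line in islice(diff, 2, None):
        if line.startswith("+"):
            added.append(line[1:])
        elif line.startswith("-"):
            removed.append(line[1:])
    return added, removed


# Hypothetical changelog contents, for illustration only.
NEW_ENTRY = "- config-keys:\n    - example-key\n  description: new entry\n"
OLD_NO_NEWLINE = "- config-keys:\n    - older-key\n  description: old entry"
OLD_WITH_NEWLINE = OLD_NO_NEWLINE + "\n"

# File ends with a newline: appending yields additions only.
clean_added, clean_removed = added_and_removed(
    OLD_WITH_NEWLINE, OLD_WITH_NEWLINE + NEW_ENTRY
)

# File lacks a final newline: the old last line is reported as removed
# and re-added, which is the spurious "negative diff".
noisy_added, noisy_removed = added_and_removed(
    OLD_NO_NEWLINE, OLD_NO_NEWLINE + "\n" + NEW_ENTRY
)

print(len(clean_removed), len(noisy_removed))
```

A check that rejects any removed lines in the changelog diff would pass in the first case and fail in the second, which is presumably why the trailing newline was added.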
Updating the SGLang docker image to the SGLang community docker image
Image=lmsysorg/sglang:v0.5.6.post1-rocm700-mi35x
Impacted configurations: MI355x Deepseek-R1 FP4
Link to the runs: https://github.com/InferenceMAX/InferenceMAX/actions/runs/20248091324?pr=330