
Update torch to 2.10.0#18862

Closed
Fridge003 wants to merge 13 commits into main from upd-torch

Conversation

@Fridge003
Collaborator

Motivation

#18066

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@github-actions github-actions Bot added the dependencies (Pull requests that update a dependency file) and sgl-kernel labels Feb 15, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @Fridge003, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request primarily focuses on upgrading the core PyTorch dependency to version 2.10.0. This upgrade allows for the removal of specific compatibility workarounds and checks that were necessary for the older 2.9.1 version, particularly concerning CuDNN interactions and nn.Conv3d performance. The changes streamline the dependency management and build processes by eliminating outdated conditional logic.

Highlights

  • PyTorch Version Upgrade: Updated the torch and torchaudio dependencies from version 2.9.1 to 2.10.0 across the project's configuration and build scripts.
  • Removed CuDNN Compatibility Check: Eliminated the specific compatibility check and warning mechanism for torch 2.9.1 and CuDNN versions older than 9.15, as the upgrade to torch 2.10.0 resolves the underlying nn.Conv3d performance issues.
  • CI/CD Script Updates: Modified CI/CD installation scripts to first uninstall torch before installing the new version, ensuring a clean upgrade, and removed the now-obsolete CuDNN installation logic.


Changelog
  • python/pyproject.toml
    • Updated torch dependency from 2.9.1 to 2.10.0.
    • Updated torchaudio dependency from 2.9.1 to 2.10.0.
  • python/sglang/srt/server_args.py
    • Removed the check_torch_2_9_1_cudnn_compatibility method call.
    • Deleted the check_torch_2_9_1_cudnn_compatibility method entirely, which handled warnings and errors for torch 2.9.1 and older CuDNN versions.
  • scripts/ci/cuda/ci_install_dependency.sh
    • Added an uninstall torch command before installation to ensure a clean update.
    • Removed the conditional installation logic for nvidia-cudnn-cu12 that was specific to torch 2.9.1 compatibility.
  • sgl-kernel/Dockerfile
    • Updated the TORCH_VER variable to 2.10.0 for all CUDA versions in the Docker build process.
Activity
  • No specific human activity (comments, reviews, or progress updates) has been recorded for this pull request yet.

@Fridge003 Fridge003 mentioned this pull request Feb 15, 2026
5 tasks
@Fridge003
Collaborator Author

Fridge003 commented Feb 15, 2026

/tag-and-rerun-ci again/

@Fridge003 Fridge003 marked this pull request as ready for review February 15, 2026 15:34
@Fridge003
Collaborator Author

/rerun-failed-ci

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request successfully updates the project's PyTorch dependency to version 2.10.0. The changes are well-contained, including updates to pyproject.toml, removal of compatibility code for the previous torch version, and adjustments to CI/Docker configurations. The modifications appear correct and align with the PR's goal.

For broader project consistency, it might be worth checking other configuration files not included in this PR, such as python/pyproject_cpu.toml and python/pyproject_xpu.toml, which still reference older torch versions. Additionally, the sgl-kernel/README.md should be updated to reflect the new torch version requirement. These are outside the scope of the current changes but would be good follow-up actions.

@SoluMilken
Contributor

Thanks for updating torch to 2.10!

I ran a quick scan and found a few files that might also need updates:

  1. python/pyproject.toml
  2. python/pyproject_cpu.toml
  3. python/pyproject_xpu.toml
  4. sgl-kernel/pyproject_cpu.toml
  5. sgl-kernel/README.md
  6. docker/Dockerfile
  7. docker/xpu.Dockerfile
  8. docker/rocm720.Dockerfile
  9. scripts/ci/cuda/ci_install_dependency.sh
  10. docs/platforms/xpu.md
  11. python/sglang/srt/server_args.py

I'm new to the project, so please let me know if I missed anything or misunderstood something! 🛐
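A scan like the one described in this comment could be sketched roughly as follows. This is a minimal illustration, not the tool the commenter actually used: the regex, the `find_old_pins` and `scan_repo` helper names, and the file-suffix filter are all assumptions, and the pattern only catches `==` pins (it would miss, for example, a bare `TORCH_VER` assignment in a Dockerfile).

```python
import re
from pathlib import Path

# Hypothetical pattern: matches pins such as "torch==2.9.1" or "torchaudio==2.9.1".
OLD_PIN = re.compile(r"\b(torch(?:audio|vision)?)==2\.9\.1\b")


def find_old_pins(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that still pin the old version."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(text.splitlines(), start=1)
        if OLD_PIN.search(line)
    ]


def scan_repo(root: str, suffixes=(".toml", ".sh", ".md", "Dockerfile")) -> dict:
    """Walk a checkout and collect leftover pins per file (suffix list is an assumption)."""
    hits = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name.endswith(suffixes):
            found = find_old_pins(path.read_text(errors="ignore"))
            if found:
                hits[str(path)] = found
    return hits
```

Running `scan_repo(".")` from the repository root would list candidate files; each hit still needs a manual look, since some platform-specific files may pin a different torch version on purpose.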


@FlamingoPg
Collaborator

FlamingoPg commented Feb 15, 2026

/rerun-failed-ci again again

… 2.10 compatibility

When PyTorch is upgraded but sgl-kernel source is not changed, the CI
needs to rebuild sgl-kernel locally to ensure ABI compatibility.

Changes:
- Add sgl-kernel-build-wheels as dependency
- Download pre-built wheel when sgl-kernel/** is changed
- Build sgl-kernel locally with PyTorch 2.10 when sgl-kernel/** is not changed
- Always use CUSTOM_BUILD_SGL_KERNEL=true to avoid PyPI version
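The branching described in this commit message can be sketched as a small decision function. This is only an illustration of the stated rules, not the actual workflow code: the function name `sgl_kernel_plan` and the returned labels are hypothetical, and the real logic lives in the CI workflow definition rather than in Python.

```python
def sgl_kernel_plan(changed_files: list[str]) -> dict[str, str]:
    """Pick how CI obtains sgl-kernel, following the rules in the commit message."""
    # "sgl-kernel/**" in the commit message: any changed path under that directory.
    kernel_changed = any(path.startswith("sgl-kernel/") for path in changed_files)
    return {
        # Changed sources already produce a wheel upstream in the pipeline;
        # unchanged sources must be rebuilt against the new PyTorch for ABI compatibility.
        "wheel_source": "download-prebuilt" if kernel_changed else "build-locally",
        # Always set, so the PyPI release of sgl-kernel is never picked up.
        "CUSTOM_BUILD_SGL_KERNEL": "true",
    }
```

For a PR like this one, which touches only dependency pins, the sketch would select the local build path.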
@FlamingoPg FlamingoPg force-pushed the upd-torch branch 2 times, most recently from 42ca214 to 29d40ba Compare February 16, 2026 12:16
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Feb 16, 2026
@FlamingoPg FlamingoPg force-pushed the upd-torch branch 2 times, most recently from f5c95f7 to 040f652 Compare February 16, 2026 12:34
@github-actions github-actions Bot added the Multi-modal multi-modal language model label Feb 16, 2026
@FlamingoPg FlamingoPg force-pushed the upd-torch branch 2 times, most recently from 94db0d2 to 8ed7c8f Compare February 16, 2026 21:11
@FlamingoPg
Collaborator

FlamingoPg commented Feb 16, 2026

Waiting for PR #18903 to merge first, which properly fixes the JIT kernel crash on AMD/ROCm by adding forward_hip() routing to forward_native().

Once #18903 is merged, we can revert the AMD-specific JIT kernel guards in this PR to keep the changes focused on the PyTorch 2.10.0 upgrade.

@FlamingoPg FlamingoPg force-pushed the upd-torch branch 2 times, most recently from 23c8095 to 1f8e8b0 Compare February 16, 2026 21:19
@FlamingoPg
Collaborator

Note: PyTorch 2.10.0 still requires cuDNN >= 9.15 for Conv3D operations to avoid performance regression. The runtime check was removed but CI enforces nvidia-cudnn-cu12==9.16.0.29 installation.

@FlamingoPg FlamingoPg force-pushed the upd-torch branch 2 times, most recently from 3f14c77 to 648b4a8 Compare February 17, 2026 07:08
PyTorch 2.10 ships with cuDNN 9.10.2.21, which has a Conv3D performance regression.
Force-install cuDNN 9.16.0.29 to fix diffusion test performance.
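The version threshold mentioned above (cuDNN >= 9.15) could be checked with a numeric comparison along these lines. This is a sketch, not the runtime check that was removed from server_args.py: the names `MIN_CUDNN`, `parse_version`, and `cudnn_is_new_enough` are hypothetical, and the version strings are the ones quoted in this thread.

```python
# Minimum cuDNN (major, minor) taken from the note in this thread.
MIN_CUDNN = (9, 15)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '9.10.2.21' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def cudnn_is_new_enough(version: str, minimum: tuple[int, ...] = MIN_CUDNN) -> bool:
    """Compare only as many components as the minimum specifies."""
    return parse_version(version)[: len(minimum)] >= minimum


print(cudnn_is_new_enough("9.10.2.21"))  # bundled version reported in this PR
print(cudnn_is_new_enough("9.16.0.29"))  # version the CI force-installs
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "9.9" as newer than "9.15".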
@johnnynunez
Contributor

thank you @Fridge003
torch 2.10 comes with FBGemm and cutlass matmuls for Jetson AGX Thor and DGX Spark.

@Kangyan-Zhou Kangyan-Zhou requested a review from bingxche as a code owner March 7, 2026 23:32
@johnnynunez
Contributor

johnnynunez commented Mar 8, 2026

@Fridge003 what do you think about skipping 2.10.0 and jumping directly to 2.11.0 + triton 3.7.0 in ten days? For me, 2.10.0 is very buggy on some NVIDIA devices.

M3.2: Release first RC1 Binary for PyTorch Core (17/2/26) COMPLETE
M3.3: Domain libraries cut RC Branch (18/2/26) COMPLETE
M4: Release branch finalized, Announce final launch date, Feature classifications published (week of 9/3/26) - Final RC is produced.
M4.1: Tutorial drafts submission deadline (11/3/26)
M5: External-Facing Content Finalized (13/3/26)
M6: Release Day (18/3/26)

@Fridge003
Collaborator Author

Fridge003 commented Mar 11, 2026

@johnnynunez Let's go for 2.11 when this version has been stabilized and patched (maybe 2.11.1?)

@johnnynunez
Contributor

johnnynunez commented Mar 11, 2026

@johnnynunez Let's go for 2.11 when this version has been stabilized and patched (maybe 2.11.1?)

It comes out on March 18 as the official release. I ask because we had a lot of problems on new devices such as Jetson Thor, and with 2.11.0 and triton 3.7.0 everything runs smoothly.

@b8zhong
Collaborator

b8zhong commented Mar 11, 2026

@Fridge003
Collaborator Author

Directly to 2.11

@Fridge003 Fridge003 closed this Mar 16, 2026
@b8zhong b8zhong deleted the upd-torch branch March 16, 2026 23:05

Labels

amd dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation high priority Multi-modal multi-modal language model run-ci sgl-kernel


7 participants