[CI] Add cu13.0 wheel + container builds and nightly wheel releases#3069
ApostaC merged 3 commits into LMCache:dev
Code Review
This pull request updates the project to support CUDA 12.9 and 13.0, involving significant updates to the Dockerfiles, build configurations, and a major restructuring of the installation documentation. The changes include conditional Triton kernel installation and explicit Torch versioning in the build stages. Feedback identifies a shell word-splitting bug in the Dockerfile's package installation logic and an outdated comment in the pyproject.toml file.
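The Dockerfile line that the review flags is not reproduced here, so the following is only a generic sketch of how shell word splitting changes the arguments a command receives. The variable `EXTRA_PKGS` and the helper `count_args` are hypothetical names chosen for illustration, not identifiers from this PR.

```shell
#!/bin/sh
# count_args prints how many arguments it was called with.
count_args() { echo "$#"; }

# Hypothetical space-separated package list (not taken from the PR).
EXTRA_PKGS="pkg-a pkg-b pkg-c"

# Unquoted expansion: the shell splits the value on whitespace,
# so the command receives three separate arguments.
unquoted=$(count_args $EXTRA_PKGS)

# Quoted expansion: the value is passed as a single argument.
quoted=$(count_args "$EXTRA_PKGS")

echo "unquoted: $unquoted, quoted: $quoted"
```

Which form is correct depends on intent: a list of packages usually needs splitting, while a single requirement spec containing spaces must stay quoted.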
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 47cda1e.
publish.yml:
- Add build-cu130-artifacts job: builds cu13 wheel via cibuildwheel
  using pytorch/manylinux2_28-builder:cuda13.0, TORCH_CUDA_ARCH_LIST
  for sm 8.0–12.0, and PIP_EXTRA_INDEX_URL for cu130 torch
- Add publish-cu130-github-release job: creates v{tag}-cu13 GitHub
  Release with --latest=false so pip can resolve via --find-links
- publish-image: add cu13 stable Docker steps (image-release-cu13
  target); guard with needs.publish-cu130-github-release.result==success
  so cu130 failures skip cu13 images without blocking cu12.9 images;
  use needs.publish-pypi.result==success to decouple from cu130 failures
- Add download-r2.pytorch.org and pypi.nvidia.com to egress allowlist
  (torch cu130 downloads via CDN; cuda-bindings on NVIDIA PyPI)
- Add index.docker.io to allowlist for DockerHub pulls inside containers
- Make DockerHub login conditional on vars.DOCKERHUB_USERNAME for fork PRs

nightly_build.yml:
- Add nightly-wheels job: builds cu12.9 and cu13.0 wheels daily,
  publishes to separate rolling GitHub Releases (nightly / nightly-cu13);
  interleave build→publish per variant so a cu13 failure cannot block
  cu12.9 publishing; SETUPTOOLS_SCM_PRETEND_VERSION for dev versioning
- nightly-build: add cu13 container builds (vllm-openai + standalone);
  docker system/builder prune before each build to prevent disk exhaustion
- Add download-r2.pytorch.org and pypi.nvidia.com to egress allowlist
- Make DockerHub login conditional on vars.DOCKERHUB_USERNAME
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: deng451e <838677410@qq.com>
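As a rough illustration of the SETUPTOOLS_SCM_PRETEND_VERSION mechanism mentioned in the commit message above: setuptools_scm uses this environment variable in place of the version it would otherwise derive from git. The version scheme below (a placeholder base version plus a UTC date stamp) is an assumption for the sketch; the commit message does not say how the workflow composes its dev version.

```shell
#!/bin/sh
# Hypothetical dev-version scheme: placeholder base version + UTC date.
BASE_VERSION="0.0.0"                 # placeholder, not the project's version
NIGHTLY_DATE=$(date -u +%Y%m%d)      # e.g. an 8-digit YYYYMMDD stamp

# setuptools_scm reads this variable instead of inspecting git metadata.
SETUPTOOLS_SCM_PRETEND_VERSION="${BASE_VERSION}.dev${NIGHTLY_DATE}"
export SETUPTOOLS_SCM_PRETEND_VERSION

echo "$SETUPTOOLS_SCM_PRETEND_VERSION"
```

A date-stamped `.devN` suffix keeps each nightly wheel's version distinct and PEP 440 compliant, which is one plausible reason to override the version in a rolling release.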
nightly_build.yml: interleave build→publish per CUDA variant so the cu13 wheel build step runs after cu12.9 is already published; a cu13 build failure now skips only the cu13 publish, leaving cu12.9 unaffected

publish.yml: change publish-image guard from !failure() to needs.publish-pypi.result=='success' so a failing (not just skipped) publish-cu130-github-release does not block cu12.9 Docker publishing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: deng451e <838677410@qq.com>
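The interleaved build→publish ordering described for nightly_build.yml can be sketched as a plain shell loop. This is a simplified model, not the workflow itself: the real job uses separate workflow steps, and `build_wheel` and `publish_wheel` are hypothetical stand-ins. Here `build_wheel` is rigged to fail for cu13.0 to show that the cu12.9 publish still happens.

```shell
#!/bin/sh
# Stand-in for the real wheel build; fails only for cu13.0 to
# simulate a broken cu13 build (hypothetical behaviour).
build_wheel() {
  [ "$1" != "cu13.0" ]
}

# Stand-in for the real publish step; records what was published.
publish_wheel() {
  published="$published $1"
}

published=""
failed=""

# Build and publish each variant before moving on to the next, so a
# later variant's build failure cannot block an earlier publish.
for variant in cu12.9 cu13.0; do
  if build_wheel "$variant"; then
    publish_wheel "$variant"
  else
    failed="$failed $variant"
  fi
done

echo "published:$published"
echo "failed:$failed"
```

Had both builds run before any publish, a single failing build could have stopped the job before cu12.9 was released; ordering the steps per variant avoids that coupling.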
Signed-off-by: deng451e <838677410@qq.com>
ApostaC left a comment
LGTM! Excited to see the 0.4.4 release!

Summary
no version pinning
Release workflow
Note
Medium Risk
Medium risk because it expands release automation (new nightly wheel publishing, new cu13 release artifacts, and additional Docker image variants), which could break packaging or deployment if misconfigured.
Overview
Adds automated nightly wheel publishing: the nightly workflow now builds CUDA 12.9 and CUDA 13.0 wheels via cibuildwheel and publishes them as rolling prerelease GitHub Releases (nightly and nightly-cu13).

Extends the release pipeline to build a dedicated CUDA 13.0 (cu130) wheel artifact, publish it to a separate ${tag}-cu13 GitHub Release (not PyPI), and gate additional Docker publishing steps on that job.

Updates Docker builds to default to CUDA 12.9, allow overriding the NVIDIA base via BASE_IMAGE, pull vLLM/torch from CUDA-specific indexes, and add new CUDA 13 image targets/tags for both nightly and release; docs and pyproject.toml are updated accordingly (incl. manylinux image bump to cuda12.9).

Reviewed by Cursor Bugbot for commit ecfecf4.
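A minimal sketch of how the BASE_IMAGE override and the cu13 build target might be combined on the command line. The `BASE_IMAGE` build argument and the `image-release-cu13` target come from this PR; the base image reference, the output tag, and the build context `.` are placeholders, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Placeholder values: substitute a real CUDA 13 base image and tag.
BASE_IMAGE="example.invalid/cuda:13.0-devel"   # hypothetical base image
TARGET="image-release-cu13"                    # target named in the PR
IMAGE_TAG="example.invalid/lmcache:cu13"       # hypothetical output tag

# --build-arg overrides the Dockerfile's default base image;
# --target selects the build stage to produce.
cmd="docker build --build-arg BASE_IMAGE=$BASE_IMAGE --target $TARGET -t $IMAGE_TAG ."

# Dry run: print the command instead of executing it.
echo "$cmd"
```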