CI: pin cibuildwheel TF/PT deps to global pinnings #5071
Conversation
This will make the CI more robust, i.e., new TF/PT versions will not break the CI until we bump the version.
**Codecov Report** ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##            devel    #5071   +/-   ##
=======================================
  Coverage   84.28%   84.28%
=======================================
  Files         709      709
  Lines       70562    70561       -1
  Branches     3619     3618       -1
=======================================
+ Hits        59472    59473       +1
+ Misses       9924     9923       -1
+ Partials     1166     1165       -1
=======================================
```
Pull request overview
This PR aims to make CI builds more robust by pinning TensorFlow and PyTorch dependencies to specific versions defined in global dependency groups, preventing CI breakage from new upstream releases.
Key changes:
- Added the `dependency_groups` package as a build requirement to support PEP 735 dependency groups
- Created centralized dependency group definitions in `pyproject.toml` with platform-specific version pins
- Refactored CI build logic to read pinned versions from dependency groups instead of hardcoding them
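As a rough sketch of what such PEP 735 dependency groups could look like in `pyproject.toml` (the group names match those described in this PR; the exact version specifiers and the surrounding build requirements are assumptions, not the merged pins):

```toml
[build-system]
# "dependency_groups" lets the build backend resolve the groups below;
# the project's other build requirements are elided here
requires = ["dependency_groups"]

[dependency-groups]
# Illustrative pins only -- not the exact versions merged in this PR
pin_tensorflow_cpu = [
    "tensorflow-cpu~=2.18.0; platform_machine=='x86_64' and platform_system=='Linux'",
]
pin_pytorch_cpu = [
    "torch~=2.5.0; platform_machine=='x86_64' and platform_system=='Linux'",
]
```

Keeping the pins in one TOML table means a version bump touches a single file instead of every backend finder script.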
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pyproject.toml | Added dependency_groups build dependency and defined pin_tensorflow_cpu, pin_tensorflow_gpu, pin_pytorch_cpu, and pin_pytorch_gpu dependency groups with platform-specific version constraints |
| backend/utils.py | New utility module providing read_dependencies_from_dependency_group() function to extract dependencies from pyproject.toml dependency groups |
| backend/find_tensorflow.py | Updated to use pinned TensorFlow dependencies from dependency groups for CUDA 12+ builds instead of hardcoded version strings |
| backend/find_pytorch.py | Updated to use pinned PyTorch dependencies from dependency groups for CUDA 12+ builds, removed unused platform import |
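A minimal sketch of what `read_dependencies_from_dependency_group()` could look like, assuming it reads the `[dependency-groups]` table via `tomllib`. The real implementation in `backend/utils.py` also uses the `dependency_groups` package (imported per the review below) to resolve `include-group` entries; the signature and simplified resolution here are assumptions:

```python
import sys

if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib  # third-party backport for Python < 3.11


def read_dependencies_from_dependency_group(
    group: str, pyproject_path: str = "pyproject.toml"
) -> tuple:
    """Return the requirement strings of one [dependency-groups] group.

    Simplified sketch: real PEP 735 groups may also contain
    {"include-group": ...} entries, which the dependency_groups
    package resolves recursively; plain strings suffice here.
    """
    with open(pyproject_path, "rb") as f:
        pyproject = tomllib.load(f)
    groups = pyproject["dependency-groups"]
    return tuple(dep for dep in groups[group] if isinstance(dep, str))
```

The build scripts can then `requires.extend(read_dependencies_from_dependency_group("pin_tensorflow_cpu"))` without hardcoding any version strings.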
Comments suppressed due to low confidence (1)
backend/find_tensorflow.py:112
- Potential conflicting requirements: When `CIBUILDWHEEL` is set and the CUDA version is >= 12, the code adds pinned requirements from `read_dependencies_from_dependency_group("pin_tensorflow_cpu")` to `requires`, but then also extends `requires` with unpinned requirements from `get_tf_requirement("")["cpu"]`.

For example, on Linux x86_64, this would result in both:
- `tensorflow-cpu~=2.18.0; platform_machine=='x86_64' and platform_system == 'Linux'` (pinned, from line 100)
- `tensorflow-cpu; platform_machine!='aarch64' and (platform_machine!='arm64' or platform_system != 'Darwin')` (unpinned, from line 112)

Both requirements match the same platform, which may cause conflicting version constraints.

Consider setting `tf_version` to a specific version (e.g., `"2.18.0"`) when using the pinned dependency group, or conditionally skip calling `get_tf_requirement()` when pinned requirements are already added:
Suggested change:

```python
if cuda_version == "" or cuda_version in SpecifierSet(">=12,<13"):
    requires.extend(
        read_dependencies_from_dependency_group("pin_tensorflow_cpu")
    )
    tf_version = None  # Skip adding unpinned requirements
elif cuda_version in SpecifierSet(">=11,<12"):
    # ...
```

Current code in `backend/find_tensorflow.py` for reference:

```python
if cuda_version == "" or cuda_version in SpecifierSet(">=12,<13"):
    # CUDA 12.2, cudnn 9
    # or CPU builds
    requires.extend(
        read_dependencies_from_dependency_group("pin_tensorflow_cpu")
    )
elif cuda_version in SpecifierSet(">=11,<12"):
    # CUDA 11.8, cudnn 8
    requires.extend(
        [
            "tensorflow-cpu>=2.5.0,<2.15; platform_machine=='x86_64' and platform_system == 'Linux'",
        ]
    )
    tf_version = "2.14.1"
else:
    raise RuntimeError("Unsupported CUDA version") from None
requires.extend(get_tf_requirement(tf_version)["cpu"])
```
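The overlap described in the suppressed comment can be reproduced with `packaging`, which the build code already uses for `SpecifierSet`; the version numbers below are hypothetical:

```python
from packaging.requirements import Requirement
from packaging.version import Version

# Pinned requirement coming from the dependency group
pinned = Requirement(
    "tensorflow-cpu~=2.18.0; platform_machine=='x86_64' and platform_system=='Linux'"
)
# Unpinned requirement added afterwards by get_tf_requirement("")["cpu"]
unpinned = Requirement("tensorflow-cpu")

# On a Linux x86_64 build both requirements apply to the same package,
# but only the pinned one constrains the version range
v = Version("2.19.0")
assert v not in pinned.specifier    # excluded by ~=2.18.0
assert v in unpinned.specifier      # empty specifier matches any release
```

A resolver therefore sees two requirements for `tensorflow-cpu` on the same platform; harmless while the unpinned one is looser, but worth deduplicating as the comment suggests.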
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Jinzhe Zeng <njzjz@qq.com>
> **Note**: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 **Walkthrough**

Adds a utility to read dependency-groups from pyproject.toml and updates backend finders to use it for CPU-specific TensorFlow and PyTorch pins, replacing prior hardcoded strings and broadening CI/CIBUILDWHEEL CPU pin logic.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/find_pytorch.py (1)
**32-34: CIBW PyTorch CPU pinning via dependency groups looks good; fix minor docstring nit**

- The new `cibw_requirement` handling with `read_dependencies_from_dependency_group("pin_pytorch_cpu")` in the `CIBUILDWHEEL` + CPU / CUDA ≥ 12 path cleanly reuses the central pyproject pins and composes them with the existing `torch>=2.1.0` / `pt_version` logic and the `mpich` requirement. This should keep CIBW CPU builds on the globally pinned torch versions while preserving MPI constraints. Based on learnings, this also aligns with PyTorch ≥ 2.8, where the default PyPI wheel is CPU-only.
- Minor nit: the `find_pytorch` docstring's return description still says "TensorFlow requirement" (Line 56). Consider updating it to "PyTorch requirement" to avoid confusion.

Also applies to: 96-148
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `backend/find_pytorch.py` (3 hunks)
- `backend/find_tensorflow.py` (2 hunks)
- `backend/utils.py` (1 hunks)
- `pyproject.toml` (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4884
File: .github/workflows/test_cuda.yml:46-46
Timestamp: 2025-08-14T07:11:51.357Z
Learning: As of PyTorch 2.8 (August 2025), the default wheel on PyPI installed by `pip install torch` is CPU-only. CUDA-enabled wheels are available on PyPI for Linux x86 and Windows x86 platforms, but require explicit specification via index URLs or variant-aware installers. For CUDA support, use `--index-url https://download.pytorch.org/whl/cu126` (or appropriate CUDA version).
Learnt from: CR
Repo: deepmodeling/deepmd-kit PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T22:42:23.128Z
Learning: Install backend dependencies (TensorFlow, PyTorch, JAX, or Paddle) before building C++ components, and set TENSORFLOW_ROOT and PYTORCH_ROOT environment variables
📚 Learning: 2025-08-14T07:11:51.357Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4884
File: .github/workflows/test_cuda.yml:46-46
Timestamp: 2025-08-14T07:11:51.357Z
Learning: As of PyTorch 2.8 (August 2025), the default wheel on PyPI installed by `pip install torch` is CPU-only. CUDA-enabled wheels are available on PyPI for Linux x86 and Windows x86 platforms, but require explicit specification via index URLs or variant-aware installers. For CUDA support, use `--index-url https://download.pytorch.org/whl/cu126` (or appropriate CUDA version).
Applied to files:
- `pyproject.toml`
- `backend/find_pytorch.py`
🧬 Code graph analysis (2)
backend/find_tensorflow.py (1)
backend/utils.py (1)
`read_dependencies_from_dependency_group` (17-36)
backend/find_pytorch.py (1)
backend/utils.py (1)
`read_dependencies_from_dependency_group` (17-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: CodeQL analysis (python)
- GitHub Check: Agent
🔇 Additional comments (2)
pyproject.toml (1)
**8-9: TF/PT CPU dependency groups and build-system wiring look consistent with backend usage**

- Adding `"dependency_groups"` to `[build-system].requires` matches the new `backend.utils` import, so the build backend can resolve groups without relying on runtime deps.
- The `pin_tensorflow_cpu` and `pin_pytorch_cpu` groups appear to partition major platforms as intended (Linux x86_64 vs macOS x86_64 vs "everything else"), and their names match the strings used in `read_dependencies_from_dependency_group(...)`.

From my side this configuration is coherent with the updated `find_tensorflow` / `find_pytorch` logic; you might still want to sanity-check the markers on a couple of target platforms (Linux, macOS, Windows) via a dry-run resolver (pip/uv) to confirm they expand as expected.

Also applies to: 156-183
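The marker sanity-check can also be done without a full resolver by evaluating the markers under simulated environments with `packaging.markers`; the marker string matches the style used in this PR, while the environment values are illustrative:

```python
from packaging.markers import Marker

marker = Marker("platform_machine=='x86_64' and platform_system=='Linux'")

# Override the live interpreter environment with simulated target platforms
linux_x86 = {"platform_machine": "x86_64", "platform_system": "Linux"}
mac_arm = {"platform_machine": "arm64", "platform_system": "Darwin"}

assert marker.evaluate(linux_x86)       # pin applies on Linux x86_64
assert not marker.evaluate(mac_arm)     # pin skipped on Apple Silicon
```

Running each group's markers against a small matrix of such environments catches partition gaps (a platform matched by two pins, or by none) before CI does.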
backend/find_tensorflow.py (1)
**29-31: Using dependency-group TF CPU pins in CIBW builds looks correct**

The new import and the change to:

- treat `CUDA_VERSION == ""` or `>=12,<13` as the CPU / "modern CUDA" path, and
- extend `requires` with `read_dependencies_from_dependency_group("pin_tensorflow_cpu")`

nicely centralize TF CPU pinning in `pyproject.toml` while still layering on `get_tf_requirement(tf_version)["cpu"]` for ancillary deps. The group name matches the `[dependency-groups]` entry, and `requires.extend(...)` is compatible with the returned iterable.

Also applies to: 93-101
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/utils.py (1)
**34-34: Consider safer dictionary access with explicit error handling.**

Line 34 uses direct dictionary access `pyproject["dependency-groups"]`, which will raise a generic `KeyError` if the key is missing. A previous review comment suggested using `.get()` with explicit error handling to provide clearer diagnostics.

Apply this diff to add explicit error handling:

```diff
-    groups = pyproject["dependency-groups"]
+    groups = pyproject.get("dependency-groups")
+    if groups is None:
+        raise ValueError(
+            "Missing 'dependency-groups' section in pyproject.toml"
+        )
```
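A self-contained sketch of the suggested pattern; the function name is illustrative, not the actual `backend/utils.py` code:

```python
def load_dependency_groups(pyproject: dict) -> dict:
    """Return the [dependency-groups] table, failing with a clear message."""
    groups = pyproject.get("dependency-groups")
    if groups is None:
        raise ValueError(
            "Missing 'dependency-groups' section in pyproject.toml"
        )
    return groups


# A missing key now produces a descriptive error instead of a bare KeyError
try:
    load_dependency_groups({"project": {"name": "demo"}})
except ValueError as err:
    print(err)  # prints the "Missing 'dependency-groups' ..." message
```

The explicit message points the user at the file and section to fix, which a bare `KeyError: 'dependency-groups'` traceback does less directly.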
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `backend/utils.py` (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
🔇 Additional comments (2)
backend/utils.py (2)
**17-30: The docstring has been correctly updated.**

The `Returns` section (lines 28-29) now correctly states `tuple[str, ...]` to match the return type annotation. This addresses the previous comment about the list vs. tuple inconsistency.

**7-14: Verify that external dependencies are declared.**

The code imports from `dependency_groups` (line 7) and conditionally imports `tomli` for Python < 3.11 (line 14). Ensure these packages are properly declared in the project's dependencies configuration.
This will make the CI more robust, i.e., new TF/PT versions will not break the CI until we bump the version.

**Summary by CodeRabbit**

- **New Features**
  - Added CPU-optimized build support for PyTorch and TensorFlow.
  - Platform-specific dependency resolution for Linux x86_64, macOS, and other architectures.
- **Chores**
  - Updated build configuration to use configurable dependency groups and conditional pins for CPU vs GPU builds.
  - Adjusted CI/build selection logic to better handle CPU-targeted packaging across environments.

Signed-off-by: Jinzhe Zeng <njzjz@qq.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>