Conversation

@njzjz njzjz commented Dec 2, 2025

This will make the CI more robust: new TF/PT releases will not break the CI until we bump the pinned versions.

Summary by CodeRabbit

  • New Features

    • Added CPU-optimized build support for PyTorch and TensorFlow.
    • Platform-specific dependency resolution for Linux x86_64, macOS, and other architectures.
  • Chores

    • Updated build configuration to use configurable dependency groups and conditional pins for CPU vs GPU builds.
    • Adjusted CI/build selection logic to better handle CPU-targeted packaging across environments.



codecov bot commented Dec 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.28%. Comparing base (91c5110) to head (637fd9b).
⚠️ Report is 2 commits behind head on devel.

Additional details and impacted files
```diff
@@           Coverage Diff           @@
##            devel    #5071   +/-   ##
=======================================
  Coverage   84.28%   84.28%
=======================================
  Files         709      709
  Lines       70562    70561    -1
  Branches     3619     3618    -1
=======================================
+ Hits        59472    59473    +1
+ Misses       9924     9923    -1
+ Partials     1166     1165    -1
```


@njzjz njzjz marked this pull request as ready for review December 2, 2025 07:05

Copilot AI left a comment


Pull request overview

This PR aims to make CI builds more robust by pinning TensorFlow and PyTorch dependencies to specific versions defined in global dependency groups, preventing CI breakage from new upstream releases.

Key changes:

  • Added dependency_groups package as a build requirement to support PEP 735 dependency groups
  • Created centralized dependency group definitions in pyproject.toml with platform-specific version pins
  • Refactored CI build logic to read pinned versions from dependency groups instead of hardcoding them

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

  • pyproject.toml: Added the dependency_groups build dependency and defined pin_tensorflow_cpu, pin_tensorflow_gpu, pin_pytorch_cpu, and pin_pytorch_gpu dependency groups with platform-specific version constraints
  • backend/utils.py: New utility module providing the read_dependencies_from_dependency_group() function to extract dependencies from pyproject.toml dependency groups
  • backend/find_tensorflow.py: Updated to use pinned TensorFlow dependencies from dependency groups for CUDA 12+ builds instead of hardcoded version strings
  • backend/find_pytorch.py: Updated to use pinned PyTorch dependencies from dependency groups for CUDA 12+ builds; removed the unused platform import
Comments suppressed due to low confidence (1)

backend/find_tensorflow.py:112

  • Potential conflicting requirements: When CIBUILDWHEEL is set and CUDA version is >= 12, the code adds pinned requirements from read_dependencies_from_dependency_group("pin_tensorflow_cpu") to requires, but then also extends requires with unpinned requirements from get_tf_requirement("")["cpu"].

For example, on Linux x86_64, this would result in both:

  • tensorflow-cpu~=2.18.0; platform_machine=='x86_64' and platform_system == 'Linux' (pinned, from line 100)
  • tensorflow-cpu; platform_machine!='aarch64' and (platform_machine!='arm64' or platform_system != 'Darwin') (unpinned, from line 112)

Both requirements match the same platform, which may cause conflicting version constraints.
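The overlap can be demonstrated with a small stdlib check that extracts each requirement's project name (a rough approximation of PEP 508 parsing, not something the build backend actually does):

```python
import re


def project_name(req: str) -> str:
    # Crude PEP 508 name extraction: cut at the first version specifier,
    # extras bracket, marker separator, parenthesis, or whitespace.
    return re.split(r"[\s;\[<>=!~(]", req.strip(), maxsplit=1)[0].lower()


requires = [
    "tensorflow-cpu~=2.18.0; platform_machine=='x86_64' and platform_system == 'Linux'",
    "tensorflow-cpu; platform_machine!='aarch64' and (platform_machine!='arm64' or platform_system != 'Darwin')",
]
names = [project_name(r) for r in requires]
duplicated = {n for n in names if names.count(n) > 1}
print(duplicated)  # the same project appears in both pinned and unpinned form
```

On a Linux x86_64 build, both entries' markers evaluate to true, so the resolver sees two constraints for the same project.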

Consider setting tf_version to a specific version (e.g., "2.18.0") when using the pinned dependency group, or conditionally skip calling get_tf_requirement() when pinned requirements are already added:

```python
if cuda_version == "" or cuda_version in SpecifierSet(">=12,<13"):
    requires.extend(
        read_dependencies_from_dependency_group("pin_tensorflow_cpu")
    )
    tf_version = None  # Skip adding unpinned requirements
elif cuda_version in SpecifierSet(">=11,<12"):
    # ...
```

The code under review, for reference:

```python
    if cuda_version == "" or cuda_version in SpecifierSet(">=12,<13"):
        # CUDA 12.2, cudnn 9
        # or CPU builds
        requires.extend(
            read_dependencies_from_dependency_group("pin_tensorflow_cpu")
        )
    elif cuda_version in SpecifierSet(">=11,<12"):
        # CUDA 11.8, cudnn 8
        requires.extend(
            [
                "tensorflow-cpu>=2.5.0,<2.15; platform_machine=='x86_64' and platform_system == 'Linux'",
            ]
        )
        tf_version = "2.14.1"
    else:
        raise RuntimeError("Unsupported CUDA version") from None
requires.extend(get_tf_requirement(tf_version)["cpu"])
```


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Jinzhe Zeng <njzjz@qq.com>

coderabbitai bot commented Dec 2, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a utility to read dependency-groups from pyproject.toml and updates backend finders to use it for CPU-specific TensorFlow and PyTorch pins, replacing prior hardcoded strings and broadening CI/CIBUILDWHEEL CPU pin logic.

Changes

  • Dependency group configuration (pyproject.toml): Added dependency_groups with platform-conditional pins for CPU TensorFlow and PyTorch (multiple platform_machine/platform_system conditions and comments about macOS x86 deprecation).
  • Dependency resolution utility (backend/utils.py): New function read_dependencies_from_dependency_group(group: str) -> tuple[str, ...] that loads pyproject.toml (using tomllib or tomli) and resolves the named dependency group. Includes an LGPL-3.0-or-later header.
  • Backend dependency injection (backend/find_pytorch.py, backend/find_tensorflow.py): Import and use read_dependencies_from_dependency_group(...). find_pytorch.py appends CPU pin requirements when the CIBUILDWHEEL and CPU-compatible CUDA_VERSION conditions are met; find_tensorflow.py replaces a hardcoded CPU TF string with the dependency-group result. Existing CI-MPI/CUDA logic is retained and combined with the new pins.
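The groups described above could look roughly like this in pyproject.toml (group names from the PR; the concrete version pins and markers are assumptions for illustration, not the merged contents):

```toml
[dependency-groups]
pin_tensorflow_cpu = [
    # Hypothetical pins; the actual versions live in the PR's pyproject.toml
    "tensorflow-cpu~=2.18.0; platform_machine == 'x86_64' and platform_system == 'Linux'",
    "tensorflow; platform_machine == 'arm64' and platform_system == 'Darwin'",
]
pin_pytorch_cpu = [
    "torch~=2.5.0",
]
```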

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify backend/utils.py correctly selects and uses tomllib vs tomli and handles missing keys or unexpected pyproject structure.
  • Check find_pytorch.py CIBUILDWHEEL and CUDA_VERSION condition logic and that cibw_requirement appends safely to existing requirements.
  • Confirm find_tensorflow.py replacement preserves prior platform/CPU vs GPU branching.
  • Validate pyproject.toml conditional markers (platform_machine, platform_system) are syntactically correct.

Suggested reviewers

  • wanghan-iapcm

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: Passed. The PR title accurately describes the main change: pinning TensorFlow and PyTorch dependencies in cibuildwheel to match the global pins across multiple backend files and pyproject.toml.
  • Docstring Coverage: Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/find_pytorch.py (1)

32-34: CIBW PyTorch CPU pinning via dependency groups looks good; fix minor docstring nit

  • The new cibw_requirement handling with read_dependencies_from_dependency_group("pin_pytorch_cpu") in the CIBUILDWHEEL + CPU / CUDA≥12 path cleanly reuses the central pyproject pins and composes them with the existing torch>=2.1.0/pt_version logic and mpich requirement. This should keep CIBW CPU builds on the globally pinned torch versions while preserving MPI constraints. Based on learnings, this also aligns with PyTorch ≥2.8 where the default PyPI wheel is CPU‑only.
  • Minor nit: the find_pytorch docstring’s return description still says “TensorFlow requirement” (Line 56). Consider updating it to “PyTorch requirement” to avoid confusion.

Also applies to: 96-148

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 91c5110 and 4c24d08.

📒 Files selected for processing (4)
  • backend/find_pytorch.py (3 hunks)
  • backend/find_tensorflow.py (2 hunks)
  • backend/utils.py (1 hunks)
  • pyproject.toml (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4884
File: .github/workflows/test_cuda.yml:46-46
Timestamp: 2025-08-14T07:11:51.357Z
Learning: As of PyTorch 2.8 (August 2025), the default wheel on PyPI installed by `pip install torch` is CPU-only. CUDA-enabled wheels are available on PyPI for Linux x86 and Windows x86 platforms, but require explicit specification via index URLs or variant-aware installers. For CUDA support, use `--index-url https://download.pytorch.org/whl/cu126` (or appropriate CUDA version).
Learnt from: CR
Repo: deepmodeling/deepmd-kit PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T22:42:23.128Z
Learning: Install backend dependencies (TensorFlow, PyTorch, JAX, or Paddle) before building C++ components, and set TENSORFLOW_ROOT and PYTORCH_ROOT environment variables
📚 Learning: 2025-08-14T07:11:51.357Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4884
File: .github/workflows/test_cuda.yml:46-46
Timestamp: 2025-08-14T07:11:51.357Z
Learning: As of PyTorch 2.8 (August 2025), the default wheel on PyPI installed by `pip install torch` is CPU-only. CUDA-enabled wheels are available on PyPI for Linux x86 and Windows x86 platforms, but require explicit specification via index URLs or variant-aware installers. For CUDA support, use `--index-url https://download.pytorch.org/whl/cu126` (or appropriate CUDA version).

Applied to files:

  • pyproject.toml
  • backend/find_pytorch.py
🧬 Code graph analysis (2)
backend/find_tensorflow.py (1)
backend/utils.py (1)
  • read_dependencies_from_dependency_group (17-36)
backend/find_pytorch.py (1)
backend/utils.py (1)
  • read_dependencies_from_dependency_group (17-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL analysis (python)
  • GitHub Check: Agent
🔇 Additional comments (2)
pyproject.toml (1)

8-9: TF/PT CPU dependency groups and build-system wiring look consistent with backend usage

  • Adding "dependency_groups" to [build-system].requires matches the new backend.utils import, so the build backend can resolve groups without relying on runtime deps.
  • The pin_tensorflow_cpu and pin_pytorch_cpu groups appear to partition major platforms as intended (Linux x86_64 vs macOS x86_64 vs “everything else”), and their names match the strings used in read_dependencies_from_dependency_group(...).

From my side this configuration is coherent with the updated find_tensorflow/find_pytorch logic; you might still want to sanity‑check the markers on a couple of target platforms (Linux, macOS, Windows) via a dry‑run resolver (pip/uv) to confirm they expand as expected.

Also applies to: 156-183

backend/find_tensorflow.py (1)

29-31: Using dependency-group TF CPU pins in CIBW builds looks correct

The new import and the change to:

  • treat CUDA_VERSION == "" or >=12,<13 as the CPU/“modern CUDA” path, and
  • extend requires with read_dependencies_from_dependency_group("pin_tensorflow_cpu")

nicely centralize TF CPU pinning in pyproject.toml while still layering on get_tf_requirement(tf_version)["cpu"] for ancillary deps. The group name matches the [dependency-groups] entry and requires.extend(...) is compatible with the returned iterable.

Also applies to: 93-101


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/utils.py (1)

34-34: Consider safer dictionary access with explicit error handling.

Line 34 uses direct dictionary access pyproject["dependency-groups"], which will raise a generic KeyError if the key is missing. A previous review comment suggested using .get() with explicit error handling to provide clearer diagnostics.

Apply this diff to add explicit error handling:

```diff
-    groups = pyproject["dependency-groups"]
+    groups = pyproject.get("dependency-groups")
+    if groups is None:
+        raise ValueError(
+            "Missing 'dependency-groups' section in pyproject.toml"
+        )
```
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4c24d08 and 637fd9b.

📒 Files selected for processing (1)
  • backend/utils.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
  • GitHub Check: Test Python (5, 3.12)
  • GitHub Check: Test Python (6, 3.12)
  • GitHub Check: Test Python (6, 3.9)
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Test Python (4, 3.12)
  • GitHub Check: Test Python (5, 3.9)
  • GitHub Check: Test Python (3, 3.12)
  • GitHub Check: Test Python (3, 3.9)
  • GitHub Check: Test Python (4, 3.9)
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Test Python (1, 3.9)
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Test C++ (false)
  • GitHub Check: Test C++ (true)
🔇 Additional comments (2)
backend/utils.py (2)

17-30: The docstring has been correctly updated.

The Returns section (lines 28-29) now correctly states tuple[str, ...] to match the return type annotation. This addresses the previous comment about the list vs. tuple inconsistency.


7-14: Verify that external dependencies are declared.

The code imports from dependency_groups (line 7) and conditionally imports tomli for Python < 3.11 (line 14). Ensure these packages are properly declared in the project's dependencies configuration.
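The conditional TOML-parser import referred to above is typically written like this (a sketch; the exact form in backend/utils.py may differ):

```python
import sys

if sys.version_info >= (3, 11):
    import tomllib  # standard library since Python 3.11
else:
    import tomli as tomllib  # third-party backport; must be declared as a build dependency

# Parse a trivial document to confirm the alias works either way.
data = tomllib.loads("answer = 42")
print(data["answer"])
```

Declaring tomli with a marker like `tomli; python_version < '3.11'` keeps it out of newer environments where the stdlib module suffices.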

@njzjz njzjz enabled auto-merge December 2, 2025 18:18
@njzjz njzjz added this pull request to the merge queue Dec 2, 2025
Merged via the queue into deepmodeling:devel with commit a38f483 Dec 2, 2025
60 checks passed
@njzjz njzjz deleted the pin_cibw branch December 2, 2025 21:23
ChiahsinChu pushed a commit to ChiahsinChu/deepmd-kit that referenced this pull request Dec 17, 2025
