Conversation

Contributor

Copilot AI commented Aug 26, 2025

The `numel` function in the Paddle backend was using `int` for computing tensor element counts, which can overflow for large tensors. This fix changes the return type and intermediate calculations to `size_t` to handle larger tensor sizes safely.

Problem

The original implementation multiplied tensor dimensions as `int` values:

```cpp
int numel(const paddle_infer::Tensor& x) const {
  // TODO: There might be a overflow problem here for multiply int numbers.
  int ret = 1;
  std::vector<int> x_shape = x.shape();
  for (std::size_t i = 0, n = x_shape.size(); i < n; ++i) {
    ret *= x_shape[i];  // Can overflow for large tensors
  }
  return ret;
}
```

For large tensors (e.g., shape [50000, 50000, 10] = 25 billion elements), this causes integer overflow and returns negative values.

Solution

  • Changed return type from int to size_t
  • Changed intermediate calculations to use size_t with explicit casting
  • Updated all calling sites to use size_t variables
  • Removed the TODO comment since the overflow issue is now resolved

```cpp
size_t numel(const paddle_infer::Tensor& x) const {
  size_t ret = 1;
  std::vector<int> x_shape = x.shape();
  for (std::size_t i = 0, n = x_shape.size(); i < n; ++i) {
    ret *= static_cast<size_t>(x_shape[i]);  // Safe from overflow
  }
  return ret;
}
```

The `size_t` type can handle up to 2^64 elements on 64-bit systems (vs 2^31 for `int`), making it appropriate for tensor element counts. This change is backward compatible since `std::vector::resize()` and other consumers already accept `size_t`.

Fixes #4551.



…ent overflow

Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
Copilot AI changed the title from "[WIP] There might be a overflow problem here for multiply int numbers." to "fix(pd): change numel function return type from int to size_t to prevent overflow" Aug 26, 2025
Copilot AI requested a review from njzjz August 26, 2025 18:39
@njzjz njzjz marked this pull request as ready for review August 26, 2025 18:40
Copilot AI review requested due to automatic review settings August 26, 2025 18:40
Contributor

Copilot AI left a comment


Pull Request Overview

This PR fixes an integer overflow vulnerability in the Paddle backend's `numel` function by changing the return type and calculations from `int` to `size_t`. The function computes tensor element counts by multiplying dimensions, which can overflow with large tensors (e.g., 25 billion elements).

  • Changed numel function return type from int to size_t with safe casting
  • Updated all calling sites to use size_t variables instead of int
  • Removed the TODO comment acknowledging the overflow issue

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| source/api_cc/include/DeepPotPD.h | Modified numel function signature and implementation to use size_t with explicit casting |
| source/api_cc/src/DeepPotPD.cc | Updated all variables that store numel results to use size_t type |

@github-actions github-actions bot added the C++ label Aug 26, 2025
@njzjz njzjz closed this Aug 26, 2025
@njzjz njzjz reopened this Aug 26, 2025
@codecov

codecov bot commented Aug 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.29%. Comparing base (6349238) to head (439192f).
⚠️ Report is 69 commits behind head on devel.

Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #4924   +/-   ##
=======================================
  Coverage   84.29%   84.29%           
=======================================
  Files         703      703           
  Lines       68728    68728           
  Branches     3573     3572    -1     
=======================================
  Hits        57935    57935           
  Misses       9653     9653           
  Partials     1140     1140           


@njzjz njzjz requested a review from HydrogenSulfate August 27, 2025 01:22
Collaborator

@HydrogenSulfate HydrogenSulfate left a comment


@njzjz njzjz requested a review from caic99 August 27, 2025 03:19
@njzjz njzjz added this pull request to the merge queue Aug 27, 2025
Merged via the queue into devel with commit 996d192 Aug 27, 2025
157 of 253 checks passed
@njzjz njzjz deleted the copilot/fix-4551 branch August 27, 2025 06:30
ChiahsinChu pushed a commit to ChiahsinChu/deepmd-kit that referenced this pull request Dec 17, 2025
…ent overflow (deepmodeling#4924)

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>

Successfully merging this pull request may close these issues.

There might be a overflow problem here for multiply int numbers.

4 participants