Skip to content

Fix excessive memory usage of CollisionPipeline#2384

Merged
adenzler-nvidia merged 4 commits into
newton-physics:mainfrom
nvtw:dev/tw2/fix_quadratic_memory_scaling
Apr 9, 2026
Merged

Fix excessive memory usage of CollisionPipeline#2384
adenzler-nvidia merged 4 commits into
newton-physics:mainfrom
nvtw:dev/tw2/fix_quadratic_memory_scaling

Conversation

@nvtw

@nvtw nvtw commented Apr 9, 2026

Copy link
Copy Markdown
Member

Description

Fix O(W²·S²) memory explosion in CollisionPipeline shape-pair buffer allocation
for NXN and SAP broad phase modes. The buffer capacity shape_pairs_max was computed
as N*(N-1)/2 over the global shape count across all worlds, but shapes in
different worlds can never collide. The fix computes the sum of per-world
n*(n-1)/2 instead, matching the thread layout already used by the broad phase
kernels. At 1024 worlds with ~20 shapes each this reduces shape_pairs_max from
~210 M to ~215 K (~975×), saving 10+ GB of GPU memory across the pipeline's
pair and narrow-phase buffers.

The same fix is applied to the Kamino CollisionPipelineUnifiedKamino.

Checklist

  • New or existing tests cover these changes
  • The documentation is up to date with these changes
  • CHANGELOG.md has been updated (if user-facing change)

Test plan

uv run --extra dev -m newton.tests -k TestShapePairsMaxScaling -v
uv run --extra dev -m newton.tests -k TestContactEstimator -v

Seven new tests in TestShapePairsMaxScaling verify:

  • Single-world result matches the original formula
  • Multi-world pairs scale linearly (1×/2×/4× with 1/2/4 worlds)
  • 256 worlds: result is <1% of the global N² value
  • Global shapes (world=-1) are correctly included per world segment
  • End-to-end: CollisionPipeline with NXN and SAP on a 64-world model
    has shape_pairs_max equal to the linear per-world sum

Bug fix

Steps to reproduce:

  1. Create a model with ≥1024 worlds of a multi-shape robot (e.g. ANYmal)
  2. Construct CollisionPipeline(model, broad_phase="nxn") or "sap"
  3. Observe GPU OOM or multi-GB allocation for pair buffers that are mostly empty

Minimal reproduction:

import newton

robot = newton.ModelBuilder()
for _ in range(5):
    b = robot.add_body()
    robot.add_shape_box(body=b, hx=0.1, hy=0.1, hz=0.1)

builder = newton.ModelBuilder()
for _ in range(1024):
    builder.add_world(robot)

model = builder.finalize()
# Before fix: shape_pairs_max ≈ 13 M, allocates ~100 MB+ just for pair buffer
# After fix:  shape_pairs_max = 10240, allocates ~80 KB
pipeline = newton.CollisionPipeline(model, broad_phase="nxn")
print(pipeline.shape_pairs_max)

Summary by CodeRabbit

  • Bug Fixes

    • Reduced excessive memory use in collision detection for multi-world simulations by computing shape-pair counts per world instead of using a global all-pairs bound, improving scaling from quadratic to linear as worlds increase.
  • Tests

    • Added tests validating per-world pair-count scaling, proper handling of global shapes, and prevention of quadratic memory blowup.

…g instead of linear scaling (w.r.t. number of worlds)
@nvtw nvtw self-assigned this Apr 9, 2026
@coderabbitai

coderabbitai Bot commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: a2df0def-78a0-4b5e-9083-782cc564d074

📥 Commits

Reviewing files that changed from the base of the PR and between 2062279 and 5e62d07.

📒 Files selected for processing (1)
  • newton/_src/solvers/kamino/_src/geometry/unified.py

📝 Walkthrough

Walkthrough

Computes collision shape-pair limits per-world (including global shapes per-world) instead of a global N² bound, updating CollisionPipeline and Kamino unified geometry initialization and adding tests and a changelog entry to prevent quadratic memory growth across multiple worlds.

Changes

Cohort / File(s) Summary
Changelog
CHANGELOG.md
Added entry describing fix: shape-pair allocation now uses per-world pair counts instead of a global N² upper bound.
Collision pipeline
newton/_src/sim/collide.py
Added _compute_per_world_shape_pairs_max(model) and use it in CollisionPipeline.__init__ for non-explicit broad-phase modes (nxn, sap, expert path) to compute shape_pairs_max from per-world candidate counts (includes world == -1 globals; filters by ShapeFlags.COLLIDE_SHAPES).
Kamino geometry
newton/_src/solvers/kamino/_src/geometry/unified.py
Compute _max_shape_pairs by summing worst-case geom-pair counts per regular world using geoms.wid (including wid == -1 globals into each world slice), with fallback to global N*(N-1)/2 when wid unavailable; updates _max_contacts accordingly.
Tests
newton/tests/test_collision_pipeline.py
Added TestShapePairsMaxScaling unit tests and updated imports to validate per-world pair-count behavior, linear scaling across worlds, inclusion of global shapes per-world, zero-shape handling, and end-to-end CollisionPipeline.shape_pairs_max assertions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

1.0-release

Suggested reviewers

  • eric-heiden
  • camevor
  • dylanturpin
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: fixing excessive memory usage in CollisionPipeline by eliminating O(W²·S²) memory allocation.
Docstring Coverage ✅ Passed Docstring coverage is 92.86% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Apr 9, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
newton/_src/sim/collide.py 91.30% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@nvtw nvtw requested a review from camevor April 9, 2026 07:24

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
newton/tests/test_collision_pipeline.py (1)

803-822: Add one mixed-ShapeFlags case here.

The new helper in newton/_src/sim/collide.py also filters by ShapeFlags.COLLIDE_SHAPES, but every model built in this block marks all shapes collidable. A single mixed-flags test would lock down the branch that keeps NXN/SAP capacity aligned with broad-phase filtering for visual-only shapes.

Also applies to: 824-912

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/tests/test_collision_pipeline.py` around lines 803 - 822, Tests always
set all shapes to ShapeFlags.COLLIDE_SHAPES, so add a mixed-flags scenario to
exercise the new broad-phase filter: in the test that calls _make_model, create
a model where some shapes keep ShapeFlags.COLLIDE_SHAPES and others are set to a
non-colliding flag (e.g., ShapeFlags.VISUAL_ONLY or 0) by mutating
model.shape_flags (or passing an array of flags) after _make_model returns; this
will ensure the branch in the new collision filtering logic (involving
ShapeFlags.COLLIDE_SHAPES) and the NXN/SAP capacity alignment is covered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@newton/_src/solvers/kamino/_src/geometry/unified.py`:
- Around line 471-477: The current pair-counting uses np.unique(wid_np) and
treats each UID independently, which undercounts because geoms with wid == -1
(shared/global geoms) participate in every regular-world slice; update the logic
in the block that sets self._max_shape_pairs to first compute global_count =
np.count_nonzero(wid_np == -1), then iterate only over world IDs uid != -1 and
for each compute n = np.count_nonzero(wid_np == uid) + global_count and add (n *
(n - 1)) // 2 to per_world_pairs; if there are no regular worlds but
global_count > 0, also add (global_count * (global_count - 1)) // 2 once to
account for the dedicated global-only slice; assign the final sum to
self._max_shape_pairs.

---

Nitpick comments:
In `@newton/tests/test_collision_pipeline.py`:
- Around line 803-822: Tests always set all shapes to ShapeFlags.COLLIDE_SHAPES,
so add a mixed-flags scenario to exercise the new broad-phase filter: in the
test that calls _make_model, create a model where some shapes keep
ShapeFlags.COLLIDE_SHAPES and others are set to a non-colliding flag (e.g.,
ShapeFlags.VISUAL_ONLY or 0) by mutating model.shape_flags (or passing an array
of flags) after _make_model returns; this will ensure the branch in the new
collision filtering logic (involving ShapeFlags.COLLIDE_SHAPES) and the NXN/SAP
capacity alignment is covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 12a32417-1a26-4e13-b261-57d8c626dc7d

📥 Commits

Reviewing files that changed from the base of the PR and between 38c8908 and 2062279.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • newton/_src/sim/collide.py
  • newton/_src/solvers/kamino/_src/geometry/unified.py
  • newton/tests/test_collision_pipeline.py

Comment thread newton/_src/solvers/kamino/_src/geometry/unified.py
@nvtw nvtw requested a review from adenzler-nvidia April 9, 2026 09:10

@camevor camevor left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix, lgtm!

@adenzler-nvidia adenzler-nvidia added this pull request to the merge queue Apr 9, 2026
Merged via the queue into newton-physics:main with commit 1f6f6f5 Apr 9, 2026
25 checks passed
adenzler-nvidia added a commit to adenzler-nvidia/newton that referenced this pull request Apr 9, 2026
The CollisionPipeline fix was not cherry-picked but its changelog
entry came along via a merge conflict resolution in newton-physics#2385.
jsw7460 pushed a commit to jsw7460/newton that referenced this pull request Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants