Update broad phase to work properly with world ids by nvtw · Pull Request #984 · newton-physics/newton

nvtw · 2025-10-24T13:34:17Z

Description

Should address #946

This PR improves the broad phase collision detection system with efficiency optimizations and consistency fixes:

Core Changes

Optional pair precomputation: Added build_shape_contact_pairs parameter to ModelBuilder.finalize() to skip expensive O(N²) computation when using dynamic broad phases (NXN/SAP)
Shape flags filtering: Extended precompute_world_map() to filter out non-colliding shapes (e.g., visual-only) from broad phase processing, preventing unnecessary work
Terminology fixes: Clarified distinction between collision groups and their filtering behavior
Type fixes: Converted model.shape_collision_group from Python list to warp array for kernel compatibility

Enhancements

SAP sort configuration: Added sap_sort_type parameter to control sorting algorithm (SEGMENTED vs TILE) in SAP broad phase
Consistent filtering: All three broad phase modes (EXPLICIT, NXN, SAP) now consistently respect ShapeFlags.COLLIDE_SHAPES
Comprehensive testing: Parametrized unified collision pipeline tests to run for all three broad phase modes (27 tests total)

Newton Migration Guide

Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this PR.

The migration guide in docs/migration.rst is up-to date

Before your PR is "Ready for review"

Necessary tests have been added and new examples are tested (see newton/tests/test_examples.py)
Documentation is up-to-date
Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

New Features
- Public SAPSortType added; new options to supply per-shape world, flags, collision groups, and SAP sort selection.
Enhancements
- Multi-world broad-phase with per-world precomputation and world-aware pair generation for NXN and SAP.
- Segmented (default) and optional tile-based SAP sorting for improved configurability and performance.
Tests
- Expanded coverage for multi-world, SAP segmented/tile variants, flags, and edge cases.
Chores
- shape_collision_group now finalized and stored as an array; model initialization updated.

…shapes from collision)

…oup_pair

coderabbitai · 2025-10-24T13:34:47Z

📝 Walkthrough

Walkthrough

Adds world-aware broad-phase preprocessing and routing, configurable SAP sorting (SEGMENTED/TILE) with exported SAPSortType, converts per-shape collision groups to Warp arrays, and threads world/flag/sort metadata through broad-phase constructors, pipeline initialization, and tests.

Changes

Cohort / File(s)	Summary
Top-level export `newton/__init__.py`, `newton/_src/geometry/__init__.py`	Re-export `SAPSortType` from `newton._src.geometry` and include it in package `__all__`.
World precomputation util `newton/_src/geometry/broad_phase_common.py`	Add `precompute_world_map(geom_world, geom_flags=None)` to build per-world index maps and slice boundaries; import `numpy` and `ShapeFlags`; export function.
NxN broad phase (world-aware) `newton/_src/geometry/broad_phase_nxn.py`	Add numpy precompute usage, `_find_world_and_local_id`, change `_get_lower_triangular_indices` to return `(r, c)`, integrate per-world slices into `_nxn_broadphase_kernel`, and update `BroadPhaseAllPairs` / `.launch` to accept precomputed world arrays and remove prior geom_count-based launch dim.
SAP broad phase (segmented & tile sort) `newton/_src/geometry/broad_phase_sap.py`	Add `SAPSortType` IntEnum and per-world preprocessing via `precompute_world_map`; implement segmented and tile sort flows (`_create_tile_sort_kernel`, `tile_sort_kernel`, `binary_search_segment`); refactor per-world SAP buffers and make `BroadPhaseSAP` constructor/launch world-aware with new params (`sort_type`, `tile_block_dim`, `device`).
Model / builder shape groups `newton/_src/sim/model.py`, `newton/_src/sim/builder.py`	Change `Model.shape_collision_group` default to `None`; write `shape_collision_group` as `wp.array(..., dtype=wp.int32)` in `builder.finalize()` for emitted Model.
Collision pipeline integration `newton/_src/sim/collide_unified.py`	Extend `CollisionPipelineUnified.__init__` and `from_model()` to accept/forward `shape_collision_group`, `shape_world`, `shape_flags`, and `sap_sort_type`; require `shape_world` for NXN/SAP broad phases and forward `sap_sort_type` to `BroadPhaseSAP`.
Solver & tests tensor access `newton/_src/solvers/mujoco/solver_mujoco.py`, `newton/tests/test_environment_group_collision.py`	Access `shape_collision_group` via `.numpy()` for CPU-side indexing in solver and tests.
Tests: broad-phase & pipeline `newton/tests/test_broad_phase.py`, `newton/tests/test_collision_pipeline.py`	Add `test_group_pair`, `test_world_and_group_pair`; extend numpy overlap helper with `shape_world`/`shape_flags`; add multi-world NxN and SAP tests (segmented/tile); update tests to construct broad phases with `shape_world` and parametrize unified collision tests for EXPLICIT/NXN/SAP.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Pipeline as CollisionPipelineUnified
    participant Precomp as precompute_world_map
    participant BroadPhase
    participant Kernel
    participant Results

    User->>Pipeline: __init__(..., shape_world, shape_flags, sap_sort_type)
    Pipeline->>BroadPhase: Create(shape_world, geom_flags=shape_flags, sort_type)
    BroadPhase->>Precomp: precompute_world_map(geom_world, geom_flags)
    Precomp-->>BroadPhase: world_index_map, world_slice_ends, slice_ends

    User->>Pipeline: launch(...)
    Pipeline->>BroadPhase: launch(...)

    alt NxN flow
        BroadPhase->>Kernel: _nxn_broadphase_kernel(..., world_cumsum_lower_tri, world_slice_ends, world_index_map, ...)
        Kernel->>Kernel: _find_world_and_local_id(tid)
        Kernel->>Kernel: map local→global, skip cross-world global-global (unless dedicated)
        Kernel->>Results: write_pair()
    else SAP flow
        BroadPhase->>Kernel: _sap_project_kernel(...)
        BroadPhase->>BroadPhase: per-world sort (SEGMENTED or TILE)
        BroadPhase->>Kernel: _sap_range_kernel(...)
        Kernel->>Kernel: binary_search_segment / sweep per-segment
        Kernel->>Results: write_pair()
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Areas needing focused review:
- correctness of precompute_world_map() index layout, validation, and COLLIDE_SHAPES filtering,
- thread-to-world mapping and lower-triangular indexing adjustments in broad_phase_nxn.py,
- SAP segmented vs tile sort kernel generation, buffer sizing/padding, and per-world flattened indexing in broad_phase_sap.py,
- propagation and validation of shape_world, shape_flags, and sap_sort_type across CollisionPipelineUnified and broad-phase constructors,
- test expectations for multi-world scenarios and .numpy() conversions.

Possibly related PRs

Broad phase collision API #286 — Overlaps with broad-phase refactors (nxn/sap/common) and world-aware precomputation changes.
Rename environment id to world id #945 — Related export/addition of SAPSortType and SAP codepath updates.
Fix MuJoCo shape index mapping, improve conversion speed #587 — Related model/builder emission changes for per-shape group arrays (Warp array emission).

Suggested reviewers

mmacklin
adenzler-nvidia
eric-heiden

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title "Update broad phase to work properly with world ids" is directly and specifically related to the main changes in the pull request. The raw summary confirms that substantial modifications have been made across the broad phase modules—including a new `precompute_world_map()` function for grouping geometries by world IDs, world-aware dispatch in NXN and SAP kernels, per-world slicing and mapping, and propagation of `shape_world` parameters through the collision pipeline. The title accurately captures the core technical objective and is concise, clear, and specific enough to convey the primary focus to teammates reviewing the history.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

newton/_src/utils/import_usd.py (1)
1092-1098: Critical bug confirmed: first named collision group silently disabled

The code maps named collision groups starting from 0, but collision group 0 means "no collisions". This silently breaks collisions for the first named group.
-                        collision_group_ids[cgroup_name] = len(collision_group_ids)
+                        collision_group_ids[cgroup_name] = len(collision_group_ids) + 1
Keep collision_group = 1 as default (shapes without explicit groups get normal collisions). Apply the +1 offset to prevent named groups from ever mapping to 0.

🧹 Nitpick comments (8)

newton/_src/geometry/broad_phase_common.py (1)

123-211: precompute_world_map: correct semantics; minor ergonomic tweak

Logic correctly filters by COLLIDE_SHAPES and appends shared (negative) indices to each world. Consider returning unique_worlds alongside (index_map, slice_ends) to avoid callers recomputing world IDs when needed.

newton/tests/test_collision_pipeline.py (1)

64-64: Good parametrization across EXPLICIT/NXN/SAP

Broad-phase mode is cleanly threaded through; wrappers/readable test names help. Watch total runtime (200 steps × 9 pairs × 3 modes × devices).

Optionally reduce steps in smoke CI or mark a subset as slow to keep pipeline times reasonable.

Also applies to: 99-100, 283-301, 309-332, 335-367
newton/_src/geometry/broad_phase_nxn.py (3)
57-76: Prefer returning wp.vec2i from Warp func

Tuple returns from @wp.func can be brittle; use a typed vector for clarity and codegen stability.

Apply this diff:
-@wp.func
-def _get_lower_triangular_indices(index: int, matrix_size: int):
+@wp.func
+def _get_lower_triangular_indices(index: int, matrix_size: int) -> wp.vec2i:
     total = (matrix_size * (matrix_size - 1)) >> 1
     if index >= total:
-        # In Warp, we can't throw, so return an invalid pair
-        return -1, -1
+        return wp.vec2i(-1, -1)
@@
-    return r, c
+    return wp.vec2i(r, c)
And update call site:
-    local_geom1, local_geom2 = _get_lower_triangular_indices(local_id, num_geoms_in_world)
+    pair_rc = _get_lower_triangular_indices(local_id, num_geoms_in_world)
+    local_geom1 = pair_rc[0]
+    local_geom2 = pair_rc[1]
Confirm Warp version supports tuple returns; if yes, this change is optional.

79-114: Binary search helper looks correct; remove redundant casts

low = int(0) and high = int(num_worlds - 1) are unnecessary.

Apply:
-    low = int(0)
-    high = int(num_worlds - 1)
+    low = 0
+    high = num_worlds - 1
This also addresses the Ruff hints.

206-264: Precompute and storage are solid; minor ergonomic improvement

Handles Warp/Numpy inputs and device selection well. Optionally store num_worlds for debugging and to short-circuit launches early.
newton/tests/test_broad_phase.py (1)

26-73: Consider importing these helpers from broad_phase_common to avoid duplication.

The test_group_pair and test_world_and_group_pair functions duplicate logic from newton._src.geometry.broad_phase_common. While having test-specific implementations can provide independence, importing from the source module would:

Ensure tests validate the actual production logic

Reduce maintenance burden when collision logic changes

Catch any divergence between test and production implementations

If test independence is desired, consider adding a comment explaining why duplication is intentional.
newton/_src/sim/collide_unified.py (1)
936-949: World data validation and broad phase initialization are correct.

The validation ensuring shape_world is provided for NXN and SAP modes prevents runtime errors. The conditional sap_kwargs construction for SAP's optional sort_type is clean.

The error messages at lines 937 and 944 could be extracted to constants or a custom exception class per the linter suggestion, but this is a minor style consideration.

Optional: Extract error messages to improve maintainability:
class BroadPhaseConfigError(ValueError):
    """Raised when broad phase configuration is invalid."""
    
    MISSING_SHAPE_WORLD = "shape_world must be provided when using {}"

# Then use:
if shape_world is None:
    raise BroadPhaseConfigError(
        BroadPhaseConfigError.MISSING_SHAPE_WORLD.format("BroadPhaseMode.NXN")
    )
Based on static analysis hints.
newton/_src/sim/builder.py (1)
4455-4467: Consider making the conservative estimate configurable.

The conservative estimate of shape_count * 30 * 5 (assuming each shape collides with up to 30 others with 5 contacts per pair) may be overly generous for many scenarios. For a scene with 10,000 shapes, this allocates space for 1.5 million contact points, which could waste significant memory.

Consider one of these approaches:

Make it configurable: Add a parameter to control the max contacts per shape estimate:
 def finalize(
-    self, device: Devicelike | None = None, requires_grad: bool = False, build_shape_contact_pairs: bool = True
+    self, device: Devicelike | None = None, requires_grad: bool = False, build_shape_contact_pairs: bool = True,
+    estimated_contacts_per_shape: int = 30
 ) -> Model:
Use a better heuristic: Base the estimate on actual shape distribution, world count, or shape density rather than a fixed constant.

Document the heuristic: At minimum, add a comment explaining why "30 shapes" was chosen as the estimate.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between acf8c90 and 3099f6e.

📒 Files selected for processing (11)

newton/__init__.py (2 hunks)
newton/_src/geometry/__init__.py (2 hunks)
newton/_src/geometry/broad_phase_common.py (2 hunks)
newton/_src/geometry/broad_phase_nxn.py (7 hunks)
newton/_src/geometry/broad_phase_sap.py (8 hunks)
newton/_src/sim/builder.py (10 hunks)
newton/_src/sim/collide_unified.py (6 hunks)
newton/_src/sim/model.py (1 hunks)
newton/_src/utils/import_usd.py (1 hunks)
newton/tests/test_broad_phase.py (8 hunks)
newton/tests/test_collision_pipeline.py (5 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-08-12T17:51:37.474Z

Learnt from: nvlukasz
PR: newton-physics/newton#519
File: newton/geometry.py:16-22
Timestamp: 2025-08-12T17:51:37.474Z
Learning: In the Newton public API refactor, common geometry symbols like ParticleFlags and ShapeFlags are now exposed at the top level in newton/__init__.py rather than through newton.geometry. The newton/geometry.py module is intentionally focused only on broad-phase collision detection classes (BroadPhaseAllPairs, BroadPhaseExplicit, BroadPhaseSAP).

Applied to files:

newton/__init__.py
newton/_src/geometry/__init__.py
newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/broad_phase_nxn.py

🧬 Code graph analysis (9)

newton/__init__.py (1)

newton/_src/geometry/broad_phase_sap.py (1)

SAPSortType (32-36)

newton/_src/geometry/__init__.py (1)

newton/_src/geometry/broad_phase_sap.py (2)

BroadPhaseSAP (355-598)

SAPSortType (32-36)

newton/tests/test_collision_pipeline.py (3)

newton/_src/sim/collide_unified.py (1)

BroadPhaseMode (43-54)

newton/_src/geometry/types.py (1)

GeoType (25-67)

newton/tests/unittest_utils.py (1)

add_function_test (312-331)

newton/_src/geometry/broad_phase_sap.py (1)

newton/_src/geometry/broad_phase_common.py (5)

binary_search (45-53)

check_aabb_overlap (25-41)

precompute_world_map (123-212)

test_world_and_group_pair (94-120)

write_pair (57-68)

newton/_src/geometry/broad_phase_common.py (2)

newton/_src/sim/builder.py (2)

flags (176-182)

flags (185-190)

newton/_src/geometry/flags.py (1)

ShapeFlags (30-42)

newton/_src/sim/builder.py (3)

newton/_src/sim/collide.py (2)

device (334-341)

count_rigid_contact_points (33-66)

newton/_src/sim/model.py (1)

Model (29-645)

newton/_src/geometry/flags.py (1)

ShapeFlags (30-42)

newton/_src/geometry/broad_phase_nxn.py (2)

newton/_src/geometry/broad_phase_common.py (4)

check_aabb_overlap (25-41)

precompute_world_map (123-212)

test_world_and_group_pair (94-120)

write_pair (57-68)

newton/_src/geometry/broad_phase_sap.py (1)

launch (459-598)

newton/_src/sim/collide_unified.py (2)

newton/_src/geometry/broad_phase_nxn.py (1)

BroadPhaseAllPairs (190-327)

newton/_src/geometry/broad_phase_sap.py (1)

BroadPhaseSAP (355-598)

newton/tests/test_broad_phase.py (3)

newton/_src/geometry/broad_phase_sap.py (3)

SAPSortType (32-36)

BroadPhaseSAP (355-598)

launch (459-598)

newton/_src/geometry/broad_phase_nxn.py (3)

BroadPhaseAllPairs (190-327)

launch (265-327)

launch (344-396)

newton/_src/geometry/broad_phase_common.py (2)

test_group_pair (73-90)

test_world_and_group_pair (94-120)

🪛 Ruff (0.14.1)

newton/_src/geometry/broad_phase_nxn.py

96-96: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

98-98: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

newton/_src/sim/collide_unified.py

879-879: Unused method argument: shape_collision_group

(ARG002)

937-937: Avoid specifying long messages outside the exception class

(TRY003)

944-944: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run GPU Benchmarks (Pull Request)

🔇 Additional comments (33)

newton/_src/geometry/__init__.py (1)

17-17: Public export looks correct

Re-export and all update align with SAP changes. No issues.

Also applies to: 51-51

newton/__init__.py (1)

39-39: Top-level SAPSortType export is good

Keeps public API consistent with geometry package.

Also applies to: 48-48

newton/_src/geometry/broad_phase_common.py (2)

18-22: Imports OK and necessary

NumPy and ShapeFlags usage is appropriate here.

94-121: World + group filtering logic is sound

Rules match doc: globals (-1) collide across worlds; positive/negative group semantics respected; 0 disables collisions. No issues.

Ensure all producers of collision groups avoid 0 except to explicitly disable (see import_usd fix). Based on learnings.

newton/_src/geometry/broad_phase_nxn.py (4)

16-20: Imports and common helpers usage are appropriate

Using precompute_world_map and shared helpers keeps NXN and SAP consistent.

124-157: Kernel world-mapping + dedup are correct

Canonical global index ordering prevents (i,j)/(j,i) duplicates.

Skipping global/global pairs for world_id > 0 avoids multi-world duplication.

World/group filter applied before AABB overlap is efficient.

Edge case: worlds with <2 geoms (0 pairs) produce zero cumsum increments; confirm no launch with num_kernel_threads==0 (guard is present).

Also applies to: 164-184

310-327: Launch uses precomputed thread count correctly

Good device defaulting and zero-thread guard.

380-382: Device defaulting in explicit path

Small but useful fix; aligns with NXN/SAP device handling.

newton/_src/sim/model.py (1)

138-139: None default is safe; no pre-finalize indexing found

Verified that all accesses to shape_collision_group on the model occur after builder.finalize() completes. The attribute transitions from Python list (in builder) to wp.array during finalization, and all collision detection calls happen post-finalize. No indexing occurs before finalization. The None default is appropriate for model initialization.

newton/tests/test_broad_phase.py (6)

193-196: Good update to world-aware API.

The change from implicitly using world -1 (global) to explicitly passing shape_world with world 0 aligns with the PR's objective to properly support world IDs. The initialization of BroadPhaseAllPairs(shape_world) correctly provides world mapping for precomputation.

262-421: Excellent multi-world test coverage.

This test thoroughly validates:

Distribution of geometries across 4 worlds plus global entities

Mixed collision group interactions

World-based collision filtering

Detailed diagnostics on mismatch

The test data setup with random distribution and controlled global entity count provides good coverage of realistic scenarios.

602-606: SAP broad phase properly configured with world-aware data.

The initialization of BroadPhaseSAP(shape_world, sort_type=sort_type) correctly passes both world mapping and sort type, enabling per-world segmented sorting as described in the PR objectives.

674-680: Good test parametrization for SAP sort strategies.

The split into test_sap_broadphase_segmented and test_sap_broadphase_tile ensures both SEGMENTED and TILE sort implementations are validated, which is important given the PR introduces configurable SAP sorting algorithms.

831-1220: Exceptional edge case coverage for NxN broad phase.

This test is exemplary for stress-testing GPU collision code:

124 geometries with carefully crafted scenarios (touching boxes, gaps with cutoffs, chains, nested, global entities, group filtering, world isolation)

Explicit verification for duplicates (line 1180-1182)

Clear diagnostics showing missing/extra pairs with world and group context

Tests boundary conditions like zero-distance contacts and cutoff-based overlaps

The test structure with base cases + clusters + isolated geometries provides excellent coverage.

1221-1636: Comprehensive SAP edge case validation.

Similar to the NxN edge case test but specifically targets SAP algorithm challenges:

171 geometries stress-testing sorting and sweep phases

Critical duplicate prevention test for global entities (lines 1273-1277, 1507-1508)

Reverse order in space to verify sorting correctness (lines 1303-1307)

Overlapping clusters along sweep axis to stress segmented sorting

The focus on duplicate prevention for global entities is particularly important given the per-world segmentation approach.

newton/_src/sim/collide_unified.py (2)

879-882: API properly extended for world-aware collision detection.

The new parameters shape_collision_group, shape_world, shape_flags, and sap_sort_type correctly support the PR's objective to make broad phase world-aware. The parameters are appropriately optional to maintain backward compatibility where possible.

Note: The static analysis warning about shape_collision_group being unused is a false positive—it's passed through to NXN/SAP broad phases and used in the collide() method at line 1129.

Based on static analysis hints.

1059-1063: Model attributes correctly propagated to collision pipeline.

The from_model method properly extracts and forwards shape_collision_group, shape_world, shape_flags, and sap_sort_type to the constructor. Using hasattr checks ensures backward compatibility with models that may not have these attributes yet.
newton/_src/geometry/broad_phase_sap.py (10)
32-36: Clean enum design for SAP sort strategy selection.

The SAPSortType enum provides clear options (SEGMENTED and TILE) with descriptive comments. This makes the API self-documenting and enables performance tuning for different geometry distribution patterns.

58-88: Correct implementation of segment-aware binary search.

The binary_search_segment function properly offsets indices by base_idx for segment-local searching. The logic is correct: it finds the leftmost position where value can be inserted to maintain sorted order within the segment [start, end).

91-134: Elegant tile sort kernel factory using closure.

The _create_tile_sort_kernel function cleverly uses a closure to capture tile_size as a compile-time constant for Warp. This approach:

Avoids runtime tile size lookups

Enables Warp's optimizer to generate efficient tile operations

Creates type-specialized kernels for different world sizes

The kernel correctly loads data into shared memory tiles, sorts in-place, and stores results back.

137-179: Per-world projection kernel properly structured.

The kernel correctly:

Uses 2D indexing (world_id, local_geom_id) and flattens to 1D for storage

Computes world slice boundaries to determine valid geometry count

Pads invalid threads with sentinel values (1e30, -1) for robust sorting

Maps local geometry IDs to actual geometry indices via world_index_map

The padding approach ensures uniform data layout for segmented/tile sorting without requiring variable-length segments.

182-226: Range kernel correctly implements per-world overlap detection.

The kernel properly:

Validates thread is within world's geometry count

Handles padding (sort_idx < 0)

Uses binary_search_segment to find overlap range within the world

Computes range relative to current position in sorted order

The use of binary_search_segment with world-relative indices is correct for per-world processing.

334-338: Critical duplicate prevention for global entities.

Lines 334-338 correctly prevent duplicate pair generation when both geometries are global (world -1):
if world1 == -1 and world2 == -1 and world_id > 0:
    workid += nsweep_in
    continue
Since global entities are replicated in each world's segment, this ensures they're only processed once (in world_id == 0). This is essential for correctness in the per-world segmentation approach.

363-457: Comprehensive world-aware initialization.

The refactored __init__ properly:

Accepts both numpy and warp arrays for flexibility

Calls precompute_world_map to filter non-colliding shapes and organize by world

Calculates num_worlds and max_geoms_per_world from slice_ends

Conditionally creates tile sort kernel when TILE mode selected

Sizes tile_block_dim as next power of 2 ≥ max_geoms_per_world, capped at 512

Allocates 2x space for projection_lower/sort_index for segmented sort scratch memory

Creates segment_indices for segmented sort

The 512 cap on tile_block_dim (line 439) is appropriate for GPU hardware constraints.

527-550: Dual sort strategy implementation is well-structured.

The conditional branching between tile sort (lines 529-541) and segmented sort (lines 542-550) properly:

Uses wp.launch_tiled with dynamic block_dim for tile sort

Falls back to wp.utils.segmented_sort_pairs for segmented sort

Passes correct array sizes (scratch space vs actual elements)

The count parameter in segmented_sort_pairs (line 548) correctly specifies actual elements (not including 2x scratch space), as the first half of the array contains the data to sort.

567-572: Cumulative sum approach enables efficient sweep work distribution.

Using wp.utils.array_scan to compute cumulative sums of overlap ranges (line 568) is an elegant approach for distributing sweep work across threads. Each thread can binary search the cumulative sum to find its assigned geometry pair, enabling load balancing for variable overlap ranges.

The nsweep_in calculation (line 572) using a multiplier provides good parallelism even when some worlds have fewer geometries.

429-441: I need to verify whether slice_ends_np is indeed a numpy array to confirm the type concern.

Let me search for where slice_ends_np is defined in the file:

Conversion is necessary—confirms defensive necessity.

The explicit int() conversion on line 430 is correct and necessary. When iterating over slice_ends_np (a numpy array), each end_idx is a numpy scalar. Subtracting to get num_geoms and using max() with a Python int produces a numpy scalar type (e.g., numpy.int64), not a plain Python int. Numpy arrays return elements as array scalars (a scalar with an associated dtype), and arithmetic operations maintain the numpy type. The conversion to plain Python int is required, not defensive programming, to ensure compatibility with Warp closures.
newton/_src/sim/builder.py (6)

164-165: LGTM! Breaking change is intentional and well-documented.

The collision group default change from -1 to 1 aligns with the new world-aware collision semantics described in the PR objectives. The updated docstring clearly explains that 0 disables collisions and 1 is the default group.

455-457: LGTM! Default world index change is intentional.

The current_world default change from -1 (global) to 0 (first world) is consistent with the PR's world-aware collision detection updates. The comment clearly documents the new semantics.

4110-4125: LGTM! Well-designed API extension.

The build_shape_contact_pairs parameter provides a clean way to skip expensive O(N²) precomputation when using dynamic broad phase modes (NXN or SAP). The default value True maintains backward compatibility, and the docstring clearly explains when to set it to False.

4222-4222: LGTM! Necessary for kernel compatibility.

Converting shape_collision_group from a Python list to a wp.array is required for GPU kernel access, as mentioned in the PR objectives. This change aligns with how other Model arrays are stored.

4486-4528: LGTM! Collision filtering logic is correct.

The helper methods _test_group_pair and _test_world_and_group_pair correctly implement the collision filtering semantics:

Group 0: no collisions

Positive groups: collide within same group or with negative groups

Negative groups: collide with everything except themselves

World filtering: entities in different non-global worlds don't collide

The logic is symmetric and matches the documented kernel behavior.

4529-4578: LGTM! Improved collision pair filtering.

The updated find_shape_contact_pairs method correctly:

Filters to only shapes with the COLLIDE_SHAPES flag set (line 4552-4553)

Uses the same _test_world_and_group_pair logic as the broad phase kernels for consistency

Avoids duplicate pairs with proper loop iteration (i2 starts from i1 + 1)

Maintains canonical ordering for filter lookups

This ensures EXPLICIT mode (precomputed pairs) behaves identically to dynamic broad phase modes (NXN/SAP).

…fault group id

…/nvtw/newton into dev/tw/collision_filtering_fixes

adenzler-nvidia

Approving, thank you!

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

newton/_src/geometry/broad_phase_nxn.py (1)
96-98: Minor: Unnecessary int() casts on integer literals.

Lines 96 and 98 cast integer literals to int, which is redundant. While not harmful, removing them would simplify the code:
-    low = int(0)
+    low = 0
     high = int(num_worlds - 1)
-    world_id = int(0)
+    world_id = 0
However, these explicit casts may serve as type hints for Warp's type inference, so this is optional.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24ebc93 and 8a281de.

📒 Files selected for processing (3)

newton/_src/geometry/broad_phase_nxn.py (7 hunks)
newton/_src/solvers/mujoco/solver_mujoco.py (1 hunks)
newton/tests/test_broad_phase.py (7 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

newton/_src/solvers/mujoco/solver_mujoco.py

🧰 Additional context used

🧠 Learnings (9)

📚 Learning: 2025-08-12T17:51:37.474Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/geometry.py:16-22
Timestamp: 2025-08-12T17:51:37.474Z
Learning: In the Newton public API refactor, common geometry symbols like ParticleFlags and ShapeFlags are now exposed at the top level in newton/__init__.py rather than through newton.geometry. The newton/geometry.py module is intentionally focused only on broad-phase collision detection classes (BroadPhaseAllPairs, BroadPhaseExplicit, BroadPhaseSAP).

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because the BVH query kernels that use them always write the count to query_results[0, tid] = 0 first before populating any data, ensuring no undefined reads occur.

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-08-12T17:58:16.929Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/tests/test_ik.py:25-26
Timestamp: 2025-08-12T17:58:16.929Z
Learning: In the Newton physics engine codebase, it's acceptable for tests to import and use private `_src` internal modules and functions when needed for testing purposes, even though this breaks the public API boundary. This is an intentional project decision for test code.

Applied to files:

newton/tests/test_broad_phase.py

📚 Learning: 2025-10-30T07:28:13.077Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 984
File: newton/_src/geometry/broad_phase_nxn.py:318-336
Timestamp: 2025-10-30T07:28:13.077Z
Learning: In Newton's BroadPhaseAllPairs class (newton/_src/geometry/broad_phase_nxn.py), device management is the caller's responsibility. The library intentionally does not perform automatic device transfers for precomputed arrays in the launch method, even if there's a device mismatch. Users must ensure that the device specified in __init__ matches the device used in launch.

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-08-20T18:02:36.099Z

Learnt from: eric-heiden
Repo: newton-physics/newton PR: 587
File: newton/_src/sim/builder.py:656-661
Timestamp: 2025-08-20T18:02:36.099Z
Learning: In Newton physics, collisions between shapes with body index -1 (world-attached/global shapes) are automatically skipped by the collision system, so no manual collision filter pairs need to be added between global shapes from different builders.

Applied to files:

newton/tests/test_broad_phase.py

📚 Learning: 2025-08-12T05:17:34.423Z

Learnt from: shi-eric
Repo: newton-physics/newton PR: 521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-08-12T18:04:06.577Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/_src/solvers/featherstone/kernels.py:75-75
Timestamp: 2025-08-12T18:04:06.577Z
Learning: The Newton physics framework requires nightly Warp builds, which means compatibility concerns with older stable Warp versions (like missing functions such as wp.spatial_adjoint) are not relevant for this project.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-07-23T14:36:42.182Z

Learnt from: shi-eric
Repo: newton-physics/newton PR: 461
File: asv/benchmarks/envs/example_humanoid.py:40-41
Timestamp: 2025-07-23T14:36:42.182Z
Learning: In Warp benchmarks, explicit wp.init() calls are not needed in most circumstances since the first Warp API call that requires initialization will automatically call wp.init(). Explicit wp.init() in setup() methods is helpful when the ASV benchmark is measuring a Warp API call, as wp.init() has non-trivial overhead that should be excluded from the benchmark timing.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

🧬 Code graph analysis (2)

newton/tests/test_broad_phase.py (4)

newton/_src/geometry/broad_phase_sap.py (3)

SAPSortType (32-36)

BroadPhaseSAP (370-621)

launch (481-621)

newton/_src/geometry/flags.py (1)

ShapeFlags (30-42)

newton/_src/geometry/broad_phase_nxn.py (3)

BroadPhaseAllPairs (192-337)

launch (275-337)

launch (354-406)

newton/_src/geometry/broad_phase_common.py (2)

test_group_pair (73-90)

test_world_and_group_pair (94-120)

newton/_src/geometry/broad_phase_nxn.py (2)

newton/_src/geometry/broad_phase_common.py (4)

check_aabb_overlap (25-41)

precompute_world_map (123-240)

test_world_and_group_pair (94-120)

write_pair (57-68)

newton/_src/geometry/broad_phase_sap.py (1)

launch (481-621)

🪛 Ruff (0.14.2)

newton/_src/geometry/broad_phase_nxn.py

96-96: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

98-98: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run GPU Benchmarks (Pull Request)
GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)

🔇 Additional comments (14)

newton/tests/test_broad_phase.py (8)

22-32: LGTM! Imports and duplication comment are appropriate.

The imports of SAPSortType and ShapeFlags support the new world-aware and flag-filtered testing. The comment explaining the code duplication properly addresses the past review feedback.

104-160: LGTM! World and flag filtering correctly implemented.

The updates to find_overlapping_pairs_np properly implement:

Shape flag filtering via ShapeFlags.COLLIDE_SHAPES checks

World-aware collision testing via test_world_and_group_pair

Backward compatibility with shape_world=None and shape_flags=None

217-220: LGTM! Correct world-aware BroadPhaseAllPairs initialization.

The test correctly initializes BroadPhaseAllPairs with shape_world array (all shapes in world 0), reflecting the new world-aware API.

286-445: LGTM! Comprehensive multi-world test coverage.

This test thoroughly validates world-aware broad phase behavior:

Tests distribution across 4 worlds with global entities (world -1)

Validates world and group filtering logic

Includes detailed mismatch reporting for debugging

Proper verification against numpy reference implementation

446-675: LGTM! Critical flag filtering test with edge case coverage.

This test validates the important bug fix for num_regular_worlds computation after flag filtering:

Lines 501-504: Critical edge case filtering out all world 0 geometries

Lines 572-583: Validates num_regular_worlds is computed from filtered slices

Lines 662-671: Confirms no filtered geometries appear in output pairs

804-934: LGTM! SAP tests properly parametrized by sort type.

The refactoring to _test_sap_broadphase_impl(sort_type) with separate test methods for SEGMENTED and TILE variants correctly exercises both SAP sort implementations. The shape_world initialization (lines 856-860) aligns with the world-aware API.

1248-1637: LGTM! Exceptionally thorough edge case test.

This comprehensive test validates critical NxN broad phase behaviors with 124 geometries across multiple scenarios:

Boundary conditions (touching AABBs)

Cutoff distance effects

Global entities and world-specific filtering

Collision group edge cases

Duplicate pair prevention (lines 1595-1599)

Zero collision group handling

The extensive test coverage significantly increases confidence in the GPU implementation.

1638-2053: LGTM! SAP-specific edge cases thoroughly tested.

These tests effectively stress-test the SAP implementation with 171 geometries:

Sweep axis boundary conditions

Sorting correctness with overlapping clusters along x-axis

Critical duplicate prevention for global entities (lines 1689-1694)

Both SEGMENTED and TILE sort variants tested

Proper validation against numpy reference

newton/_src/geometry/broad_phase_nxn.py (6)

16-19: LGTM! Necessary imports for world-aware precomputation.

The numpy import supports precomputation in __init__, and the imports from broad_phase_common provide essential helpers for world-aware collision detection.

57-76: LGTM! Return type change to tuple is appropriate.

The change from wp.vec2i to tuple[int, int] is more idiomatic for Warp function returns and supports tuple unpacking in kernel code. The logic remains correct.

117-189: LGTM! World-aware kernel logic is correct.

The kernel correctly implements per-world collision detection:

Lines 135-142: Proper world slice boundary computation

Lines 148-157: Correct local-to-global mapping with canonical ordering

Lines 165-169: Critical logic preventing duplicate (-1, -1) pairs by checking is_dedicated_minus_one_segment

Lines 172-173: Proper world and group compatibility checks

208-273: LGTM! Critical bug fix for num_regular_worlds computation.

The constructor correctly implements world-aware precomputation:

Lines 222-237: Proper handling of numpy and warp array inputs

Lines 240: Calls precompute_world_map with optional flag filtering

Lines 243-245: Critical fix - num_regular_worlds now computed from filtered slice_ends_np length, not raw geom_world_np. This ensures correct identification of the dedicated -1 segment after flag filtering.

Lines 247-262: Correct cumulative sum of lower triangular elements per world

Lines 265-273: All necessary precomputed data stored for kernel

275-337: LGTM! Launch method correctly uses precomputed world data.

The launch method properly leverages precomputed world structures:

Line 323: Launches with self.num_kernel_threads (total pairs across all worlds)

Lines 330-333: Passes precomputed world-aware arrays to kernel

Docstring accurately documents world-aware collision behavior

390-391: LGTM! Consistent device handling added.

Device inference added for consistency with other broad phase implementations. The change is minimal and appropriate.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

newton/_src/geometry/broad_phase_nxn.py (1)

165-170: Correct duplicate-prevention for (-1,-1) pairs.

Skipping global-global pairs outside the dedicated -1 segment avoids duplicates while still producing them once in the final segment. This fixes the earlier misclassification when computing num_regular_worlds from filtered slices.

newton/tests/test_broad_phase.py (1)

572-583: Nice verification of num_regular_worlds after flag filtering.

Asserting against worlds present post-filter catches the earlier bug where this was derived from unfiltered data.

🧹 Nitpick comments (4)

newton/_src/solvers/mujoco/solver_mujoco.py (1)
1705-1706: Vectorize collision-group gather for clarity and speed.

Use NumPy fancy indexing instead of a Python loop.

Apply:
-        shape_collision_group_np = model.shape_collision_group.numpy()
-        cgroup = [shape_collision_group_np[i] for i in selected_shapes]
+        shape_collision_group_np = model.shape_collision_group.numpy()
+        cgroup = shape_collision_group_np[selected_shapes]
newton/_src/geometry/broad_phase_nxn.py (3)
57-76: Optional safety: guard against invalid local_id results.

_get_lower_triangular_indices() returns (-1, -1) for out-of-range index. It shouldn’t occur given dim=self.num_kernel_threads, but adding a fast check after the call prevents accidental OOB if assumptions change.

Example:
-    local_geom1, local_geom2 = _get_lower_triangular_indices(local_id, num_geoms_in_world)
+    local_geom1, local_geom2 = _get_lower_triangular_indices(local_id, num_geoms_in_world)
+    if local_geom1 < 0:
+        return
208-274: Doc: call out device responsibility explicitly.

Constructor builds precomputed arrays on the provided/inferred device; launch assumes the same device (by design). Add a short note in the class docstring/Args(device) to prevent accidental CPU/GPU mismatches.

Proposed addition in the docstring:

“Note: Callers must pass the same device here as used in launch; arrays are not auto-migrated.” (Based on learnings)

320-334: geom_count is unused; consider deprecating or clarifying.

Launch ignores geom_count in the world-aware path. Either mark it unused in the docstring or plan deprecation to avoid confusion.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24ebc93 and 8a281de.

📒 Files selected for processing (3)

newton/_src/geometry/broad_phase_nxn.py (7 hunks)
newton/_src/solvers/mujoco/solver_mujoco.py (1 hunks)
newton/tests/test_broad_phase.py (7 hunks)

🧰 Additional context used

🧠 Learnings (11)

📓 Common learnings

Learnt from: eric-heiden
Repo: newton-physics/newton PR: 587
File: newton/_src/sim/builder.py:656-661
Timestamp: 2025-08-20T18:02:36.099Z
Learning: In Newton physics, collisions between shapes with body index -1 (world-attached/global shapes) are automatically skipped by the collision system, so no manual collision filter pairs need to be added between global shapes from different builders.

📚 Learning: 2025-08-12T17:51:37.474Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/geometry.py:16-22
Timestamp: 2025-08-12T17:51:37.474Z
Learning: In the Newton public API refactor, common geometry symbols like ParticleFlags and ShapeFlags are now exposed at the top level in newton/__init__.py rather than through newton.geometry. The newton/geometry.py module is intentionally focused only on broad-phase collision detection classes (BroadPhaseAllPairs, BroadPhaseExplicit, BroadPhaseSAP).

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-10-30T07:28:13.077Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 984
File: newton/_src/geometry/broad_phase_nxn.py:318-336
Timestamp: 2025-10-30T07:28:13.077Z
Learning: In Newton's BroadPhaseAllPairs class (newton/_src/geometry/broad_phase_nxn.py), device management is the caller's responsibility. The library intentionally does not perform automatic device transfers for precomputed arrays in the launch method, even if there's a device mismatch. Users must ensure that the device specified in __init__ matches the device used in launch.

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because the BVH query kernels that use them always write the count to query_results[0, tid] = 0 first before populating any data, ensuring no undefined reads occur.

Applied to files:

newton/tests/test_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-08-12T17:58:16.929Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/tests/test_ik.py:25-26
Timestamp: 2025-08-12T17:58:16.929Z
Learning: In the Newton physics engine codebase, it's acceptable for tests to import and use private `_src` internal modules and functions when needed for testing purposes, even though this breaks the public API boundary. This is an intentional project decision for test code.

Applied to files:

newton/tests/test_broad_phase.py

📚 Learning: 2025-08-20T18:02:36.099Z

Learnt from: eric-heiden
Repo: newton-physics/newton PR: 587
File: newton/_src/sim/builder.py:656-661
Timestamp: 2025-08-20T18:02:36.099Z
Learning: In Newton physics, collisions between shapes with body index -1 (world-attached/global shapes) are automatically skipped by the collision system, so no manual collision filter pairs need to be added between global shapes from different builders.

Applied to files:

newton/tests/test_broad_phase.py

📚 Learning: 2025-08-25T20:10:59.536Z

Learnt from: dylanturpin
Repo: newton-physics/newton PR: 634
File: newton/examples/ik/example_ik_franka.py:0-0
Timestamp: 2025-08-25T20:10:59.536Z
Learning: In Newton's IK examples, when creating solver arrays with `wp.array(source_array, shape=new_shape)`, this creates a view into the same underlying memory rather than a copy. This means updates to the reshaped array are automatically reflected in the original array without needing explicit copy operations like `wp.copy()`.

Applied to files:

newton/_src/solvers/mujoco/solver_mujoco.py

📚 Learning: 2025-10-24T07:56:14.792Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 981
File: newton/_src/geometry/collision_convex.py:180-289
Timestamp: 2025-10-24T07:56:14.792Z
Learning: In newton/_src/geometry/collision_convex.py, the single-contact solver (create_solve_convex_single_contact) intentionally returns count=1 even when shapes are separated (signed_distance > contact_threshold). The calling code is responsible for deciding whether to accept the contact based on the signed distance value. This design pushes filtering responsibility to the caller.

Applied to files:

newton/_src/solvers/mujoco/solver_mujoco.py

📚 Learning: 2025-08-12T05:17:34.423Z

Learnt from: shi-eric
Repo: newton-physics/newton PR: 521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-07-23T14:36:42.182Z

Learnt from: shi-eric
Repo: newton-physics/newton PR: 461
File: asv/benchmarks/envs/example_humanoid.py:40-41
Timestamp: 2025-07-23T14:36:42.182Z
Learning: In Warp benchmarks, explicit wp.init() calls are not needed in most circumstances since the first Warp API call that requires initialization will automatically call wp.init(). Explicit wp.init() in setup() methods is helpful when the ASV benchmark is measuring a Warp API call, as wp.init() has non-trivial overhead that should be excluded from the benchmark timing.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

🧬 Code graph analysis (2)

newton/tests/test_broad_phase.py (3)

newton/_src/geometry/broad_phase_sap.py (3)

SAPSortType (32-36)

BroadPhaseSAP (370-621)

launch (481-621)

newton/_src/geometry/flags.py (1)

ShapeFlags (30-42)

newton/_src/geometry/broad_phase_nxn.py (3)

BroadPhaseAllPairs (192-337)

launch (275-337)

launch (354-406)

newton/_src/geometry/broad_phase_nxn.py (2)

newton/_src/geometry/broad_phase_common.py (4)

check_aabb_overlap (25-41)

precompute_world_map (123-240)

test_world_and_group_pair (94-120)

write_pair (57-68)

newton/_src/geometry/broad_phase_sap.py (1)

launch (481-621)

🪛 Ruff (0.14.2)

newton/_src/geometry/broad_phase_nxn.py

96-96: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

98-98: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run GPU Benchmarks (Pull Request)

🔇 Additional comments (2)

newton/_src/geometry/broad_phase_nxn.py (1)

390-392: Good default device selection.

Falling back to geom_lower.device in BroadPhaseExplicit avoids accidental host launches.

newton/tests/test_broad_phase.py (1)

26-33: Duplication justified.

Keeping host-side copies of the helper functions to avoid @wp.func call overhead in tests is reasonable. The explanatory comment is sufficient.

coderabbitai · 2025-10-31T13:36:27Z

+@wp.func
+def _find_world_and_local_id(
+    tid: int,
+    world_cumsum_lower_tri: wp.array(dtype=int, ndim=1),
+):
+    """Binary search to find world ID and local ID from thread ID.
+
+    Args:
+        tid: Global thread ID
+        world_cumsum_lower_tri: Cumulative sum of lower triangular elements per world
+
+    Returns:
+        tuple: (world_id, local_id) - World ID and local index within that world
+    """
+    num_worlds = world_cumsum_lower_tri.shape[0]
+
+    # Find world_id using binary search
+    # Declare as dynamic variables for loop mutation
+    low = int(0)
+    high = int(num_worlds - 1)
+    world_id = int(0)
+
+    while low <= high:
+        mid = (low + high) >> 1
+        if tid < world_cumsum_lower_tri[mid]:
+            high = mid - 1
+            world_id = mid
+        else:
+            low = mid + 1
+
+    # Calculate local index within this world
+    local_id = tid
+    if world_id > 0:
+        local_id = tid - world_cumsum_lower_tri[world_id - 1]
+
+    return world_id, local_id



⚠️ Potential issue | 🟡 Minor

Remove unnecessary int() casts; simplify binary search locals.

low/high/world_id are already ints; the extra casts add noise.

Apply:

- low = int(0) - high = int(num_worlds - 1) - world_id = int(0) + low = 0 + high = num_worlds - 1 + world_id = 0

Also consider a brief comment that this finds the first index with cumsum > tid for future readers. (Static analysis: RUF046)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@wp.func

def _find_world_and_local_id(

tid: int,

world_cumsum_lower_tri: wp.array(dtype=int, ndim=1),

):

"""Binary search to find world ID and local ID from thread ID.

Args:

tid: Global thread ID

world_cumsum_lower_tri: Cumulative sum of lower triangular elements per world

Returns:

tuple: (world_id, local_id) - World ID and local index within that world

"""

num_worlds = world_cumsum_lower_tri.shape[0]

# Find world_id using binary search

# Declare as dynamic variables for loop mutation

low = int(0)

high = int(num_worlds - 1)

world_id = int(0)

while low <= high:

mid = (low + high) >> 1

if tid < world_cumsum_lower_tri[mid]:

high = mid - 1

world_id = mid

else:

low = mid + 1

# Calculate local index within this world

local_id = tid

if world_id > 0:

local_id = tid - world_cumsum_lower_tri[world_id - 1]

return world_id, local_id

@wp.func

def _find_world_and_local_id(

tid: int,

world_cumsum_lower_tri: wp.array(dtype=int, ndim=1),

):

"""Binary search to find world ID and local ID from thread ID.

Args:

tid: Global thread ID

world_cumsum_lower_tri: Cumulative sum of lower triangular elements per world

Returns:

tuple: (world_id, local_id) - World ID and local index within that world

"""

num_worlds = world_cumsum_lower_tri.shape[0]

# Find world_id using binary search

# Declare as dynamic variables for loop mutation

low = 0

high = num_worlds - 1

world_id = 0

while low <= high:

mid = (low + high) >> 1

if tid < world_cumsum_lower_tri[mid]:

high = mid - 1

world_id = mid

else:

low = mid + 1

# Calculate local index within this world

local_id = tid

if world_id > 0:

local_id = tid - world_cumsum_lower_tri[world_id - 1]

return world_id, local_id

🧰 Tools

🪛 Ruff (0.14.2)

96-96: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

98-98: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

🤖 Prompt for AI Agents

In newton/_src/geometry/broad_phase_nxn.py around lines 78 to 114, remove the redundant int() casts on low, high, and world_id (they are already integers) and simplify their initialization to plain integer assignments; keep the binary-search logic but add a concise comment above it stating "Find first index where world_cumsum_lower_tri > tid" to clarify intent for readers. Ensure behavior is unchanged: initialize low = 0, high = num_worlds - 1, world_id = 0, run the same while loop and compute local_id as before using world_cumsum_lower_tri[world_id - 1] when world_id > 0.

coderabbitai · 2025-10-31T13:36:27Z

        # Create shape world array with all shapes in global world (-1)
-        shape_world = wp.array(np.full(ngeom, -1, dtype=np.int32), dtype=wp.int32)
+        shape_world = wp.array(np.full(ngeom, 0, dtype=np.int32), dtype=wp.int32)
+
+        # Initialize BroadPhaseAllPairs with shape_world (which represents world grouping)
+        nxn_broadphase = BroadPhaseAllPairs(shape_world)



⚠️ Potential issue | 🟡 Minor

Fix comment to match the data.

The comment says “global world (-1)” but the array is filled with 0. Adjust the comment (or use -1) to avoid confusion.

Apply either:

Change comment to “all shapes in world 0”, or

Initialize with -1 if that was intended.

🤖 Prompt for AI Agents

In newton/tests/test_broad_phase.py around lines 216 to 221, the comment incorrectly states “global world (-1)” while the shape_world array is filled with 0; update the comment to reflect that all shapes are in world 0 (e.g., "Create shape world array with all shapes in world 0") or, if the intent was to use -1, change the array initialization to use -1 instead of 0 so the comment and data match.

# Description Currently, the actions from the policy are directly applied to the environment and also often fed back to the policy using the last action as observation. Doing this can lead to instability during training since applying a large action can introduce a negative feedback loop. More specifically, applying a very large action leads to a large last_action observations, which often results in a large error in the critic, which can lead to even larger actions being sampled in the future. This PR aims to fix this for RSL-RL library, by clipping the actions to (large) hard limits before applying them to the environment. This prohibits the actions from growing continuously and greatly improves training stability. Fixes newton-physics#984, newton-physics#1732, newton-physics#1999 ## Type of change - Bug fix (non-breaking change which fixes an issue) - New feature (non-breaking change which adds functionality) ## Checklist - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [x] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

nvtw added 14 commits October 24, 2025 11:04

Improve handling of worlds in the nxn broad phase

41ecf4f

Extended the nxn broad phase unit test

7f3679b

Backup commit

52fd4af

Fix issues in improved SAP broad phase.

3b539bc

More testing

a66ed34

Add duplicate reporting prevention to nxn broad phase

f96994f

Add tests for the tile sort in the broad phase

7dafa5f

Make precomputation of shape_contact_pairs optional

1cfcc87

Sap and nxn broad phase start to respect ShapeFlags (the can exclude …

b0e4ed8

…shapes from collision)

Fix inconsistency in precompute_world_map

51da37e

Make find_shape_contact_pairs fully consistent with test_world_and_gr…

1241cd1

…oup_pair

Fix more world and group default values

a834d7f

Ran ruff

76ac18f

Extend the unified collision pipeline unit tests

3099f6e

nvtw self-assigned this Oct 24, 2025

nvtw marked this pull request as draft October 24, 2025 13:34

nvtw changed the title ~~Update broad phase to work properly with world ids~~ Draft: Update broad phase to work properly with world ids Oct 24, 2025

coderabbitai Bot reviewed Oct 24, 2025

View reviewed changes

nvtw added 11 commits October 27, 2025 07:41

More collision group assignment fixes

59d5465

Fix issues with collision filtering

4c6f891

Fix unit tests that are affected by changing the default world and de…

c6cc41e

…fault group id

Fix topological_sort

8a87092

Attempt to fix more tests

8e449f0

Merge branch 'dev/tw/collision_filtering_fixes' of https://github.com…

53bab6c

…/nvtw/newton into dev/tw/collision_filtering_fixes

Fix more tests

3dc1ca8

Increase the allegro hand test tolerance

a99ad04

Attempt to recover a performance optimization

d9fea72

Improve the handling of the -1 world id in nxn and sap broad phase

a911c1a

Undo changing the default world id

b817610

nvtw and others added 2 commits October 31, 2025 14:02

Implement MR comments

e5f12ab

Merge branch 'main' into dev/tw/update_broad_phase

8a281de

nvtw requested a review from adenzler-nvidia October 31, 2025 13:05

adenzler-nvidia approved these changes Oct 31, 2025

View reviewed changes

nvtw enabled auto-merge (squash) October 31, 2025 13:15

coderabbitai Bot reviewed Oct 31, 2025

View reviewed changes

nvtw merged commit e3df401 into newton-physics:main Oct 31, 2025
17 of 18 checks passed

coderabbitai Bot reviewed Oct 31, 2025

View reviewed changes

This was referenced Oct 31, 2025

Introduce the NarrowPhase class #1014

Merged

Add the contact matcher class that allows solver warm starting for contacts #1041

Merged

coderabbitai Bot mentioned this pull request Nov 19, 2025

Add per shape contact margin #1101

Merged

4 tasks

This was referenced Nov 27, 2025

Cable VBD with Rigid Body Sim #1114

Merged

Adds PhysX style SDF collisions #1150

Merged

Per-shape torsional and rolling friction #1126

Merged

Update ModelBuilder API for creating worlds #1079 #1147

Merged

coderabbitai Bot mentioned this pull request Dec 19, 2025

SDF Hydroelastic Collisions #1248

Merged

4 tasks

coderabbitai Bot mentioned this pull request Jan 5, 2026

Address Warp Deprecation Warnings #1305

Merged

coderabbitai Bot mentioned this pull request Jan 14, 2026

Unified collision pipeline performance improvements #1355

Closed

4 tasks

eric-heiden pushed a commit to eric-heiden/newton that referenced this pull request Jan 28, 2026

Update broad phase to work properly with world ids (newton-physics#984)

3f41059

This was referenced Feb 9, 2026

Heightfield support #1189 #1547

Merged

Make NxN and SAP broad phase respect filtered pairs #1554

Merged

coderabbitai Bot mentioned this pull request Feb 19, 2026

Respect MJCF contype=conaffinity=0 via collision_group=0 #1703

Merged

5 tasks

coderabbitai Bot mentioned this pull request Mar 25, 2026

Introduce per-shape color information in Model(Builder) #2219

Merged

mmacklin pushed a commit to mmacklin/newton that referenced this pull request Apr 7, 2026

Update broad phase to work properly with world ids (newton-physics#984)

8b30ff0

coderabbitai Bot mentioned this pull request Jun 14, 2026

Fix MJCF collision mask filtering #3104

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update broad phase to work properly with world ids#984

Update broad phase to work properly with world ids#984
nvtw merged 38 commits into
newton-physics:mainfrom
nvtw:dev/tw/update_broad_phase

nvtw commented Oct 24, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Oct 24, 2025 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Uh oh!

adenzler-nvidia left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Oct 31, 2025

Uh oh!

coderabbitai Bot Oct 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

nvtw commented Oct 24, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Core Changes

Enhancements

Newton Migration Guide

Before your PR is "Ready for review"

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

adenzler-nvidia left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 31, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 31, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nvtw commented Oct 24, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 24, 2025 •

edited

Loading