Port WrenchComposer dual-buffer fix to develop branch#5265
Conversation
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR replaces the single-buffer WrenchComposer with a dual-buffer architecture that separately stores global (world-frame) and local (body-frame) forces, deferring the world→body rotation until compose_to_body_frame() is called with the current link pose. This fixes the core bug where pre-composed body-frame forces became stale when the body rotated. The implementation is sound, well-tested, and correctly ported across all six PhysX/Newton asset classes.
Design Assessment
Design is sound. The dual-buffer approach (5 input buffers + 2 output buffers) cleanly separates force accumulation from frame transformation. Storing cross(P, F) in world frame and correcting at composition time via cross(P, F) − cross(link_pos, F) is mathematically elegant and ensures torques remain correct as bodies translate. The add_raw_buffers_from method for combining instantaneous + permanent wrenches at the input-buffer level (before composition) is the right fix — the old code added already-composed body-frame forces into the instantaneous composer, which double-rotated global contributions.
Findings
🟡 Warning: Misleading docstring for positions parameter — wrench_composer.py:207, 265, 330, 389
The docstring for all four public methods (add_forces_and_torques_index, set_forces_and_torques_index, add_forces_and_torques_mask, set_forces_and_torques_mask) describes positions as:
Position offsets from the link origin, expressed in the link frame.
This is only correct when is_global=False. When is_global=True, positions are absolute world-frame coordinates (not offsets, not link-frame). The kernel computes cross(position, force) directly — it expects the full world-frame position. The integration tests confirm this (e.g., test_global_force_at_offset_generates_torque passes com_pos + offset as an absolute world position).
Suggested fix: update the docstring to document both cases:
positions: When is_global=False, offsets from the link origin in the body frame.
When is_global=True, absolute positions in the world frame.
🟡 Warning: Partial reset doesn't clear _active/_has_* flags — wrench_composer.py:512-600
When reset(env_ids=...) is called for a subset of environments, the _active, _has_global, and _has_local flags are left untouched with the comment "other envs may still have forces." This is correct for the general case, but if all environments are individually reset (e.g., all envs terminated at once with explicit env_ids), the composer remains "active" with stale flags, which will cause unnecessary compose_to_body_frame calls and prevent the write_data_to_sim early-exit optimization. This is a minor efficiency concern, not a correctness issue.
🟡 Warning: set_forces_and_torques_index unconditionally sets _has_local = not is_global — wrench_composer.py:278-279
When calling set_forces_and_torques_index with is_global=True, the method sets self._has_local = False. The set_* methods first clear all 5 input buffers, so the local buffers are indeed empty — the flag is technically accurate. However, this means a sequence like add_*(local) → set_*(global) will correctly discard the local flag since set_* replaces everything. The logic is correct but could use an inline comment explaining this interaction for future maintainers.
🔵 Suggestion: Partial reset launches 4 kernel pairs — consider a dedicated 7-buffer reset kernel — wrench_composer.py:530-600
The partial reset (env_ids or env_mask) reuses the existing 2-buffer reset_wrench_composer_index/reset_wrench_composer_mask kernels by calling them 4 times (with the last call zeroing out_torque_b twice). A single 7-buffer reset kernel would reduce kernel launch overhead from 4 launches to 1. For small num_envs, the overhead of 4 Warp kernel launches may be noticeable.
🔵 Suggestion: Mock compose_to_body_frame ignores global buffers — mock_wrench_composer.py:175-195
The mock's compose_to_body_frame() only copies local buffers to output, ignoring global buffers entirely. While the identity-transform assumption is documented, any test that mixes global and local forces through the mock will silently lose the global contributions. Since identity rotation means quat_rotate_inv(identity, v) = v, the mock could add global buffers to the output for more accurate behavior:
# More accurate mock for identity transforms:
out_force_torch.add_(wp.to_torch(self._global_force_w))
out_force_torch.add_(wp.to_torch(self._global_force_at_com_w))
out_torque_torch.add_(wp.to_torch(self._global_torque_w))Test Coverage
This PR includes excellent test coverage:
- Bug fix regression: Yes —
test_global_force_invariant_under_rotationdirectly tests the core bug (force invariance under body rotation) - New test files: Two comprehensive test suites added:
test_wrench_composer_integration.py(819 lines) — 9 integration tests covering rotation invariance, translation-dependent torque, multi-cube scenarios, and far-from-origin precisiontest_wrench_composer_vs_physx.py(748 lines) — 8 comparison tests validating composer output against raw PhysX API
- Existing tests updated: All existing unit tests properly migrated to use new API (
add_forces_and_torques_index+ explicitcompose_to_body_frame()) - Edge cases covered: Large world coordinates, multi-env indexing, mid-simulation reset, payload scenario with gravity
CI Status
Only the labeler check has run (pass). No code/test CI results available — this may be expected for external PRs or the CI may need manual triggering.
Verdict
Minor fixes needed
The dual-buffer architecture is well-designed and the core physics math is correct. The bug fix (deferring world→body rotation until apply time) is the right approach. The port across all 6 asset classes (3 PhysX + 3 Newton) is consistent. The test suite is thorough. The main actionable item is the misleading positions parameter docstring, which could confuse users applying global forces at specific world coordinates. The other findings are minor efficiency and maintainability improvements.
Greptile SummaryThis PR replaces the single-buffer
Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[User: add/set forces] -->|is_global=True| B[global_force_w\nglobal_torque_w\nglobal_force_at_com_w]
A -->|is_global=False| C[local_force_b\nlocal_torque_b]
B --> D[add_raw_buffers_from\npermanent → instantaneous]
C --> D
D --> E[compose_to_body_frame\nkernel]
E -->|quat_rotate_inv + correction| F[out_force_b\nout_torque_b]
F --> G[apply_forces_and_torques_at_position\nis_global=False]
subgraph reset["reset()"]
H{env_ids=None\nenv_mask=None?}
H -->|Yes| I[Full reset\n_active=False\nall 7 buffers zeroed]
H -->|No, env_mask| J[Partial reset\nvia mask kernel]
H -->|No, env_ids| K[Partial reset\nvia index kernel]
end
note1["⚠️ RigidObject/Collection\ndrops env_mask before\nreaching reset()"]
J -.->|missed in RigidObject| note1
|
mmichelis
left a comment
There was a problem hiding this comment.
Nice fix of permanent wrenches not using up-to-date link poses.
|
Replaces the single-buffer WrenchComposer with a dual-buffer architecture that stores global (world-frame) and local (body-frame) forces separately. Follows develop's warp-first paradigm: all internal buffers are warp arrays, inputs accept both torch.Tensor and wp.array (warp ingests both natively), outputs are wp.array. Updates PhysX and Newton asset write_data_to_sim to use add_raw_buffers_from + compose_to_body_frame instead of the old composed_force/composed_torque merge pattern. Addresses PR review feedback: - Removed unused _has_global/_has_local/global_only flags - Consolidated 4 partial-reset kernel launches into 1 (7-buffer kernel) - Fixed link origin vs CoM terminology in docstrings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
665f183 to
d1167a0
Compare
|
The wrench composer now has the following behavior compared to PhysX:
|
Critical fixes: - Add missing else/raise for _get_link_quat_fn in WrenchComposer constructor - Remove .any() checks in deprecated set_external_force_and_torque (crashes on wp.array, triggers GPU sync on torch) - Apply reset+add pattern to base_rigid_object_collection.py (was missed) Changelog fixes: - Move set_external_force_and_torque fix entry from isaaclab_physx to isaaclab changelog (where the base classes live) - Add isaaclab_newton changelog 0.5.14 + version bump for write_data_to_sim changes Mock fix: - MockWrenchComposer.compose_to_body_frame now sums all 5 input buffers (previously dropped global contributions silently) Documentation fixes: - Fix property shape docs from (N,B,3) to (N,B) with dtype wp.vec3f - Add docstrings to all 8 warp kernels (dispatch dims, routing logic) - Document cross-product contributions in global_torque_w property - Fix "will be deprecated" to "is deprecated" in wrench-related warnings - Fix missing space in compose_to_body_frame docstring New tests: - add_raw_buffers_from: merge correctness + inactive no-op - Mask-based APIs: mask vs index equivalence (local + global) - set_forces_and_torques_index: overwrite + global buffer clearing - Partial reset: specified envs zeroed, others retained - Deprecated API: composed_force/torque warnings + add_forces_and_torques
Verify that calling compose_to_body_frame twice without intervening writes produces bitwise-identical output, guarding against accidental accumulation into the output buffers.
…r-dual-buffer-develop # Conflicts: # source/isaaclab/config/extension.toml # source/isaaclab/docs/CHANGELOG.rst # source/isaaclab_newton/config/extension.toml # source/isaaclab_newton/docs/CHANGELOG.rst # source/isaaclab_physx/config/extension.toml # source/isaaclab_physx/docs/CHANGELOG.rst
|
Hey @AntoineRichard, I just checked the latest changes and it seems like the docstring updates changed things to the wrong version again. I put some time into formulating those to be as precise and well-understandable as possible, so I would consider keeping them. I am mainly referring to: becoming in 29dc9f8. |
Thanks Clemens, I'll revert it. Mike is going to take another look, and then we'll merge. |
Restore the original phrasing ("global positions expressed in the world
frame" / "offsets from the body's CoM expressed in the body frame")
across all four add/set methods while keeping the [N], [N·m], [m] unit
annotations.
|
Cool thanks :) |
There was a problem hiding this comment.
Great testing against physx!
mmichelis
left a comment
There was a problem hiding this comment.
Looks great to me, thanks for checking and updating the documentation for the is_global/position arguments!
Bump PR changelog entries to new versions (isaaclab 4.6.8, isaaclab_newton 0.5.18, isaaclab_physx 0.5.20) to sit above develop's 4.6.7/0.5.17/0.5.19.
Reduce external force magnitude from 500N to 50N in Newton rigid object tests. The original 1000N (set+add) on a ~65g DexCube caused numerical instability in Newton (which lacks PhysX's acceleration limiters), producing chaotic angular velocities. Remove skip marker and stale composed_force/composed_torque buffer assertions from test_rigid_object — those tested old single-buffer internals that no longer apply with the dual-buffer architecture. Add long-run PhysX comparison test validating that the composer matches raw PhysX for permanent global forces at world-frame positions over 100 steps with significant body motion.
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR implements a well-designed dual-buffer architecture for the WrenchComposer, separating global (world-frame) and local (body-frame) forces into 5 input buffers plus 2 output buffers. The core fix is correct: by deferring the world-to-body rotation until compose_to_body_frame() is called with the current body pose, the bug where pre-composed body-frame forces became stale when bodies rotated is now fixed. The implementation is clean, the math is sound, and the test coverage is extensive.
Architecture Assessment
Design is sound. The approach is mathematically elegant:
- Store global forces with their world-frame torque contributions (
cross(P, F)) - At composition time, correct for the body's CoM via
cross(P, F) - cross(com_pos_w, F) - Rotate into body frame and sum with local contributions
The add_raw_buffers_from() method correctly merges instantaneous + permanent wrenches at the input-buffer level before composition, fixing the old double-rotation bug where already-composed body-frame forces were re-rotated.
Cross-Module Impact
The changes propagate correctly across all 6 asset classes (3 PhysX + 3 Newton). The write_data_to_sim pattern is consistent: check composer activity, merge buffers if needed, compose to body frame, apply to sim, reset instantaneous composer.
Test Coverage
Excellent test coverage:
test_wrench_composer.py— Updated unit tests (1071 lines) properly use new API with explicitcompose_to_body_frame()callstest_wrench_composer_integration.py— 817 lines covering rotation invariance, translation-dependent torque, multi-cube scenariostest_wrench_composer_vs_physx.py— 837 lines validating composer output against raw PhysX API- The core bug fix has proper regression tests (
test_global_force_invariant_under_rotation)
CI Status
- ✅ pre-commit, license-check, Build Wheel, Build Docs — all passing
- ❌ test-skillgen — failed (exit code 1, no detailed error message in annotations)
- ⏳ isaaclab (core), isaaclab_physx, isaaclab_newton — pending
The test-skillgen failure may be unrelated to this PR (need to check if it passes on develop at the same commit) but should be investigated before merge.
Branch Status
PR branch is only 1 commit behind develop with MERGEABLE status — no concerns here.
Findings
🔵 Improvement: Consider adding body_com_pos_w docstring clarification — wrench_composer.py:59-63
The _get_com_pos_fn pulls from body_com_pos_w and _get_link_quat_fn pulls from body_link_quat_w. These are correct for the composition math, but worth noting in a docstring that the CoM position (not link origin) is used for the torque correction. The kernel comment correctly explains this but it would help to mention it at the property level.
🔵 Suggestion: Mock compose_to_body_frame could be more accurate — mock_wrench_composer.py:175-195
The mock's compose_to_body_frame() now correctly sums all 5 buffers (fixed from the previous version that only copied local buffers). This is correct for identity transforms — good improvement from the previous review.
Verdict
Ship it ✅
The dual-buffer architecture is well-designed and correctly fixes the bug where global forces became stale under body rotation. The implementation is clean, the math is sound, and the test coverage is comprehensive. The only outstanding item is investigating the test-skillgen CI failure to confirm it's not related to this PR.
This PR has already been approved by @mmichelis. The previous bot review findings have been addressed (docstring fixes, missing else/raise, mock improvements). Ready to merge pending CI resolution.
…r-dual-buffer-develop # Conflicts: # source/isaaclab/config/extension.toml # source/isaaclab/docs/CHANGELOG.rst # source/isaaclab_newton/docs/CHANGELOG.rst
Resolve conflicts in WrenchComposer and its warp kernels by taking origin/develop's dual-buffer architecture (from PR isaac-sim#5265), which supersedes this branch's earlier "fix WrenchComposer stale COM pose" approach. The capture-safety pieces (new warp/utils.py, @capture_unsafe decorators in articulation/rigid_object data) from commit 50f5417 are preserved — they are orthogonal to the WrenchComposer redesign.
# Description Replaces the single-buffer WrenchComposer with a dual-buffer architecture that stores global (world-frame) and local (body-frame) forces separately. Follows develop's warp-first paradigm: all internal buffers are warp arrays, inputs accept both torch.Tensor and wp.array (warp ingests both natively), outputs are wp.array. Updates PhysX and Newton asset write_data_to_sim to use add_raw_buffers_from + compose_to_body_frame instead of the old composed_force/composed_torque merge pattern. ## Type of change - Bug fix (non-breaking change which fixes an issue) - ## Checklist - [ ] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html) - [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [ ] I have added my name to the `CONTRIBUTORS.md` or my name already exists there --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: ClemensSchwarke <clemens.schwarke@gmail.com>
# Description Fast-path early-return in :func:`~isaaclab.envs.mdp.events.apply_external_force_torque` when both `force_range` and `torque_range` are exactly zero — a common configuration for tasks that declare the event term but apply no disturbance. Before this change, zero-wrench configurations were still sampled, written into the dual-buffer `WrenchComposer` introduced in #5265, and pushed through the per-step compose-and-apply path in `Articulation.write_data_to_sim`, paying the full per-step cost for what is semantically a no-op. Applying a zero wrench is equivalent to not applying one at all, so the function now returns immediately when both ranges are zero. This restores the H1, G1, and Anymal-C `Velocity-Rough` throughput observed prior to #5265, as recorded in the OmniPerf DB regression flagged in `isaac-sim/IsaacLab-Internal#906`. **Scope limitation.** This only addresses the zero-force case. Tasks that apply non-zero external forces (curriculum disturbances, push events, domain-randomized wrenches) still pay the per-step body-frame recompose cost under the new dual-buffer architecture. That broader optimization (compose caching / kernel fusion) is tracked separately in `isaac-sim/IsaacLab-Internal#911` and is out of scope here. **Correctness.** The dual-buffer `WrenchComposer` architecture from #5265 is untouched; this fix sits one layer above it in the event term. For any non-zero `force_range` or `torque_range`, the early-return predicate is false and behavior is unchanged. Fixes `isaac-sim/IsaacLab-Internal#906` Follow-up: `isaac-sim/IsaacLab-Internal#911` ## Type of change - Bug fix (non-breaking change which fixes an issue) ## Screenshots N/A — performance fix, no user-visible behavior change. ## Checklist - [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html) - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [x] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] I have added a changelog fragment under `source/<pkg>/changelog.d/` for every touched package (do **not** edit `CHANGELOG.rst` or bump `extension.toml` — CI handles that) - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there --------- Co-authored-by: Kelly Guo <kellyg@nvidia.com>
Description
Replaces the single-buffer WrenchComposer with a dual-buffer architecture that stores global (world-frame) and local (body-frame) forces separately.
Follows develop's warp-first paradigm: all internal buffers are warp arrays, inputs accept both torch.Tensor and wp.array (warp ingests both natively), outputs are wp.array.
Updates PhysX and Newton asset write_data_to_sim to use add_raw_buffers_from
Type of change
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there