Draft: Migrate diff sim examples from Warp by mzamoramora-nvidia · Pull Request #560 · newton-physics/newton

mzamoramora-nvidia · 2025-08-15T13:36:41Z

Description

Porting examples from warp/optim to address #558 .

This PR is replaced by #616

Newton Migration Guide

Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this MR.

The migration guide in docs/migration.rst is up-to date

Before your PR is "Ready for review"

All commits are signed-off to indicate that your contribution adheres to the Developer Certificate of Origin requirements
Necessary tests have been added and new examples are tested (see newton/tests/test_examples.py)
Documentation is up-to-date
Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

New Features
- Added a suite of differentiable simulation examples: rigid ball targeting, soft-body material optimization, cloth throw, spring-cage tuning, a neural-controlled soft “bear,” and a quadrotor with MPC.
- Each example includes a gradient-based training loop, optional CUDA-graph acceleration, multi-backend rendering (USD/GL/Rerun), CLI controls, and periodic visualization/saving.
Documentation
- New examples serve as end-to-end tutorials showcasing setup, optimization, rendering, and (where applicable) gradient checks, helping users learn differentiable physics workflows and performance best practices.

…iffsim_ball.py Signed-off-by: Miguel Angel Zamora Mora <mzamoramora@nvidia.com>

Signed-off-by: Miguel Angel Zamora Mora <mzamoramora@nvidia.com>

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

coderabbitai · 2025-08-15T13:36:49Z

📝 Walkthrough

Walkthrough

Adds six new differentiable simulation example scripts under newton/examples/diffsim: ball, cloth, soft body, spring cage, bear (neural controller), and drone (MPC). Each defines kernels, an Example class orchestrating forward/backward optimization, optional CUDA graph capture, and viewer integration (USD/GL/Rerun), with CLI entry points for running and rendering.

Changes

Cohort / File(s)	Summary
Core DiffSim examples `newton/examples/diffsim/example_diffsim_ball.py`, `newton/examples/diffsim/example_diffsim_cloth.py`, `newton/examples/diffsim/example_diffsim_soft_body.py`, `newton/examples/diffsim/example_diffsim_spring_cage.py`	Introduce differentiable examples (particle/ball, cloth, soft-body FEM, spring cage), each adding Warp kernels for loss/metrics and updates, an Example class with forward/backward steps, optional CUDA capture, rendering via viewer backends, and CLI runners.
Advanced controllers `newton/examples/diffsim/example_diffsim_bear.py`, `newton/examples/diffsim/example_diffsim_drone.py`	Add complex examples: bear soft-tet neural controller (tile-based GEMM, per-tet activations) and drone MPC (propeller model, rollout sampling, costs, selection). Define multiple kernels, helper structs/functions, Example classes coordinating optimization and rendering.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Example as Example (any)
  participant Physics as Solver/Model
  participant Kernels as Kernels (loss/update)
  participant Tape as Autograd/Graph

  User->>Example: step()
  activate Example
  Example->>Tape: start capture/tape
  Example->>Physics: simulate trajectory (forward)
  Physics-->>Example: final state
  Example->>Kernels: compute loss
  Example->>Tape: backward()
  Tape-->>Example: grads (w.r.t. controls/params)
  Example->>Kernels: apply update (SGD/Adam/clamp)
  Example-->>User: step complete
  deactivate Example

sequenceDiagram
  autonumber
  actor User
  participant Ex as Example (Drone MPC)
  participant Rollouts as Rollout Ensemble
  participant Physics as Solver
  participant Kernels as Cost/Collision/Interp
  participant Select as Trajectory Picker

  User->>Ex: step(frame)
  Ex->>Rollouts: sample noise around mean controls
  Ex->>Rollouts: replicate initial states
  loop horizon
    Ex->>Kernels: interpolate controls
    Ex->>Physics: step rollouts
    Ex->>Kernels: accumulate drone_cost + collision_cost
  end
  Ex->>Select: pick_best_trajectory()
  Select-->>Ex: best controls/trajectory
  Ex->>Physics: simulate chosen trajectory (nominal)
  Ex-->>User: loss metric, optional render

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Viewer refactor #522 — Introduces the unified viewer backends (ViewerGL/Rerun/USD) used by these examples for rendering and stage output.

Suggested reviewers

shi-eric
eric-heiden

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 3

🔭 Outside diff range comments (2)

newton/examples/diffsim/example_diffsim_ball.py (2)
1-274: Add a "Differentiable (diffsim)" migration section to docs/migration.rst — required

The PR adds a diffsim example (newton/examples/diffsim/example_diffsim_ball.py) but docs/migration.rst contains no guidance for migrating warp.sim differentiable workflows; please add a migration subsection and link the example.

Files/locations to update:

docs/migration.rst — add a "Differentiable simulation (diffsim)" subsection that documents how to migrate warp.sim diffsim code to Newton and links to the new example.

Reference example: newton/examples/diffsim/example_diffsim_ball.py

Suggested minimal items to include in the new subsection (can be expanded into prose/examples):

Enable gradients: builder.finalize(requires_grad=True)

State & control: use model.state() for per-timestep states and model.control() for controls

Contacts: use model.collide(state, soft_contact_margin=...) and pass contacts into solver.step

Stepping: solver.step(state_i, state_{i+1}, control, contacts, dt) (see SolverSemiImplicit usage)

Gradients: capture forward pass with wp.Tape() and call tape.backward(loss); optionally use wp.ScopedCapture + wp.capture_launch(graph) for CUDA-graph acceleration

Point readers to newton/examples/diffsim/example_diffsim_ball.py for a complete end-to-end diffsim training loop and numeric gradient check

I can draft the exact rst text and a patch for docs/migration.rst if you want.

1-274: Add a unit test for the diffsim example and fix CLI flag mismatch

The tests file does not register newton.examples.diffsim.example_diffsim_ball — add a test entry. Also the example uses underscore-style argparse flags (e.g. --stage_path, --train_iters) while the test harness passes hyphen-style flags (e.g. --stage-path). Fix one of these (prefer adding hyphen aliases in the example) so the test runner's arguments are recognized.

Files / changes:

Add a test registration to newton/tests/test_examples.py (place under TestOtherExamples):
add_example_test(
    TestOtherExamples,
    name="diffsim.example_diffsim_ball",
    devices=test_devices,
    test_options={"stage_path": "None", "train_iters": 0},
)
Update flags in newton/examples/diffsim/example_diffsim_ball.py so the parser accepts hyphen-style options (add aliases), e.g.:
parser.add_argument(
    "--stage-path", "--stage_path",
    type=lambda x: None if x == "None" else str(x),
    default="example_bounce.usd",
    help="Path to the output USD file.",
)
parser.add_argument(
    "--train-iters", "--train_iters",
    type=int,
    default=250,
    help="Total number of training iterations.",
)
These changes will (1) register the example in the test suite and (2) ensure the test harness' hyphenated options are accepted.

🧹 Nitpick comments (1)

newton/examples/diffsim/example_diffsim_ball.py (1)
17-17: Fix the outdated comment header.

The comment header says "Example Sim Grad Bounce" but the filename is example_diffsim_ball.py. Update the comment to match the filename for consistency.
-# Example Sim Grad Bounce
+# Example DiffSim Ball

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bb73e84 and 8fbbd745fe0ed6ba7df95f3687ef19ab2ec7b234.

📒 Files selected for processing (1)

newton/examples/diffsim/example_diffsim_ball.py (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

🧬 Code Graph Analysis (1)

newton/examples/diffsim/example_diffsim_ball.py (1)

newton/tests/test_cloth.py (1)

finalize (773-815)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-newton-tests / newton-unittests (windows-latest)

🔇 Additional comments (1)

newton/examples/diffsim/example_diffsim_ball.py (1)

108-109: Script executed but output not yet provided.

coderabbitai · 2025-08-15T14:00:09Z

🛠️ Refactor suggestion

USD imports should be at the module level.

The pxr imports are done inside the render function, which will cause repeated import overhead on each render call. Additionally, the noqa comment suggests this was flagged by a linter.

Move the imports to the top of the file with other imports:

import numpy as np import warp as wp import newton + +try: + from pxr import Gf, UsdGeom + HAS_USD = True +except ImportError: + HAS_USD = False

Then update the render method:

-from pxr import Gf, UsdGeom # noqa: PLC0415 +if not HAS_USD: + return particles_prim = self.renderer.stage.GetPrimAtPath("/root/particles") particles = UsdGeom.Points.Get(self.renderer.stage, particles_prim.GetPath()) particles.CreateDisplayColorAttr().Set([Gf.Vec3f(1.0, 1.0, 1.0)], time=self.renderer.time)

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_ball.py around lines 191 to 196, the pxr imports (from pxr import Gf, UsdGeom) are placed inside the render function causing repeated import overhead and linter warnings; move "from pxr import Gf, UsdGeom" to the module-level imports at the top of the file with the other imports, remove the in-function import and the noqa comment, and update the render method to reference Gf and UsdGeom directly (no local import) so the linter no longer flags PLC0415 and imports happen only once.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

newton/examples/diffsim/example_diffsim_soft_body.py (3)
17-26: Update the comment header to reflect the actual optimization target

The comment states this example optimizes material parameters for a bounce trajectory to hit a target, but the code actually optimizes to drive the center of mass toward a target position without any bouncing behavior requirement.
-# Example FEM Bounce
+# Example Soft Body Material Optimization
 #
-# Shows how to use Warp to optimize for the material parameters of a soft body,
-# such that it bounces off the wall and floor in order to hit a target.
+# Shows how to use Warp to optimize for the material parameters of a soft body
+# to drive its center of mass toward a target position.
36-41: Add bounds checking for array indexing

The kernel accesses params array using modulo arithmetic but doesn't validate the resulting indices are within bounds. While the modulo operation should keep indices valid, adding explicit bounds checking would make the code more robust.
 @wp.kernel
 def assign_param(params: wp.array(dtype=wp.float32), tet_materials: wp.array2d(dtype=wp.float32)):
     tid = wp.tid()
     params_idx = 2 * wp.tid() % params.shape[0]
+    # Ensure we have at least 2 elements available from params_idx
+    if params_idx + 1 >= params.shape[0]:
+        return
     tet_materials[tid, 0] = params[params_idx]
     tet_materials[tid, 1] = params[params_idx + 1]
300-315: Add boundary checks for numpy array operations

The code reshapes gradient arrays without verifying the shape is compatible with the reshape operation, which could raise an exception if the array size isn't divisible by 2.
 def log_step(self):
+    if self.material_params.shape[0] % 2 != 0:
+        print(f"Warning: material_params size {self.material_params.shape[0]} is not divisible by 2")
+        return
+        
     x = self.material_params.numpy().reshape(-1, 2)
     x_grad = self.material_params.grad.numpy().reshape(-1, 2)

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8fbbd745fe0ed6ba7df95f3687ef19ab2ec7b234 and d09ec54cff4a51c9fe0c8e79927c0e5b483f08e9.

📒 Files selected for processing (1)

newton/examples/diffsim/example_diffsim_soft_body.py (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

📚 Learning: 2025-08-12T05:17:34.423Z

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Applied to files:

newton/examples/diffsim/example_diffsim_soft_body.py

🧬 Code Graph Analysis (1)

newton/examples/diffsim/example_diffsim_soft_body.py (1)

newton/tests/test_cloth.py (1)

finalize (773-815)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: run-newton-tests / newton-unittests (windows-latest)
GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)

🔇 Additional comments (1)

newton/examples/diffsim/example_diffsim_soft_body.py (1)

159-167: Verify CUDA graph capture for gradient operations

I inspected newton/examples/diffsim/example_diffsim_soft_body.py. The example creates a new wp.Tape() inside the ScopedCapture (capture + backward is at ~lines 159–166 and again ~266–272) and clears the tape/loss after each iteration (self.tape.zero(), self.loss.zero_(), etc. at ~lines 289–295). However the captured graph does not include an explicit gradient-zeroing step before the capture — depending on Warp's backward semantics this can allow gradient buffers to accumulate across graph replays.

Please verify and/or address by either:

Explicitly zeroing gradients before the capture (e.g., zero grad-bearing arrays or call self.tape.zero() / reset on the relevant tape) so the captured backward starts from a clean state, or

Ensuring the captured graph itself includes the zeroing operation if you intend to reuse the graph across iterations.

Locations to review:

newton/examples/diffsim/example_diffsim_soft_body.py: lines ~159–166 — ScopedCapture / wp.Tape() / backward(self.loss)

newton/examples/diffsim/example_diffsim_soft_body.py: lines ~266–272 — alternate capture path

newton/examples/diffsim/example_diffsim_soft_body.py: lines ~289–295 — per-iteration cleanup (self.tape.zero(), self.loss.zero_(), ...)

Suggested minimal change (insert a zeroing step before forward capture):
   with wp.ScopedCapture() as capture:
       self.tape = wp.Tape()
       with self.tape:
           self.forward()
       self.tape.backward(self.loss)
   with wp.ScopedCapture() as capture:
       self.tape = wp.Tape()
       # ensure gradient buffers are reset before capture to avoid accumulation on replay
       self.tape.zero()
       with self.tape:
           self.forward()
       self.tape.backward(self.loss)

coderabbitai · 2025-08-15T16:29:04Z

⚠️ Potential issue

Incorrect gradient clearing approach

The code calls self.tape.zero() to clear gradients, but based on Warp's API, the tape should be recreated for each iteration rather than zeroed. Additionally, the gradient clearing for material_params is missing.

# clear grads for next iteration - self.tape.zero() + if not self.use_cuda_graph: + self.tape = None # Release the tape + self.material_params.grad.zero_() self.loss.zero_() self.com.zero_() self.pos_error.zero_()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# clear grads for next iteration

self.tape.zero()

self.loss.zero_()

self.com.zero_()

self.pos_error.zero_()

# clear grads for next iteration

if not self.use_cuda_graph:

self.tape = None # Release the tape

self.material_params.grad.zero_()

self.loss.zero_()

self.com.zero_()

self.pos_error.zero_()

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_soft_body.py around lines 291 to 296, the code incorrectly calls self.tape.zero() to clear gradients and neglects clearing gradients for material_params; replace the tape.zero() call by recreating a fresh tape each iteration (e.g., self.tape = warp.Tape() or equivalent per Warp API) and ensure you zero or reset gradients on material_params (e.g., call .zero_() on their .grad tensors or reinitialize their gradient storage) along with the existing self.loss.zero_(), self.com.zero_(), and self.pos_error.zero_().

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (5)

newton/examples/diffsim/example_diffsim_bear.py (2)
61-64: Replace sqrt(v^2) with abs(v) in loss computation

sqrt(v*v) is equivalent to abs(v) but more expensive and less readable. Behavior is the same (subgrad at 0).
-    delta = wp.sqrt(vy * vy) + wp.sqrt(vz * vz) - vx
+    delta = wp.abs(vy) + wp.abs(vz) - vx
212-228: Optional: initialize loss/com buffers within forward to reduce reliance on external zeroing

You zero self.loss/self.coms in step(), which works. For clarity and local correctness, zeroing com[frame] and loss inside forward before accumulation makes forward self-contained.

Example tweak (inside forward, before com/loss launches):
self.coms[frame].zero_()
# self.loss.zero_()  # if you want per-frame loss; otherwise keep as accumulator
newton/examples/diffsim/example_diffsim_cloth.py (1)
38-39: Prefer wp.float32 cast inside kernels for type consistency

Using float(n) can promote to host float64 on some backends. Casting with wp.float32 avoids implicit widening.
-    wp.atomic_add(com, 0, positions[tid] / float(n))
+    wp.atomic_add(com, 0, positions[tid] / wp.float32(n))
newton/examples/diffsim/example_diffsim_drone.py (1)

34-36: Import SDF kernels from a public module
I didn’t find any public re-exports of box_sdf, sphere_sdf, etc. (they only live in newton._src.geometry.kernels). To avoid coupling your example to private internals, please:

• File: newton/examples/diffsim/example_diffsim_drone.py (lines 34–36)
• Add public re-exports of the SDF kernels in something like
newton/geometry/kernels.py or newton/geometry/__init__.py, for example:
python # newton/geometry/kernels.py from newton._src.geometry.kernels import ( box_sdf, box_sdf_grad, capsule_sdf, capsule_sdf_grad, cone_sdf, cone_sdf_grad, cylinder_sdf, cylinder_sdf_grad, mesh_sdf, plane_sdf, sphere_sdf, sphere_sdf_grad )
• Update your example to import from the public API:
diff - from newton._src.geometry.kernels import box_sdf, capsule_sdf, … + from newton.geometry.kernels import box_sdf, capsule_sdf, …

This will ensure the example remains stable as the internal layout evolves.

newton/examples/diffsim/example_diffsim_spring_cage.py (1)

117-128: Add minimal tests and a docs note per PR checklist

Examples are great; the PR notes tests/docs are pending. Consider adding a smoke test per example (CPU device, tiny frame/iter counts) and updating docs/migration.rst to clarify these are Newton-native examples (not direct warp.sim API).

I can add a newton/tests/test_examples.py with parametrized runs like:

spring_cage: 1 frame, 1 iter, device="cpu", no stage_path

bear: 1 frame, verbose=False, device="cpu", no stage_path

cloth: 1-2 iters, device="cpu", no stage_path

drone: 2 frames, 2 rollouts, headless, device="cpu", no stage_path

Would you like me to draft that?

Also applies to: 186-207, 237-244

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d09ec54cff4a51c9fe0c8e79927c0e5b483f08e9 and 7c7da76e481c199640c397e3914e357103eef7e9.

📒 Files selected for processing (4)

newton/examples/diffsim/example_diffsim_bear.py (1 hunks)
newton/examples/diffsim/example_diffsim_cloth.py (1 hunks)
newton/examples/diffsim/example_diffsim_drone.py (1 hunks)
newton/examples/diffsim/example_diffsim_spring_cage.py (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

🧬 Code Graph Analysis (4)

newton/examples/diffsim/example_diffsim_bear.py (2)

newton/examples/__init__.py (1)

get_asset_directory (27-28)

newton/examples/diffsim/example_diffsim_drone.py (2)

control (449-450)

state (441-442)

newton/examples/diffsim/example_diffsim_drone.py (2)

newton/examples/__init__.py (1)

get_asset_directory (27-28)

newton/tests/test_cloth.py (1)

finalize (773-815)

newton/examples/diffsim/example_diffsim_cloth.py (2)

newton/tests/test_cloth.py (1)

finalize (773-815)

newton/examples/diffsim/example_diffsim_drone.py (1)

state (441-442)

newton/examples/diffsim/example_diffsim_spring_cage.py (2)

newton/tests/test_cloth.py (1)

finalize (773-815)

newton/examples/diffsim/example_diffsim_drone.py (2)

state (441-442)

control (449-450)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: run-newton-tests / newton-unittests (windows-latest)
GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)

🔇 Additional comments (11)

newton/examples/diffsim/example_diffsim_spring_cage.py (3)

153-162: LGTM: loss kernel usage is minimal and differentiable

Assigning loss[0] directly makes the backward path simple; no accumulation ambiguity. This is consistent with other examples.

129-139: CUDA graph capture review passed: no new allocations in forward() and gradients are zeroed

forward() doesn’t allocate any wp.zeros/ones/array calls (verified by searching lines 140–150).

self.tape.zero() is invoked at line 184, so gradients won’t accumulate across replays.

Please manually confirm that model.spring_rest_length (and any other arrays used in the captured graph) live on the same device before capture to avoid device-mismatch errors.

78-79: Gravity kwarg in ModelBuilder is valid
The ModelBuilder.__init__ signature is:
def __init__(self, up_axis: AxisType = Axis.Z, gravity: float = -9.81):
so passing gravity=0.0 correctly overrides the default. No changes required.
newton/examples/diffsim/example_diffsim_bear.py (3)

180-187: LGTM: precomputing contacts once for soft contact is reasonable

Given soft contact parameters and ground plane, precomputing contacts is acceptable and aligns with the example’s performance goals.

294-304: Optimizer usage looks correct for Warp’s API (params registered, grads passed to step)

Adam is instantiated with the parameter; step() is provided the gradient array. This matches the optimizer pattern used elsewhere in the PR.

295-304: Ensure the initial state is reset between optimization iterations (you already do this — good pattern)

Resetting sim_time and re-allocating the initial state avoids contamination across iterations. This is consistent with a robust training loop.

Also applies to: 299-304

newton/examples/diffsim/example_diffsim_cloth.py (2)

155-162: LGTM: in-place gradient step on initial velocity is simple and effective

Directly updating particle_qd at t=0 fits the optimization goal and keeps the graph small.

119-128: CUDA graph capture is correctly scoped over forward/backward

Capturing both forward and backward ensures replay includes gradient recompute. Good for perf on CUDA.

newton/examples/diffsim/example_diffsim_drone.py (3)

520-534: Rollout sampling excludes one env (keeps a nominal trajectory) — intentional?

dim uses self.rollouts.trajectories.shape[0] - 1 so the last env isn’t randomized, then later you copy the best back and carry the nominal. This is a sensible tactic; just clarifying intent.

626-634: Double-buffering for the non-grad drone via state list swap is a neat workaround

Given the property getters ignore sim_tick when requires_grad is False, swapping states achieves the needed ping-pong without changing accessors.

Also applies to: 782-788

693-713: Good CUDA graph capture for the optimization step

Capturing forward + backward for the rollout optimizer significantly reduces per-iteration overhead. Zeroing the Tape after step is correct to avoid grad accumulation.

coderabbitai · 2025-08-18T08:20:01Z

🛠️ Refactor suggestion

Clamp rest lengths to non-negative during the optimizer step to avoid unstable/invalid springs

Unconstrained updates can push rest lengths negative, which can destabilize constraints/solver. Clamp at zero (or a small epsilon) in the update kernel.

Apply this diff:

- spring_rest_lengths[tid] -= spring_rest_lengths_grad[tid] * train_rate + # gradient descent update with non-negativity clamp + spring_rest_lengths[tid] = wp.max( + 0.0, spring_rest_lengths[tid] - spring_rest_lengths_grad[tid] * train_rate + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@wp.kernel(enable_backward=False)

def apply_gradient_kernel(

spring_rest_lengths_grad: wp.array(dtype=float),

train_rate: float,

spring_rest_lengths: wp.array(dtype=float),

):

tid = wp.tid()

spring_rest_lengths[tid] -= spring_rest_lengths_grad[tid] * train_rate

@wp.kernel(enable_backward=False)

def apply_gradient_kernel(

spring_rest_lengths_grad: wp.array(dtype=float),

train_rate: float,

spring_rest_lengths: wp.array(dtype=float),

):

tid = wp.tid()

# gradient descent update with non-negativity clamp

spring_rest_lengths[tid] = wp.max(

0.0, spring_rest_lengths[tid] - spring_rest_lengths_grad[tid] * train_rate

)

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_spring_cage.py around lines 40 to 49, the optimizer kernel updates spring_rest_lengths without enforcing a lower bound, which can drive rest lengths negative and destabilize the solver; modify the kernel to clamp each updated rest length to a non-negative value (e.g., max(updated_value, 0.0) or a small epsilon like 1e-6) before writing back to spring_rest_lengths so rest lengths never become negative.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

newton/examples/diffsim/example_diffsim_bear.py (1)

234-240: Out-of-bounds risk in tile GEMM when tet_count is not a multiple of TILE_TETS

dim=ceil(tet_count/8) launches a partially filled tile for the last block. tile_load/store expect full tiles; without a mask you risk invalid reads/writes.

Apply this diff to launch only full tiles and handle the tail separately:

-            wp.launch_tiled(
-                kernel=network,
-                dim=math.ceil(self.model.tet_count / TILE_TETS),
-                inputs=[self.phases[frame].reshape((self.phase_count, 1)), self.weights],
-                outputs=[self.tet_activations[frame].reshape((self.model.tet_count, 1))],
-                block_dim=TILE_THREADS,
-            )
+            # Launch full tiles
+            tile_size = 8  # TILE_TETS
+            full_tiles = self.model.tet_count // tile_size
+            if full_tiles > 0:
+                wp.launch_tiled(
+                    kernel=network,
+                    dim=full_tiles,
+                    inputs=[self.phases[frame].reshape((self.phase_count, 1)), self.weights],
+                    outputs=[self.tet_activations[frame].reshape((self.model.tet_count, 1))],
+                    block_dim=TILE_THREADS,
+                )
+            # Handle remaining rows that don't fill a tile
+            tail = self.model.tet_count - full_tiles * tile_size
+            if tail > 0:
+                wp.launch(
+                    kernel=network_tail,
+                    dim=tail,
+                    inputs=[
+                        self.phases[frame],
+                        self.weights,  # [tet_count, phase_count]
+                        self.tet_activations[frame],
+                        full_tiles * tile_size,  # row offset
+                        self.phase_count,
+                    ],
+                    outputs=[],
+                )

Add this helper kernel near the existing network kernel:

@wp.kernel
def network_tail(
    phases: wp.array(dtype=float),
    weights: wp.array2d(dtype=float),
    tet_activations: wp.array(dtype=float),
    row_offset: int,
    phase_count: int,
):
    r = row_offset + wp.tid()  # global row index
    acc = float(0.0)
    for c in range(phase_count):
        acc += weights[r, c] * phases[c]
    tet_activations[r] = wp.tanh(acc) * ACTIVATION_STRENGTH

🧹 Nitpick comments (6)

newton/examples/diffsim/example_diffsim_bear.py (3)
39-40: Guard pxr/USD imports to avoid import-time crashes for users without USD

This example hard-imports pxr at module import time. If USD isn't installed, simply importing this module will crash. Make the dependency explicit and fail with a clear message.

Apply this diff:
-from pxr import Usd, UsdGeom
+try:
+    from pxr import Usd, UsdGeom
+except ImportError as e:
+    raise RuntimeError(
+        "OpenUSD (pxr) is required to run this example. Install USD to proceed."
+    ) from e
55-64: Use abs() instead of sqrt(x*x) for clarity and speed in the loss

sqrt(vyvy) and sqrt(vzvz) compute |vy| and |vz| with extra work and poor readability.

Apply this diff:
-    vx = com[tid][0]
-    vy = com[tid][1]
-    vz = com[tid][2]
-    delta = wp.sqrt(vy * vy) + wp.sqrt(vz * vz) - vx
+    v = com[tid]
+    # L1 penalty on lateral/vertical speed; reward forward speed
+    delta = wp.abs(v[1]) + wp.abs(v[2]) - v[0]
291-299: Resetting state0 is good; consider also zeroing tet_activations per frame (optional)

You already zero coms and loss. tet activations are overwritten each forward, so not strictly necessary, but zeroing can help avoid debugging confusion during partial runs.

No code change required if you prefer current behavior.
newton/examples/diffsim/example_diffsim_soft_body.py (3)
116-136: Validate material parameter lengths to match expected shapes

In the anisotropic branch you rely on tet_materials[:, :2].flatten() to produce exactly 2 * tet_count parameters; in unusual cases this assumption can break.

Consider asserting the expected length at construction time:
params = self.model.tet_materials.numpy()[:, :2].reshape(-1, 2)
assert params.shape[0] == self.model.tet_count, "Unexpected tet_materials shape"
self.material_params = wp.array(params.flatten(), dtype=wp.float32, requires_grad=True)
For isotropic mode the current approach (2 params) works with assign_param as written.

347-352: Guard inline pxr imports in render()

This will crash if USD isn't installed. Make rendering resilient when pxr is unavailable.

Apply this diff:
-                from pxr import Gf, UsdGeom  # noqa: PLC0415
-
-                particles_prim = self.renderer.stage.GetPrimAtPath("/root/particles")
-                particles = UsdGeom.Points.Get(self.renderer.stage, particles_prim.GetPath())
-                particles.CreateDisplayColorAttr().Set([Gf.Vec3f(1.0, 1.0, 1.0)], time=self.renderer.time)
+                try:
+                    from pxr import Gf, UsdGeom  # noqa: PLC0415
+                    particles_prim = self.renderer.stage.GetPrimAtPath("/root/particles")
+                    if particles_prim:
+                        particles = UsdGeom.Points.Get(self.renderer.stage, particles_prim.GetPath())
+                        particles.CreateDisplayColorAttr().Set([Gf.Vec3f(1.0, 1.0, 1.0)], time=self.renderer.time)
+                except ImportError:
+                    pass  # USD not available; skip coloring
268-276: Tape usage is fine; optional: prefer recreating over zeroing

You already recreate the tape per iteration when not using graphs. The later call to tape.zero() (Line 295) is redundant; consider dropping it to simplify.

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7c7da76e481c199640c397e3914e357103eef7e9 and be3c6587eba8bf210a0c3201a416b43d9774f480.

📒 Files selected for processing (2)

newton/examples/diffsim/example_diffsim_bear.py (1 hunks)
newton/examples/diffsim/example_diffsim_soft_body.py (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📓 Common learnings

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

📚 Learning: 2025-08-12T05:17:34.423Z

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Applied to files:

newton/examples/diffsim/example_diffsim_soft_body.py

📚 Learning: 2025-08-12T18:07:17.267Z

Learnt from: nvlukasz
PR: newton-physics/newton#519
File: newton/utils.py:30-33
Timestamp: 2025-08-12T18:07:17.267Z
Learning: In newton/_src/utils/import_usd.py, the PXR/USD modules are lazily loaded using try-except blocks within the functions (lines 109-111 and 1289-1291) rather than at module import time, making it safe to import parse_usd at the module level without causing import-time failures for users without USD dependencies.

Applied to files:

newton/examples/diffsim/example_diffsim_soft_body.py

📚 Learning: 2025-08-12T18:07:17.267Z

Learnt from: nvlukasz
PR: newton-physics/newton#519
File: newton/utils.py:30-33
Timestamp: 2025-08-12T18:07:17.267Z
Learning: In newton/_src/utils/import_usd.py, the PXR/USD modules are lazily loaded using try-except blocks within the functions rather than at module import time, making it safe to import parse_usd at the module level without causing import-time failures for users without USD dependencies.

Applied to files:

newton/examples/diffsim/example_diffsim_soft_body.py

🧬 Code Graph Analysis (2)

newton/examples/diffsim/example_diffsim_bear.py (3)

newton/examples/__init__.py (1)

get_asset_directory (27-28)

newton/tests/test_cloth.py (1)

finalize (773-815)

newton/examples/diffsim/example_diffsim_drone.py (2)

control (449-450)

state (441-442)

newton/examples/diffsim/example_diffsim_soft_body.py (1)

newton/examples/diffsim/example_diffsim_drone.py (2)

state (441-442)

control (449-450)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-newton-tests / newton-unittests (windows-latest)

🔇 Additional comments (6)

newton/examples/diffsim/example_diffsim_bear.py (2)

186-187: Questionable soft_contact_margin=40.0; likely too large for this scale

A margin of 40.0 is enormous relative to geometry scale (radii ~0.05, positions ~O(1)). This can degrade contact performance/accuracy.

Consider reducing to something like 0.02–0.1 and recomputing contacts every substep (see separate comment).
-        self.contacts = self.model.collide(self.states[0], soft_contact_margin=40.0)
+        self.contacts = self.model.collide(self.states[0], soft_contact_margin=0.05)
If you expect units or mesh scale to justify 40.0, please confirm.

282-285: Optimizer usage looks correct

Using Adam on the flattened weights, stepping with the flattened grad aligns with the initialization. No issues spotted.
newton/examples/diffsim/example_diffsim_soft_body.py (4)

148-151: States created without requires_grad; confirm this is intended

For differentiable simulation, states typically participate in the tape. Your model is finalized with requires_grad=True, which may suffice, but some examples explicitly allocate states with requires_grad=True.

If needed, switch to:
-        for _i in range(self.sim_steps + 1):
-            self.states.append(self.model.state())
+        for _i in range(self.sim_steps + 1):
+            self.states.append(self.model.state(requires_grad=True))
232-238: assign_param kernel handles both anisotropic and isotropic; launch looks good

The modulo-based indexing broadcasts isotropic params across all tets and selects per-tet pairs otherwise. No issues here.

240-244: Good: collisions recomputed each step

Recomputing contacts per substep avoids stale manifolds.

171-231: Model setup and parameters look consistent

Soft grid, box obstacle, and ground are set up with reasonable material/contact parameters. finalize(requires_grad=True) and solver selection are correct.

coderabbitai

Actionable comments posted: 23

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)

newton/_src/geometry/types.py (1)

104-139: Fix variable overloading issue in the mesh initialization block.

The variable color is being used both as a parameter and as an imported module name (from ..utils.color import color), though that import is not visible in this file. This could lead to confusion and potential runtime errors.

Consider renaming the parameter to avoid the conflict:
     def __init__(
         self,
         vertices: Sequence[Vec3] | nparray,
         indices: Sequence[int] | nparray,
         normals: Sequence[Vec3] | nparray | None = None,
         uvs: Sequence[Vec2] | nparray | None = None,
         compute_inertia: bool = True,
         is_solid: bool = True,
         maxhullvert: int = MESH_MAXHULLVERT,
-        color: Vec3 | None = None,
+        mesh_color: Vec3 | None = None,
     ):
And update the internal storage accordingly:
-        self._color = color
+        self._color = mesh_color

newton/_src/utils/import_mjcf.py (3)

291-295: Double application of incoming_xform for world geoms.

incoming_xform is multiplied into tf twice when link == -1 (first block) and again unconditionally later. This over-transforms world shapes. Remove the first block and rely on the unconditional application.

Apply this diff:

-            if link == -1 and incoming_xform is not None:
-                tf = incoming_xform * tf
-                geom_pos = tf.p
-                geom_rot = tf.q
+            # worldframe transform is applied once below if provided

Also applies to: 310-312

581-587: Likely bug: base_joint["child"] = root should be link.

When base_joint is a dict, setting child=root (the XML root element) is wrong; it should be the current body index link.

Apply this diff:

-                    base_joint["parent"] = -1
-                    base_joint["child"] = root
+                    base_joint["parent"] = -1
+                    base_joint["child"] = link

339-343: Ensure correct NumPy conversion of Warp vectors for mesh_scale

parse_vec returns a wp.vec* type, while the fallback mesh_assets[…]["scale"] is already a NumPy array. Calling np.array(mesh_scale) on a wp.vec produces an object-dtype array, which will break downstream NumPy computations. Extract the components explicitly into a float32 array to unify both paths.

Locations to update:
• newton/_src/utils/import_mjcf.py:339–343

Apply this diff:

--- a/newton/_src/utils/import_mjcf.py
+++ b/newton/_src/utils/import_mjcf.py
@@ -337,7 +337,12 @@
                 m = trimesh.load(stl_file, force="mesh")
-                if "mesh" in geom_defaults:
-                    mesh_scale = parse_vec(geom_defaults["mesh"], "scale", mesh_assets[geom_attrib["mesh"]]["scale"])
-                else:
-                    mesh_scale = mesh_assets[geom_attrib["mesh"]]["scale"]
-                scaling = np.array(mesh_scale) * scale
+                if "mesh" in geom_defaults:
+                    mesh_scale = parse_vec(
+                        geom_defaults["mesh"],
+                        "scale",
+                        mesh_assets[geom_attrib["mesh"]]["scale"]
+                    )
+                    mesh_scale_np = np.array(
+                        [mesh_scale[0], mesh_scale[1], mesh_scale[2]],
+                        dtype=np.float32
+                    )
+                else:
+                    mesh_scale_np = mesh_assets[geom_attrib["mesh"]]["scale"]
+                scaling = mesh_scale_np * scale

newton/_src/geometry/inertia.py (1)

454-461: Bug: Hollow cone inertia/COM are computed incorrectly after changing compute_cone_inertia to return COM-centric inertia

The hollow cone branch subtracts inertias and reuses the solid COM. Because compute_cone_inertia now returns inertia about each shape’s COM (and the cone’s COM is not at the origin), you must:

compute the composite mass m = ms - mh,
compute composite COM c = (mscs - mhch)/m,
shift both inertias to a common origin with parallel axis,
subtract, then shift to the composite COM.

Otherwise the inertia and COM are inconsistent and physically incorrect.

Apply this diff to fix the hollow cone case:

     elif type == GeoType.CONE:
         r, h = scale[0], scale[1] * 2.0
-        solid = compute_cone_inertia(density, r, h)
-        if is_solid:
-            return solid
-        else:
-            assert isinstance(thickness, float), "thickness must be a float for a hollow cone geom"
-            hollow = compute_cone_inertia(density, r - thickness, h - 2.0 * thickness)
-            return solid[0] - hollow[0], solid[1], solid[2] - hollow[2]
+        ms, cs, Is = compute_cone_inertia(density, r, h)
+        if is_solid:
+            return ms, cs, Is
+        else:
+            assert isinstance(thickness, float), "thickness must be a float for a hollow cone geom"
+            # clamp to avoid negative dimensions
+            r_inner = max(r - thickness, 0.0)
+            h_inner = max(h - 2.0 * thickness, 0.0)
+            mh, ch, Ih = compute_cone_inertia(density, r_inner, h_inner)
+
+            m = ms - mh
+            if m <= 0.0:
+                return 0.0, wp.vec3(), wp.mat33()
+
+            # Shift each inertia to a common origin using the parallel axis theorem
+            # I_at_origin = I_com + m * (||c||^2 * I - c c^T)
+            I_s_origin = Is + ms * (wp.dot(cs, cs) * wp.mat33(np.eye(3)) - wp.outer(cs, cs))
+            I_h_origin = Ih + mh * (wp.dot(ch, ch) * wp.mat33(np.eye(3)) - wp.outer(ch, ch))
+            I_origin = I_s_origin - I_h_origin
+
+            # Composite COM of the hollow body
+            c = (ms * cs - mh * ch) / m
+
+            # Shift back to inertia about the composite COM
+            # I_composite = I_origin - m * (||c||^2 * I - c c^T)
+            I = I_origin - m * (wp.dot(c, c) * wp.mat33(np.eye(3)) - wp.outer(c, c))
+            return m, c, I

newton/_src/solvers/mujoco/solver_mujoco.py (1)

2239-2271: Derive full_shape_mapping from self.shape_map (handle static shapes explicitly)

The modulo-based expansion assumes contiguous layout and mis-handles static geoms. Instead, rebuild the mapping directly:

-        if separate_envs_to_worlds and model.num_envs > 1:
-            shapes_per_env = model.shape_count // model.num_envs
-            if model.shape_count % model.num_envs != 0:
-                warnings.warn(
-                    f"Total shape count {model.shape_count} is not divisible by number of environments {model.num_envs}. "
-                    "Shape mapping to MuJoCo geoms may be incorrect.",
-                    stacklevel=2,
-                )
-
-            full_shape_mapping = {}
-            # geom_to_shape_idx is a mapping from MuJoCo template geom index to Newton shape indices of the last environemnt
-            # `geom_to_shape_idx` provides the reverse mapping for the template env: {mj_geom_idx: newton_shape_idx}
-            for geom_idx, template_shape_idx in geom_to_shape_idx.items():
-                # The local index is consistent for a given part of the model across all environments.
-                local_shape_idx = template_shape_idx % shapes_per_env
-                for env_idx in range(model.num_envs):
-                    # Calculate the global Newton shape index for the current environment.
-                    global_shape_idx = env_idx * shapes_per_env + local_shape_idx
-                    # All corresponding shapes map to the same MuJoCo geom index (since mj_model is single-env).
-                    full_shape_mapping[global_shape_idx] = (env_idx, geom_idx)
+        if separate_envs_to_worlds and model.num_envs > 1:
+            # Build full_shape_mapping directly from self.shape_map
+            full_shape_mapping = {}
+            for shape_idx, (worldid, mj_geom) in self.shape_map.items():
+                if mj_geom is None:
+                    continue
+                if worldid >= 0:
+                    # dynamic shapes: keep their per-env assignment
+                    full_shape_mapping[shape_idx] = (worldid, mj_geom)
+                else:
+                    # static/world geoms live on the global body—map them into env 0
+                    full_shape_mapping[shape_idx] = (0, mj_geom)

• removes the brittle modulo logic
• uses the already-built self.shape_map for all dynamic shapes
• treats static (world) geoms explicitly by assigning them to env 0

newton/examples/diffsim/example_diffsim_ball.py (1)

270-296: --device CLI arg is parsed but not used.

Set the Warp device when provided to honor the CLI flag.

Apply this diff:

     args = parser.parse_known_args()[0]
 
-    example = Example(viewer_type=args.viewer, stage_path=args.stage_path, verbose=args.verbose)
+    if args.device:
+        wp.set_device(args.device)
+    example = Example(viewer_type=args.viewer, stage_path=args.stage_path, verbose=args.verbose)

♻️ Duplicate comments (9)

newton/examples/diffsim/example_diffsim_spring_cage.py (2)

40-49: Clamp spring rest lengths to non-negative after gradient step

Unconstrained updates can drive rest lengths negative, destabilizing the solver.

Apply clamp in the update kernel:

 @wp.kernel(enable_backward=False)
 def apply_gradient_kernel(
     spring_rest_lengths_grad: wp.array(dtype=float),
     train_rate: float,
     spring_rest_lengths: wp.array(dtype=float),
 ):
     tid = wp.tid()
-
-    spring_rest_lengths[tid] -= spring_rest_lengths_grad[tid] * train_rate
+    # gradient descent with non-negativity clamp
+    spring_rest_lengths[tid] = wp.max(
+        0.0, spring_rest_lengths[tid] - spring_rest_lengths_grad[tid] * train_rate
+    )

109-111: Enable gradients on states to ensure backprop through trajectory

States are created without requires_grad=True, which can drop gradients on some backends.

-        self.states = tuple(self.model.state() for _ in range(self.num_frames * self.sim_substep_count + 1))
+        self.states = tuple(
+            self.model.state(requires_grad=True) for _ in range(self.num_frames * self.sim_substep_count + 1)
+        )

newton/examples/diffsim/example_diffsim_soft_body.py (2)

147-151: States should be created with requires_grad=True

Without requires_grad=True, gradients through the trajectory and contact generation can be dropped.

-        self.states = []
-        for _i in range(self.sim_steps + 1):
-            self.states.append(self.model.state())
+        self.states = []
+        for _i in range(self.sim_steps + 1):
+            self.states.append(self.model.state(requires_grad=True))

306-314: Gradient/tape lifecycle and state reset between iterations

Tape: Reusing the same Tape and calling tape.zero() can be OK, but the common pattern is to recreate per iteration (or use CUDA graphs). Ensure you also clear parameter grads explicitly.
Reset: The simulation is not reset to initial conditions between iterations, which can bias optimization. If you’re using CUDA graphs, reset in-place to keep capture valid.

Example:

-            # clear grads for next iteration
-            self.tape.zero()
-            self.loss.zero_()
-            self.com.zero_()
-            self.pos_error.zero_()
+            # clear grads for next iteration
+            self.tape.zero()
+            self.material_params.grad.zero_()
+            self.loss.zero_()
+            self.com.zero_()
+            self.pos_error.zero_()
+
+            # reset trajectory initial conditions in-place (CUDA-graph friendly)
+            init = self.model.state(requires_grad=True)
+            wp.copy(init.particle_q, self.states[0].particle_q)
+            wp.copy(init.particle_qd, self.states[0].particle_qd)

newton/examples/diffsim/example_diffsim_bear.py (2)

251-257: Tiled GEMM tail can access out-of-bounds when tet_count % TILE_TETS != 0

dim=ceil(tet_count/8) with tile load/store assumes full tiles. The final partial tile will access/write past the end.

Split the work: full tiles via launch_tiled and tail rows via a scalar kernel.

Add the tail kernel:

@wp.kernel
def network_tail(
    phases: wp.array(dtype=float),
    weights: wp.array2d(dtype=float),
    tet_activations: wp.array(dtype=float),
    row_offset: int,
    phase_count: int,
):
    r = row_offset + wp.tid()
    acc = float(0.0)
    for c in range(phase_count):
        acc += weights[r, c] * phases[c]
    tet_activations[r] = wp.tanh(acc) * ACTIVATION_STRENGTH

And modify the launch:

-            wp.launch_tiled(
-                kernel=network,
-                dim=math.ceil(self.model.tet_count / TILE_TETS),
-                inputs=[self.phases[frame].reshape((self.phase_count, 1)), self.weights],
-                outputs=[self.tet_activations[frame].reshape((self.model.tet_count, 1))],
-                block_dim=TILE_THREADS,
-            )
+            full_tiles = int(self.model.tet_count // int(TILE_TETS))
+            if full_tiles > 0:
+                wp.launch_tiled(
+                    kernel=network,
+                    dim=full_tiles,
+                    inputs=[self.phases[frame].reshape((self.phase_count, 1)), self.weights],
+                    outputs=[self.tet_activations[frame].reshape((self.model.tet_count, 1))],
+                    block_dim=TILE_THREADS,
+                )
+            tail = int(self.model.tet_count - full_tiles * int(TILE_TETS))
+            if tail > 0:
+                wp.launch(
+                    kernel=network_tail,
+                    dim=tail,
+                    inputs=[
+                        self.phases[frame],
+                        self.weights,
+                        self.tet_activations[frame],
+                        full_tiles * int(TILE_TETS),
+                        self.phase_count,
+                    ],
+                    outputs=[],
+                )

260-270: Recompute contacts per substep; one-time contacts become stale as the mesh moves

Contacts are computed once in init and reused. For deformables, recompute each substep.

-            for i in range(self.sim_substeps):
-                self.states[frame * self.sim_substeps + i].clear_forces()
-                self.solver.step(
-                    self.states[frame * self.sim_substeps + i],
-                    self.states[frame * self.sim_substeps + i + 1],
-                    self.control,
-                    self.contacts,
-                    self.sim_dt,
-                )
+            for i in range(self.sim_substeps):
+                idx = frame * self.sim_substeps + i
+                self.states[idx].clear_forces()
+                self.contacts = self.model.collide(self.states[idx], soft_contact_margin=40.0)
+                self.solver.step(
+                    self.states[idx],
+                    self.states[idx + 1],
+                    self.control,
+                    self.contacts,
+                    self.sim_dt,
+                )

newton/examples/diffsim/example_diffsim_ball.py (2)

125-135: Non-CUDA path now initializes self.graph; good catch addressed.

The else branch setting self.graph = None prevents accidental AttributeError later. This resolves the earlier concern.

153-172: Accessing self.tape outside capture context is brittle; rely on x.grad instead.

When using CUDA graph replay, the Tape instance created during capture isn’t re-created per step. Accessing self.tape.gradients after capture_launch may be invalid or stale. Use x.grad consistently (Warp arrays carry .grad) and guard logging when gradients are available.

Apply this diff:

             # gradient descent step
             x = self.states[0].particle_qd
             wp.launch(step_kernel, dim=len(x), inputs=[x, x.grad, self.train_rate])

-            x_grad = self.tape.gradients[self.states[0].particle_qd]
+            # Prefer array-backed gradients; Tape dictionary may be invalid after CUDA graph replay
+            x_grad = x.grad if hasattr(x, "grad") else None

-            if self.verbose:
-                print(f"Train iter: {self.train_iter} Loss: {self.loss}")
-                print(f"    x: {x} g: {x_grad}")
+            if self.verbose:
+                print(f"Train iter: {self.train_iter} Loss: {self.loss}")
+                if x_grad is not None:
+                    print(f"    x: {x} g: {x_grad}")
+                else:
+                    print(f"    x: {x} (no grad available)")
 
             # clear grads for next iteration
             self.tape.zero()

newton/examples/diffsim/example_diffsim_drone.py (1)

136-137: Fix invalid cast: wp.float is not a valid Warp type (use wp.float32 or Python float)

As written, this will fail JIT compilation. Use wp.float32(...) or cast with float(...).
-    discount = 0.8 ** wp.float(horizon_length - step - 1) / wp.float(horizon_length) ** 2.0
+    discount = wp.pow(wp.float32(0.8), wp.float32(horizon_length - step - 1)) / (wp.float32(horizon_length) ** 2.0)

🧹 Nitpick comments (45)

newton/_src/viewer/gl/icon.py (8)
1-1: Filename/header mismatch – align module name with usage.

Header says "generate_icons.py" while the file is icon.py. Either rename the file to generate_icons.py or update the header comment to avoid confusion and broken references.

6-9: Clarify expectations in the docstring (size validation and platform font caveats).

Briefly document that size must be a positive integer and that color rendering depends on availability of a platform emoji font.

Apply this diff to add a simple input check:
 def create_and_save_emoji_png(character: str, size: int, filename: str):
     """
     Renders a Unicode character onto a transparent PNG and saves it.
     """
+    if size <= 0:
+        raise ValueError(f"size must be a positive integer, got {size}")
18-27: Avoid hardcoding a Windows-only font; add cross-platform font discovery with graceful fallback.

As written, most environments won’t find seguiemj.ttf, causing monochrome fallback. Provide OS-specific candidates and warn if none found.

Apply this diff:
-    font_path = "seguiemj.ttf"
-    font_size = int(size * 0.8)  # Adjust font size relative to image size
-
-    try:
-        font = ImageFont.truetype(font_path, size=font_size)
-    except OSError:
-        print(f"Warning: Font '{font_path}' not found. Using default font.")
-        print("The icon may not render in color.")
-        font = ImageFont.load_default()
+    font_size = int(size * 0.8)  # Adjust font size relative to image size
+
+    # Try common color-emoji fonts by platform
+    import sys
+    font = None
+    candidates = []
+    if sys.platform.startswith("win"):
+        candidates = [
+            r"C:\Windows\Fonts\seguiemj.ttf",
+            "seguiemj.ttf",
+        ]
+    elif sys.platform == "darwin":
+        candidates = ["/System/Library/Fonts/Apple Color Emoji.ttc"]
+    else:
+        candidates = [
+            "/usr/share/fonts/truetype/noto/NotoColorEmoji.ttf",
+            "/usr/share/fonts/NotoColorEmoji.ttf",
+            "/usr/local/share/fonts/NotoColorEmoji.ttf",
+        ]
+
+    for path in candidates:
+        try:
+            font = ImageFont.truetype(path, size=font_size)
+            break
+        except OSError:
+            continue
+
+    if font is None:
+        import warnings
+        warnings.warn(
+            "No color-emoji font found; falling back to default font. "
+            "Icons may render in monochrome."
+        )
+        font = ImageFont.load_default()
28-32: Remove no-op textbbox and “+2” fudge; center reliably.

The textbbox result is unused, and the +2 offset is a magic number. Rely on anchor="mm" for centering.

Apply this diff:
-    # Calculate position to center the character
-    draw.textbbox((0, 0), character, font=font, spacing=0)
-    x = size // 2
-    y = size // 2 + 2  # +2 fudge factor
+    # Center the character using midpoint anchoring
+    x = size // 2
+    y = size // 2
33-35: Be robust to older Pillow versions lacking embedded_color.

Some environments may not support the embedded_color parameter; fall back gracefully.

Apply this diff:
-    # Draw the character onto the image
-    draw.text((x, y), anchor="mm", text=character, font=font, embedded_color=True)
+    # Draw the character onto the image
+    try:
+        draw.text((x, y), text=character, font=font, anchor="mm", embedded_color=True)
+    except TypeError:
+        # Pillow without embedded_color support
+        draw.text((x, y), text=character, font=font, anchor="mm")
37-38: Avoid prints in library code for non-errors.

Either return the path and let the caller log, or use logging at INFO level. If you keep prints, consider making them optional.

Example minimal change:
-    image.save(filename, "PNG")
-    print(f"Successfully created {filename}")
+    image.save(filename, "PNG")
+    # Consider: use logging.info(...) or return filename instead of printing.
3-3: Optional: Lazy-import Pillow to avoid hard dependency on import.

If this module is imported as part of the viewer, top-level PIL imports can fail in environments without Pillow. Moving them inside the function localizes the dependency.

Replace the top import with local imports inside the function:
-from PIL import Image, ImageDraw, ImageFont
And at the start of the function body:
from PIL import Image, ImageDraw, ImageFont
41-47: Consider moving the CLI example to a scripts/ tool or guard it behind a small CLI.

Keeping main here is fine for a dev utility; if this becomes part of the installed package, a dedicated entry point (e.g., newton-generate-icon) improves UX and avoids stray prints.

Happy to scaffold a tiny argparse-based CLI if you want.
newton/_src/utils/import_mjcf.py (1)

344-345: Message/logic inconsistency around mesh size handling.

Comment says “ignore geom size attribute” but asserts “need to specify size for mesh geom”. Either remove the assert or clarify the comment (MJCF mesh size often unused; scale is taken from asset/geom.mesh scale).
newton/tests/test_import_mjcf.py (1)
73-119: Make the “default maxhullvert” assertion robust (avoid magic number 64).

Instead of asserting against a hardcoded 64, import the constant used by Mesh (MESH_MAXHULLVERT) to keep tests aligned with defaults.

Apply this diff:
-                self.assertEqual(meshes[2].maxhullvert, 64)  # Default value
+                from newton._src.geometry import MESH_MAXHULLVERT
+                self.assertEqual(meshes[2].maxhullvert, MESH_MAXHULLVERT)  # Default value
newton/examples/example_cartpole.py (1)
42-47: Scope USD import to the cartpole prim for robustness and performance

Specifying the root path limits parsing to only the /cartPole subtree (as confirmed in examples/assets/cartpole.usda), avoiding unrelated prims and speeding up load time.

Apply this optional refactor in newton/examples/example_cartpole.py (around lines 42–47):
 newton.utils.parse_usd(
-    newton.examples.get_asset("cartpole.usda"),
+    newton.examples.get_asset("cartpole.usda"),
     articulation_builder,
     enable_self_collisions=False,
     collapse_fixed_joints=True,
+    root_path="/cartPole",
 )
newton/_src/geometry/inertia.py (1)
42-43: Docstrings should reflect COM-centric returns for consistency across shapes

compute_cone_inertia correctly documents (mass, center of mass, inertia). The sphere, capsule, cylinder, and box docstrings still say “(mass, inertia) … around the origin” despite returning (m, wp.vec3(), I). Align these to reduce confusion.

Apply these docstring tweaks:
@@
-    Returns:
-
-        A tuple of (mass, inertia) with inertia specified around the origin
+    Returns:
+        A tuple of (mass, center of mass, inertia) with inertia specified around the center of mass.
+        For a sphere, the center of mass is at the origin.
@@
-    Returns:
-
-        A tuple of (mass, inertia) with inertia specified around the origin
+    Returns:
+        A tuple of (mass, center of mass, inertia) with inertia specified around the center of mass.
+        For this capsule (aligned with z), the center of mass is at the origin.
@@
-    Returns:
-
-        A tuple of (mass, inertia) with inertia specified around the origin
+    Returns:
+        A tuple of (mass, center of mass, inertia) with inertia specified around the center of mass.
+        For this cylinder (aligned with z), the center of mass is at the origin.
@@
-    Returns:
-
-        A tuple of (mass, inertia) with inertia specified around the origin
+    Returns:
+        A tuple of (mass, center of mass, inertia) with inertia specified around the center of mass.
+        For an axis-aligned box centered at the origin, the center of mass is at the origin.
Also applies to: 65-66, 94-95, 171-172
newton/_src/viewer/__init__.py (1)
41-49: Make viewer backend imports lazy to avoid hard import-time dependency on optional packages

Directly importing ViewerGL / ViewerRerun / ViewerUSD at module import time can fail if optional deps (e.g., pyglet, rerun, USD) aren’t installed. Prefer lazy attribute resolution so imports happen only on first use.

Apply this diff:
-from .viewer_gl import ViewerGL
-from .viewer_rerun import ViewerRerun
-from .viewer_usd import ViewerUSD
+def __getattr__(name):
+    if name == "ViewerGL":
+        from .viewer_gl import ViewerGL as _ViewerGL
+        return _ViewerGL
+    if name == "ViewerRerun":
+        from .viewer_rerun import ViewerRerun as _ViewerRerun
+        return _ViewerRerun
+    if name == "ViewerUSD":
+        from .viewer_usd import ViewerUSD as _ViewerUSD
+        return _ViewerUSD
+    raise AttributeError(name)
newton/viewer.py (1)

20-31: Top-level viewer re-exports are fine; consider guarding optional deps

Re-exporting the new viewer classes is good. If any backend has heavy/optional dependencies, consider mirroring the lazy import pattern here (via a module-level getattr) to avoid import-time failures for users who only need a subset.

Would you like a patch to add a getattr that lazily imports ViewerGL/ViewerRerun/ViewerUSD and emits helpful ImportError messages?
newton/examples/example_selection_cartpole.py (1)
56-62: Limit USD parsing to the /cartPole prim for faster imports

Verified that the cartpole.usda asset defines its bodies under the /cartPole prim. Passing root_path="/cartPole" to parse_usd will restrict parsing to the CartPole articulation, improving import robustness and performance on larger stages.

• File: newton/examples/example_selection_cartpole.py, lines 56–62
Add the root_path parameter to the parse_usd call:
         newton.utils.parse_usd(
             newton.examples.get_asset("cartpole.usda"),
             articulation_builder,
             xform=wp.transform((0.0, 0.0, 2.0), wp.quat_identity()),
             collapse_fixed_joints=COLLAPSE_FIXED_JOINTS,
             enable_self_collisions=False,
+            root_path="/cartPole",
         )
newton/tests/test_inertia.py (1)
253-323: Cone inertia mesh-vs-analytic test looks correct; consider minor robustness tweaks.

The mesh construction (apex at +half_height, base at -half_height) and comparisons against compute_cone_inertia are consistent. Two small suggestions:

Use a tolerance that scales with the matrix norm to treat off-diagonals (near zero) proportionally, e.g., atol = 1e-3 * np.linalg.norm(I_cone).

500 segments is fine but may be heavy in CI; consider parametrizing resolution if runtime becomes an issue.

Example for a norm-based tolerance:
tol_I = 1e-3 * np.linalg.norm(np.array(I_cone))
assert_np_equal(np.array(I_mesh), np.array(I_cone), tol=tol_I)
newton/tests/test_pendulum_revolute_vs_d6.py (1)

55-61: Minor: qd0_model is computed but unused.

Can be removed to keep the test tidy.

newton/tests/test_import_usd.py (1)

431-527: Force limits propagation test is good; small robustness nit.

Asserting joint_effort_limit via joint_qd_start is fine here since each joint exposes one relevant DOF. If future scenes include multi-DOF joints, consider locating the exact DOF by axis/type to avoid assuming the first DOF belongs to the drive being tested.
newton/_src/sim/model.py (1)
291-293: articulation_group attribute added consistently; consider adding attribute_frequency entries.

To fully integrate with attribute introspection (Model.get_attribute_frequency), add these group attributes to attribute_frequency. Without this, get_attribute_frequency("joint_group") would raise.

Suggested additions (place alongside existing mappings):
# attributes per particle
self.attribute_frequency["particle_group"] = "particle"

# attributes per shape
self.attribute_frequency["shape_group"] = "shape"

# attributes per body
self.attribute_frequency["body_group"] = "body"

# attributes per joint
self.attribute_frequency["joint_group"] = "joint"

# attributes per articulation
self.attribute_frequency["articulation_group"] = "articulation"
newton/tests/test_cone_orientation.py (1)
45-79: SDF tests are consistent with kernels.cone_sdf orientation and formula

The selected points (apex, base center, base edge, origin, mid-height edge) are well-chosen and match the current SDF implementation.

Minor nit: consider being explicit with Warp dtype (e.g., wp.float32) to avoid device-dependent defaults creeping in CI.
-        wp_sdf_values = wp.zeros(len(points), dtype=float)
+        wp_sdf_values = wp.zeros(len(points), dtype=wp.float32)
newton/_src/solvers/mujoco/solver_mujoco.py (3)
2262-2271: Potential type mismatch: assigning a wp.transform into a NumPy float array

shape_incoming_xform is an np.ndarray of shape (shape_count, 7). Assigning a wp.transform directly to a row relies on implicit conversion that may not be guaranteed. Be explicit to avoid dtype issues.
-                        shape_incoming_xform[global_shape_idx] = mjc_tf * wp.transform_inverse(original_tf)
+                        incoming = mjc_tf * wp.transform_inverse(original_tf)
+                        incoming_np = np.array(incoming, dtype=np.float32)
+                        shape_incoming_xform[global_shape_idx, :] = incoming_np
Apply the same change to the single-env path below.

2272-2280: Single-env path: also use explicit assignment for shape_incoming_xform

Mirror the explicit conversion to avoid ambiguous assignment behavior.
-                    shape_incoming_xform[shape_idx] = mjc_tf * wp.transform_inverse(original_tf)
+                    incoming = mjc_tf * wp.transform_inverse(original_tf)
+                    shape_incoming_xform[shape_idx, :] = np.array(incoming, dtype=np.float32)
2313-2332: Avoid assigning a dict then an array to model.to_mjc_geom_index

You set model.to_mjc_geom_index twice with incompatible types: first to a dict (Line 2313), then to a wp.array. The first assignment is redundant and may confuse downstream tooling.
-            model.to_mjc_geom_index = shape_mapping  # pyright: ignore[reportAttributeAccessIssue]
+            # model.to_mjc_geom_index will be set to a dense array below.
newton/examples/example_viewer.py (2)
123-156: Mismatch between comment and positions: you lay out along Y but comment says X

The code varies the second coordinate (Y) while the comment claims “line along X-axis”. Either fix the comment or move the positions to X for clarity.
-        # Clean layout: arrange objects in a line along X-axis
+        # Clean layout: arrange objects in a line along X-axis
...
-        base_left = -4.0
+        base_x = -4.0
...
-        sphere_pos = wp.vec3(0.0, base_left, base_height + 0.3 * abs(math.sin(1.2 * t)))
+        sphere_pos = wp.vec3(base_x, 0.0, base_height + 0.3 * abs(math.sin(1.2 * t)))
...
-        base_left += spacing
+        base_x += spacing
...
-        x_box_anim = wp.array([wp.transform([0.0, base_left, base_height], qx_slow)], dtype=wp.transform)
+        x_box_anim = wp.array([wp.transform([base_x, 0.0, base_height], qx_slow)], dtype=wp.transform)
-        base_left += spacing
+        base_x += spacing
...
-        x_cone_anim = wp.array([wp.transform([0.0, base_left, base_height], qy_slow)], dtype=wp.transform)
+        x_cone_anim = wp.array([wp.transform([base_x, 0.0, base_height], qy_slow)], dtype=wp.transform)
-        base_left += spacing
+        base_x += spacing
...
-        x_cyl_anim = wp.array([wp.transform([0.0, base_left, base_height], qz_slow)], dtype=wp.transform)
+        x_cyl_anim = wp.array([wp.transform([base_x, 0.0, base_height], qz_slow)], dtype=wp.transform)
-        base_left += spacing
+        base_x += spacing
...
-        capsule_pos = wp.vec3(0.3 * math.sin(0.8 * t), base_left, base_height)
+        capsule_pos = wp.vec3(base_x, 0.3 * math.sin(0.8 * t), base_height)
-        base_left += spacing
+        base_x += spacing
199-199: Typo in lines path: use a sensible entity name

"coordinate_self.axes" looks accidental. Suggest renaming to “/axes” or a similar path.
-        self.viewer.log_lines("coordinate_self.axes", self.axes_begins, self.axes_ends, self.axes_colors)
+        self.viewer.log_lines("/axes", self.axes_begins, self.axes_ends, self.axes_colors)
newton/_src/viewer/viewer_rerun.py (1)

208-214: is_running() relies on a viewer process that is never spawned

self._viewer_process is never set; as written, is_running() returns True indefinitely unless close() is called. If that’s intentional (e.g., driven by an outer loop), consider documenting the behavior. Otherwise, either spawn a process for the viewer or base liveness on rerun’s connection status.

Would you like me to wire a simple health check (e.g., maintain a “last frame time” and stop after N frames when in non-interactive rerun mode), or to spawn the web-viewer in a subprocess to derive a real process handle?
newton/_src/viewer/picking.py (3)
40-54: Device/host allocation for pick_body and pick_state: verify CUDA write semantics

pick_body is allocated as pinned host memory when on CUDA, while pick_state lives on model.device. Kernels write to both. Confirm Warp kernels can safely write to the pinned host array (pick_body) from CUDA in your target versions; if not, allocate both on model.device and mirror to host only when needed.

If needed, align device placement:
-        if model and model.device.is_cuda:
-            self.pick_body = wp.array([-1], dtype=int, pinned=True)
-        else:
-            self.pick_body = wp.array([-1], dtype=int, device="cpu")
+        # Keep pick_body on the simulation device for consistent kernel writes
+        self.pick_body = wp.array([-1], dtype=int, device=(model.device if model else "cpu"))
77-82: Avoid host round-trip just to check active pick

is_picking() calls .numpy() which forces a host read/sync. If pick_body lives on device (see earlier comment), this incurs a transfer every call.

Consider tracking a small host-side flag updated when pick/release happens, or keep pick_body as a device array and use it only inside kernels; if you need a quick host predicate, store a Python-side shadow boolean you update in pick()/release() (and optionally after compute_pick_state_kernel).

83-97: Type assumptions for ray inputs

update() assumes ray_start/ray_dir are wp.vec3. If backend passes tuples/np arrays, the kernel call may fail or incur implicit conversions. Minimal validation or conversion at the boundary would harden this.

Example:
-    def update(self, ray_start, ray_dir):
+    def update(self, ray_start, ray_dir):
+        # Coerce to wp.vec3 for robustness
+        ray_start = ray_start if isinstance(ray_start, wp.vec3) else wp.vec3(*ray_start)
+        ray_dir = ray_dir if isinstance(ray_dir, wp.vec3) else wp.vec3(*ray_dir)
newton/examples/diffsim/example_diffsim_cloth.py (2)
178-181: Print numeric loss value instead of array object

print(f"... Loss: {self.loss}") prints the array wrapper, not its value.
-                print(f"Train iter: {self.train_iter} Loss: {self.loss}")
+                print(f"Train iter: {self.train_iter} Loss: {self.loss.numpy()[0]:.6f}")
221-235: --device argument is unused

The CLI parses --device but never applies it. Align with other examples by scoping the device for the run.
-    example = Example(viewer_type=args.viewer, stage_path=args.stage_path, verbose=args.verbose)
+    with wp.ScopedDevice(args.device):
+        example = Example(viewer_type=args.viewer, stage_path=args.stage_path, verbose=args.verbose)
newton/examples/diffsim/example_diffsim_bear.py (1)
38-39: Unconditional pxr/USD import prevents running with non-USD setups

This example imports Usd/UsdGeom at module import and always loads a USD asset, even for GL/Rerun viewers. Enforce lazy/optional USD usage to avoid import-time failures for users without USD.

Leverage the project’s pattern of guarded USD imports:
-import warp.optim
-from pxr import Usd, UsdGeom
+import warp.optim
+try:
+    from pxr import Usd, UsdGeom
+except ImportError:
+    Usd = UsdGeom = None
Then guard usage in init:
-        asset_stage = Usd.Stage.Open(os.path.join(newton.examples.get_asset_directory(), "bear.usd"))
+        if Usd is None:
+            raise ImportError("usd-core package is required for this example. Install with: pip install usd-core")
+        asset_stage = Usd.Stage.Open(os.path.join(newton.examples.get_asset_directory(), "bear.usd"))
Alternatively, move the import and open into the block where you need them to avoid import-time failures entirely.
newton/_src/viewer/kernels.py (1)

191-265: Joint basis lines: supported joint types gated; colors and transforms look good.

The axis extraction and color coding are correct. Consider scaling by local geometry or collision radii later for better readability.

newton/_src/viewer/viewer_usd.py (2)

122-151: Prototype resolution is path-based; ensure mesh argument matches stored prototype path.

log_instances expects mesh to be a key in self._meshes. ViewerBase.log_shapes caches geometry and returns a path; as long as that same path is used here, this will work. If you pass a different alias/name, it raises at runtime. Add a short note to caller contract or normalize to the stored path.

248-272: end_frame is a no-op; consider flushing/saving periodically for long runs.

Not blocking, but giving users an option to SaveCheckpoint per N frames can be helpful for large recordings.

newton/_src/sim/builder.py (1)

3695-3746: Model finalization emits particle_group; max envs defaulted to at least 1.

Minor note: when only global (-1) entities exist, num_envs becomes 1. If true “no env” semantics are desired, consider leaving it at 0. Otherwise this is fine and simplifies downstream code.

newton/examples/assets/cartpole.usda (2)

47-66: Mass/inertia attributes on 'cart' without centerOfMass: ensure consistency

You authored diagonalInertia/mass/principalAxes on the cart but omitted physics:centerOfMass. If the importer assumes COM at the local origin, this is fine; otherwise inertia about a non-origin COM could lead to incorrect dynamics.

If COM is assumed at origin: no change needed.

If COM is estimated from geom: consider removing manual inertia and letting the importer compute.

If authoring custom inertia: consider also setting physics:centerOfMass explicitly to document intent.

139-148: New pole-to-pole joint: check limits/damping if instability observed

You added a second revolute joint between pole1 and pole2 with axis X and identity rotations. If the double-pendulum dynamics are too fast/unstable in your solver, consider adding drive or damping attributes, or tightening limits.
newton/_src/viewer/viewer.py (3)
98-106: Respect "show_contacts" option and clear lines when disabled

log_contacts disregards the viewer option. Add an early-out so UI toggles are effective and previously drawn lines get cleared.
 def log_contacts(self, contacts, state):
     """
     Creates line segments along contact normals for rendering.
@@
-        # Get contact count (handle case where it might be zero)
+        # Honor option toggle
+        if not self.options.get("show_contacts", False):
+            # Clear any previously drawn contact lines
+            self.log_lines("/contacts", None, None, None)
+            return
+
+        # Get contact count (handle case where it might be zero)
         num_contacts = contacts.rigid_contact_count.numpy()[0]
150-153: Color comment/code mismatch for contacts

Comment says orange-red, code uses green. Align for clarity.
-        # Use orange-red color for contact normals
-        line_colors = (0.0, 1.0, 0.0)
+        # Use orange-red color for contact normals
+        line_colors = (1.0, 0.4, 0.0)
83-91: Always pass instance attribute arrays to log_instances
Passing None for scales, colors, or materials can lead to runtime errors if a backend implementation assumes all inputs are array-like. Instead, always forward the arrays and let each backend decide whether to skip re-uploading unchanged data.

• File: newton/_src/viewer/viewer.py (lines 83–91)
-            self.log_instances(
-                shapes.name,
-                shapes.mesh,
-                shapes.world_xforms,
-                shapes.scales if self.model_changed else None,
-                shapes.colors if self.model_changed else None,
-                shapes.materials if self.model_changed else None,
-            )
+            self.log_instances(
+                shapes.name,
+                shapes.mesh,
+                shapes.world_xforms,
+                shapes.scales,
+                shapes.colors,
+                shapes.materials,
+            )
This change guarantees backends always receive valid arrays and can internally handle unchanged fields.
newton/examples/diffsim/example_diffsim_drone.py (1)

631-638: Solver.step signature: verify None placeholders match API

You pass None for intermediate arguments. Ensure SolverSemiImplicit.step matches (state_in, state_out, control, contacts, dt) or equivalent in current API. If not, wire control state explicitly.
newton/_src/viewer/viewer_gl.py (3)
198-206: Cache of control data is never populated; joint_f values will be unavailable

log_state caches self._last_state but not control, yet _render_attribute_values tries to read joint_f from _last_control. Consider adding a log_control(control) or passing control into log_state to cache it, or gate UI accordingly.

379-384: Remove debug print of ray on right-click

Printing each pick ray will spam stdout. Drop it or guard behind a debug flag.
-            ray_start, ray_dir = self.camera.get_world_ray(x, y)
-            print(f"ray: start {ray_start}, dir {ray_dir}")
-            self.picking.pick(self._last_state, ray_start, ray_dir)
+            ray_start, ray_dir = self.camera.get_world_ray(x, y)
+            self.picking.pick(self._last_state, ray_start, ray_dir)
313-316: Fix truncated docstring text ("ent")

The docstring line breaks mid-word; clean it up for readability.
-                 String examples: 'w', 'a', 's', 'd', 'space', 'escape', 'ent
+                 String examples: 'w', 'a', 's', 'd', 'space', 'escape', 'enter'

coderabbitai · 2025-08-20T16:00:08Z

⚠️ Potential issue

Bug: scene-mesh loop uses m instead of m_geom (wrong normals/color source).

Inside the trimesh.Scene branch you iterate for m_geom in m.geometry.values() but still pass m.vertex_normals and m.visual.main_color to Mesh. This yields incorrect data (or crashes) when m is a Scene. Use m_geom consistently and guard color defaults.

Apply this diff:

- m_mesh = Mesh( - m_vertices, - m_faces, - m.vertex_normals, - color=np.array(m.visual.main_color) / 255.0, - maxhullvert=maxhullvert, - ) + # Use per-geometry attributes; guard optional color + color = None + if getattr(m_geom, "visual", None) is not None and getattr(m_geom.visual, "main_color", None) is not None: + color = np.array(m_geom.visual.main_color, dtype=np.float32) / 255.0 + m_mesh = Mesh( + m_vertices, + m_faces, + m_geom.vertex_normals if hasattr(m_geom, "vertex_normals") else None, + color=color, + maxhullvert=maxhullvert, + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

m_mesh = Mesh(

m_vertices,

m_faces,

m.vertex_normals,

color=np.array(m.visual.main_color) / 255.0,

maxhullvert=maxhullvert,

)

# Use per-geometry attributes; guard optional color

color = None

if getattr(m_geom, "visual", None) is not None and getattr(m_geom.visual, "main_color", None) is not None:

color = np.array(m_geom.visual.main_color, dtype=np.float32) / 255.0

m_mesh = Mesh(

m_vertices,

m_faces,

m_geom.vertex_normals if hasattr(m_geom, "vertex_normals") else None,

color=color,

maxhullvert=maxhullvert,

)

coderabbitai · 2025-08-20T16:00:09Z

🛠️ Refactor suggestion

Guard optional normals/color for single-mesh loads.

Not all OBJ/DAE meshes have visual.main_color; vertex_normals may also be absent until computed. Add safe defaults to avoid attribute errors.

Apply this diff:

- m_mesh = Mesh( - m_vertices, - m_faces, - m.vertex_normals, - color=np.array(m.visual.main_color) / 255.0, - maxhullvert=maxhullvert, - ) + color = None + if getattr(m, "visual", None) is not None and getattr(m.visual, "main_color", None) is not None: + color = np.array(m.visual.main_color, dtype=np.float32) / 255.0 + m_mesh = Mesh( + m_vertices, + m_faces, + m.vertex_normals if hasattr(m, "vertex_normals") else None, + color=color, + maxhullvert=maxhullvert, + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

m_mesh = Mesh(

m_vertices,

m_faces,

m.vertex_normals,

color=np.array(m.visual.main_color) / 255.0,

maxhullvert=maxhullvert,

)

color = None

if getattr(m, "visual", None) is not None and getattr(m.visual, "main_color", None) is not None:

color = np.array(m.visual.main_color, dtype=np.float32) / 255.0

m_mesh = Mesh(

m_vertices,

m_faces,

m.vertex_normals if hasattr(m, "vertex_normals") else None,

color=color,

maxhullvert=maxhullvert,

)

🤖 Prompt for AI Agents

In newton/_src/utils/import_mjcf.py around lines 371-377, the Mesh constructor assumes m.vertex_normals and m.visual.main_color always exist; guard these to avoid attribute errors by using safe fallbacks: fetch normals = getattr(m, "vertex_normals", None) (or compute normals beforehand if required) and color_raw = getattr(getattr(m, "visual", None), "main_color", [255, 255, 255]) then set color = np.array(color_raw) / 255.0; pass these guarded values into Mesh so single-mesh loads without normals or color use sensible defaults.

coderabbitai · 2025-08-20T16:00:09Z

⚠️ Potential issue

Potential issue with vector normalization.

The line attempts to normalize a PyVec3 by dividing by a NumPy norm, which may not work as expected since PyVec3 likely doesn't support division by scalar the same way NumPy arrays do.

Consider using the PyVec3 normalize method instead:

- direction = direction / float(np.linalg.norm(direction)) + direction = direction.normalize()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

direction = front + u * alpha * aspect_ratio * right + v * alpha * up

direction = direction / float(np.linalg.norm(direction))

direction = front + u * alpha * aspect_ratio * right + v * alpha * up

direction = direction.normalize()

🤖 Prompt for AI Agents

In newton/_src/viewer/camera.py around lines 178-179, the code divides a PyVec3 instance by a NumPy-computed norm which can fail because PyVec3 may not support NumPy-style scalar division; replace this with PyVec3's own normalization API — either call its normalize() method on the direction vector or compute its magnitude with the PyVec3 length/length_squared method and then divide using PyVec3's scalar division (or a provided normalized/normalized_copy helper) so the vector is normalized via the class's methods rather than numpy.linalg.norm.

coderabbitai · 2025-08-20T16:00:11Z

⚠️ Potential issue

NameError when viewer_type != "gl": ViewerGL is not imported in those branches

ViewerGL is referenced in isinstance(self.viewer, ViewerGL) even when it hasn’t been imported (USD/Rerun cases), causing a NameError.

Apply one of the following:

Guard by attribute presence (simple and decoupled from imports):

- if isinstance(self.viewer, ViewerGL): + if hasattr(self.viewer, "camera"): pos = type(self.viewer.camera.pos)(12.5, 0.0, 2.0) self.viewer.camera.pos = pos

Or, compare by string/duck-type (or import ViewerGL unconditionally if safe).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if isinstance(self.viewer, ViewerGL):

pos = type(self.viewer.camera.pos)(12.5, 0.0, 2.0)

self.viewer.camera.pos = pos

if hasattr(self.viewer, "camera"):

pos = type(self.viewer.camera.pos)(12.5, 0.0, 2.0)

self.viewer.camera.pos = pos

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_cloth.py around lines 131 to 134, the code calls isinstance(self.viewer, ViewerGL) but ViewerGL may not be imported in non-GL branches, causing a NameError; change the guard to not rely on the symbol ViewerGL (e.g., check for the attribute you need instead or compare the viewer's class name), then only set camera.pos when that attribute exists; alternatively, if you prefer the explicit type check, import ViewerGL inside a try/except and fall back to an attribute- or name-based check when import fails.

coderabbitai · 2025-08-20T16:00:11Z

🛠️ Refactor suggestion

Forward pass reuses mutated trajectory; reset to initial conditions each iteration (CUDA-graph safe)

The optimization updates initial velocity states[0].particle_qd but does not reset the trajectory buffers to the same initial conditions before re-simulating. This accumulates state across iterations and invalidates gradients for “initial velocity” optimization. With CUDA graph capture, you must reset in-place to preserve captured pointers.

Example in-place reset after the optimizer step and before the next iteration:

def step(self): with wp.ScopedTimer("step"): if self.graph: wp.capture_launch(self.graph) else: self.forward_backward() # gradient descent step x = self.states[0].particle_qd if self.verbose: print(f"Train iter: {self.train_iter} Loss: {self.loss}") wp.launch(step_kernel, dim=len(x), inputs=[x, x.grad, self.train_rate]) + # reset trajectory to initial conditions without reallocating buffers + init = self.model.state(requires_grad=True) + # Keep updated initial velocities (x already updated in-place) + wp.copy(init.particle_q, self.states[0].particle_q) + wp.copy(init.body_q, self.states[0].body_q) + wp.copy(init.body_qd, self.states[0].body_qd) + # Optionally zero downstream states; they will be overwritten by the next forward() + for i in range(1, self.sim_steps + 1): + self.states[i].clear_forces() + # If your State exposes q/qd/buffer arrays, zero/assign them here as needed # clear grads for next iteration self.tape.zero() self.train_iter = self.train_iter + 1

Note: Use wp.copy (or the equivalent) to preserve memory addresses for CUDA graph replay.

coderabbitai · 2025-08-20T16:00:11Z

🛠️ Refactor suggestion

Reset trajectory between iterations (and do it in-place if using CUDA graphs)

Each iteration replays from the previous iteration’s terminal state, skewing optimization. For CUDA graph replay, do not reallocate arrays; copy initial conditions into existing state buffers.

def step(self): with wp.ScopedTimer("step"): if self.use_cuda_graph: wp.capture_launch(self.graph) else: self.tape = wp.Tape() with self.tape: self.forward() self.tape.backward(loss=self.loss) wp.launch( apply_gradient_kernel, dim=self.model.spring_count, inputs=( self.model.spring_rest_length.grad, self.train_rate, ), outputs=(self.model.spring_rest_length,), ) - self.tape.zero() + # Reset trajectory to initial conditions without reallocating + init = self.model.state(requires_grad=True) + wp.copy(init.particle_q, self.states[0].particle_q) + wp.copy(init.particle_qd, self.states[0].particle_qd) + for i in range(1, len(self.states)): + self.states[i].clear_forces() + + self.tape.zero()

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2025-08-20T16:00:12Z

⚠️ Potential issue

Setting/reading a D6 joint “angle” via joint_q start index is likely incorrect.

For a D6 joint, joint_q typically encodes 7 coordinates (tx, ty, tz, qw, qx, qy, qz). Writing a single scalar at q_start (as done for revolute) does not set an angular displacement for D6, and reading q_cur[d6_qi] later returns a translation or quaternion component depending on layout. This makes the comparison against the revolute angle invalid.

Proposed fix: set the D6 quaternion sub-block to a rotation about the up-axis, and extract yaw from the quaternion when recording the trajectory.

Apply this diff:

- q_start_np = model.joint_q_start.numpy() - rev_qi = q_start_np[idx_rev] - d6_qi = q_start_np[idx_d6] + q_start_np = model.joint_q_start.numpy() + q_dim_np = np.diff(q_start_np) + rev_qi = int(q_start_np[idx_rev]) + d6_qi = int(q_start_np[idx_d6]) + d6_q_dim = int(q_dim_np[idx_d6]) # Set initial positions - q0_model[rev_qi] = initial_angle - q0_model[d6_qi] = initial_angle + q0_model[rev_qi] = initial_angle + if d6_q_dim >= 7: + # D6 stores (tx, ty, tz, qw, qx, qy, qz) in joint_q; set orientation about Z + dq = wp.quat_from_axis_angle(wp.vec3(0.0, 0.0, 1.0), initial_angle) + q0_model[d6_qi + 3 : d6_qi + 7] = [dq.w, dq.x, dq.y, dq.z] + else: + # Fallback: if the D6 exposes a single angular coord, set it directly + q0_model[d6_qi] = initial_angle model.joint_q.assign(q0_model)

And when sampling:

- traj[i, 1] = q_cur[d6_qi] + if d6_q_dim >= 7: + qw, qx, qy, qz = q_cur[d6_qi + 3 : d6_qi + 7] + # yaw(Z) from quaternion (Z-up) + siny_cosp = 2.0 * (qw * qz + qx * qy) + cosy_cosp = 1.0 - 2.0 * (qy * qy + qz * qz) + traj[i, 1] = np.arctan2(siny_cosp, cosy_cosp) + else: + traj[i, 1] = q_cur[d6_qi]

🤖 Prompt for AI Agents

In newton/tests/test_pendulum_revolute_vs_d6.py around lines 65 to 74, the test incorrectly writes a single scalar into the D6 joint q-start index (which stores tx,ty,tz,qw,qx,qy,qz) and later reads that index as if it were the revolute angle. Replace the single-scalar write with assigning the full 7-element D6 block: leave translations zero and set the quaternion (qw,qx,qy,qz) to represent a rotation about the up-axis by initial_angle; similarly, when sampling the trajectory, do not read q_cur[d6_qi] directly but extract the quaternion sub-block and compute yaw (e.g., via atan2/asin conversion or an axis-angle->quat->yaw conversion) to compare against the revolute angle. Ensure you use the same quaternion ordering as the model (tx,ty,tz,qw,qx,qy,qz) and update both the initial assignment and the sampling/extraction code accordingly.

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

… into example_example_diffsim_soft_body.py Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

…ple_diffsim_cloth.py Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

…ffsim_drone.py and add its corresponding asset Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Alain Denzler <adenzler@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Antoine Richard <antoiner@nvidia.com> Signed-off-by: camevor <camevor@nvidia.com> Co-authored-by: camevor <camevor@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

…sics#516) Signed-off-by: Antoine Richard <antoiner@nvidia.com> Signed-off-by: Viktor Reutskyy <vreutskyy@nvidia.com> Co-authored-by: Viktor Reutskyy <vreutskyy@nvidia.com> Co-authored-by: Viktor Reutskyy <33062116+vreutskyy@users.noreply.github.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Alain Denzler <adenzler@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Philipp Reist <preist@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

…n-physics#493 (newton-physics#584) Signed-off-by: Viktor Reutskyy <vreutskyy@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Julia von Muralt <jvonmuralt@nvidia.com> Co-authored-by: Eric Heiden <eheiden@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

…ugh the builder / model. newton-physics#536 (newton-physics#581) Signed-off-by: Viktor Reutskyy <vreutskyy@nvidia.com> Co-authored-by: Eric Heiden <eheiden@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: Wenchao Huang <huangwenchao@linctex.com> Co-authored-by: Eric Shi <97630937+shi-eric@users.noreply.github.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Signed-off-by: mzamoramora-nvidia <mzamoramora@nvidia.com>

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (5)

newton/examples/diffsim/example_diffsim_bear.py (3)

251-257: Out-of-bounds risk in tiled GEMM when tet_count % 8 != 0

dim=ceil(tet_count / TILE_TETS) schedules a full tile for the tail without masking; tile_load/store will read/write past the end. Split full tiles and handle the tail separately.

Apply this diff:

-            wp.launch_tiled(
-                kernel=network,
-                dim=math.ceil(self.model.tet_count / TILE_TETS),
-                inputs=[self.phases[frame].reshape((self.phase_count, 1)), self.weights],
-                outputs=[self.tet_activations[frame].reshape((self.model.tet_count, 1))],
-                block_dim=TILE_THREADS,
-            )
+            full_tiles = self.model.tet_count // 8  # TILE_TETS
+            if full_tiles > 0:
+                wp.launch_tiled(
+                    kernel=network,
+                    dim=full_tiles,
+                    inputs=[self.phases[frame].reshape((self.phase_count, 1)), self.weights],
+                    outputs=[self.tet_activations[frame].reshape((self.model.tet_count, 1))],
+                    block_dim=TILE_THREADS,
+                )
+            tail = self.model.tet_count - full_tiles * 8  # TILE_TETS
+            if tail > 0:
+                wp.launch(
+                    kernel=network_tail,
+                    dim=tail,
+                    inputs=[
+                        self.phases[frame],
+                        self.weights,  # [tet_count, phase_count]
+                        self.tet_activations[frame],
+                        full_tiles * 8,  # row offset
+                        self.phase_count,
+                    ],
+                    outputs=[],
+                )

Add this helper kernel somewhere above Example.forward:

@wp.kernel
def network_tail(
    phases: wp.array(dtype=float),
    weights: wp.array2d(dtype=float),         # [tet_count, phase_count]
    tet_activations: wp.array(dtype=float),    # [tet_count]
    row_offset: int,
    phase_count: int,
):
    r = row_offset + wp.tid()
    acc = float(0.0)
    for c in range(phase_count):
        acc += weights[r, c] * phases[c]
    tet_activations[r] = wp.tanh(acc) * ACTIVATION_STRENGTH

262-270: Contacts must be recomputed each substep; current code reuses stale manifolds

Contacts are computed once in __init__ (Line 192) and reused, which produces incorrect dynamics as the mesh moves.

Apply this diff:

-            for i in range(self.sim_substeps):
-                self.states[frame * self.sim_substeps + i].clear_forces()
-                self.solver.step(
-                    self.states[frame * self.sim_substeps + i],
-                    self.states[frame * self.sim_substeps + i + 1],
-                    self.control,
-                    self.contacts,
-                    self.sim_dt,
-                )
+            for i in range(self.sim_substeps):
+                idx = frame * self.sim_substeps + i
+                self.states[idx].clear_forces()
+                # Recompute contacts at current state
+                self.contacts = self.model.collide(self.states[idx], soft_contact_margin=0.05)
+                self.solver.step(
+                    self.states[idx],
+                    self.states[idx + 1],
+                    self.control,
+                    self.contacts,
+                    self.sim_dt,
+                )

303-306: CUDA graph correctness: don’t reassign state objects after capture

Reassigning self.states[0] breaks captured pointers; the graph will still write to old buffers. Reset in-place.

Apply this diff:

-        # reset sim
-        self.sim_time = 0.0
-        self.states[0] = self.model.state(requires_grad=True)
+        # reset sim (in-place to preserve captured pointers)
+        self.sim_time = 0.0
+        init = self.model.state(requires_grad=True)
+        wp.copy(init.particle_q,  self.states[0].particle_q)
+        wp.copy(init.particle_qd, self.states[0].particle_qd)
+        wp.copy(init.body_q,      self.states[0].body_q)
+        wp.copy(init.body_qd,     self.states[0].body_qd)

newton/examples/diffsim/example_diffsim_drone.py (2)

136-137: Replace invalid wp.float casts; use wp.float32 (or Python float) to avoid JIT failure

As-is, this breaks Warp JIT. Use wp.float32(...) or float(...).

Apply:

-    discount = 0.8 ** wp.float(horizon_length - step - 1) / wp.float(horizon_length) ** 2.0
+    discount = 0.8 ** wp.float32(horizon_length - step - 1) / (wp.float32(horizon_length) ** 2.0)

814-826: Fix rollout rendering: degenerate lines (start == end) and incorrect body index per env

Nothing is drawn because begins == ends; also indexing uses body index i instead of env i’s first body (i*bodies_per_env).

Apply:

-            positions = np.array([x.body_q.numpy()[:, :3] for x in self.rollouts.states])
+            # World translations for the first body of each rollout env across time
+            bodies_per_env = self.rollouts.body_count
+            positions = np.stack(
+                [s.body_q.numpy()[0 : self.rollout_count * bodies_per_env : bodies_per_env, :3]
+                 for s in self.rollouts.states],
+                axis=0,
+            )
@@
-            for i in range(self.rollout_count):
-                # Flip colors, so red means best trajectory, blue worst.
-                color = wp.render.bourke_color_map(-max_cost, -min_cost, -costs[i])
-                self.viewer.log_lines(
-                    f"/rollout_{i}",
-                    wp.array(positions[:, i], dtype=wp.vec3),
-                    wp.array(positions[:, i], dtype=wp.vec3),
-                    wp.array([color], dtype=wp.vec3),
-                )
+            for i in range(self.rollout_count):
+                # Flip colors, so red means best trajectory, blue worst.
+                color = wp.render.bourke_color_map(-max_cost, -min_cost, -costs[i])
+                path = positions[:, i]  # [T, 3]
+                if path.shape[0] >= 2:
+                    begins = wp.array(path[:-1], dtype=wp.vec3)
+                    ends = wp.array(path[1:], dtype=wp.vec3)
+                    self.viewer.log_lines(
+                        f"/rollout_{i}",
+                        begins,
+                        ends,
+                        wp.array([color], dtype=wp.vec3),
+                    )

🧹 Nitpick comments (18)

newton/examples/diffsim/example_diffsim_bear.py (6)
318-323: Guard viewer logging; USD/Rerun viewers may not expose log_state

ViewerUSD/ViewerRerun might not implement log_state; calling it unconditionally can crash. Guard or provide viewer-specific logging.

Apply this diff:
-            self.viewer.begin_frame(self.render_time)
-            self.viewer.log_state(self.states[i * self.sim_substeps])
-            self.viewer.end_frame()
+            self.viewer.begin_frame(self.render_time)
+            if hasattr(self.viewer, "log_state"):
+                self.viewer.log_state(self.states[i * self.sim_substeps])
+            else:
+                # TODO: implement per-viewer logging (e.g., log_mesh/log_instances)
+                pass
+            self.viewer.end_frame()
Please confirm that ViewerUSD and ViewerRerun expose log_state; if they do, we can drop the guard.

52-61: Smoother loss: prefer L2 lateral/vertical speed penalty

sqrt(vy*vy) + sqrt(vz*vz) is L1; switch to L2 (sqrt(vy*vy + vz*vz)) for smoother gradients at zero.

Apply this diff:
-    delta = wp.sqrt(vy * vy) + wp.sqrt(vz * vz) - vx
+    delta = wp.sqrt(vy * vy + vz * vz) - vx
48-50: Host/device constant split to avoid type gotchas

TILE_TETS as a wp.constant is used in host math (ceil/div). Use a Python int for host code; keep kernel shapes literal-8 or introduce a separate TILE_TETS_WP if desired.

Apply this diff:
-TILE_TETS = wp.constant(8)
-TILE_THREADS = 64
+TILE_TETS = 8           # host-side int; use literal 8 in kernel tile shapes
+TILE_THREADS = 64
183-189: Memory footprint: states allocation scales as O(num_frames * substeps)

Allocating ~24k states for defaults (300*80) is heavy. If training allows, consider a rolling buffer (sim_substeps+1 states) per frame or chunked unroll to reduce memory.

126-143: Asset robustness: handle missing bear.usd gracefully

Opening the USD without existence check will throw; consider checking path and raising a clear error or falling back to a procedural mesh for tests.

327-359: Tests and CLI ergonomics

Per PR objectives, example tests are still missing. Add a minimal smoke test that runs 1 training iter on CPU with tiny num_frames (e.g., 2) and sim_substeps (e.g., 2) to exercise forward/backward without viewers.

I can draft newton/tests/test_examples.py that:

Instantiates Example(viewer_type="gl", num_frames=2) with wp.ScopedDevice("cpu")

Calls example.step() once and asserts finite loss.

Skips if assets are unavailable.
newton/examples/diffsim/example_diffsim_drone.py (12)
249-257: Guard against out-of-bounds when interpolating final control point

If t reaches the last knot, t_id + 1 can index past the last element. It’s unlikely due to how sim_tick is advanced, but add a clamp for robustness.

Apply:
-    t_id = int(t)
-    frac = t - wp.floor(t)
-    control_left = control_points[env_id, t_id, control_id]
-    control_right = control_points[env_id, t_id + 1, control_id]
+    t_id = int(t)
+    frac = t - wp.floor(t)
+    t_right = wp.min(t_id + 1, control_points.shape[1] - 1)  # or pass max index in as a kernel arg
+    control_left = control_points[env_id, t_id, control_id]
+    control_right = control_points[env_id, t_right, control_id]
If arr.shape[...] isn’t supported in kernels for your Warp version, add control_points_count as an input and clamp against control_points_count - 1.

551-563: Headless flag is unused and cannot be set False via CLI

headless is accepted in Example.init but not used, and argparse sets it to True by default with store_true (so it can’t be disabled). Wire it to viewers that can run headless, or drop it.

Apply:
-    parser.add_argument(
-        "--headless",
-        default=True,
-        action="store_true",
-        help="Run in headless mode, suppressing the opening of any graphical windows.",
-    )
+    parser.add_argument(
+        "--headless",
+        action="store_true",
+        default=False,
+        help="Run in headless mode, suppressing the opening of any graphical windows.",
+    )
And in init, respect headless (e.g., pass through to ViewerGL/ViewerRerun if supported).

Also applies to: 456-464

739-757: Seed handling and optimizer capture: ensure determinism across frames when desired

Incrementing a single-seed array once per frame is fine, but CUDA graph capture will replay the same random sequence unless the seed is mutated between capture launches. Also, reusing a single Tape across many optim steps without re-capture is correct here, but be cautious if you later add data-dependent control flow.

If you need per-frame stochasticity with graph capture:

Include seed value as a graph input (or re-capture per target switch), or

Move increment_seed before capture and pass the updated seed into sample_gaussian.

Also applies to: 536-547

160-218: Collision SDF scaling for meshes is a placeholder; clarify or fix scaling behavior

The mesh branch scales by min_scale after sampling in scaled coordinates with a TODO note. This can mis-estimate distances for non-uniform scales.

Consider:

Uniform-only assumption with an assert, or

Properly apply scale via transforming the query and normalizing by the correct Jacobian term, or

Pre-bake scale into geometry and keep geo_scale = (1,1,1) for meshes.

798-809: Use current target value when rendering to avoid transient -1 index pitfalls

Render is called after step() in main, so this is safe, but defensively consider clamping or defaulting when self.target_idx == -1 (e.g., first ever render call outside main).

Example:
-            (0.05,),
-            wp.array([wp.transform(self.targets[self.target_idx], wp.quat_identity())], dtype=wp.transform),
+            (0.05,),
+            wp.array([wp.transform(self.targets[max(self.target_idx, 0)], wp.quat_identity())], dtype=wp.transform),
128-135: Control effort term: consider normalizing by 4 and/or using per-motor limits

Minor: with 4 props, wp.dot(control, control) grows with dimension. Either keep as-is and tune weights, or divide by control_dim so weights are scale-invariant across different rotor counts.

471-479: Consider using separate sim_dt for optimizer vs execution

Currently, optimizer and execution share sim_dt; sometimes a finer dt for optimization improves stability. Not required, but worth noting.

16-24: Docs/migration.rst: add a short note that this example is Newton-only, not a warp.sim parity item

Per our previous learning, migration docs should cover warp.sim-to-Newton parity; this example can be referenced as a “new example,” not a migrated API surface.

I can draft a brief migration note and a short “How to run” section for docs/examples if helpful.

831-887: CLI plumbing and main loop are serviceable; consider gating per-frame loss prints behind --verbose

Currently prints every frame regardless of --verbose.

Apply:
-            loss = np.min(example.rollout_costs.numpy())
-            print(f"[{example.frame:3d}/{example.num_frames}] loss={loss:.8f}")
+            if args.verbose:
+                loss = np.min(example.rollout_costs.numpy())
+                print(f"[{example.frame:3d}/{example.num_frames}] loss={loss:.8f}")
171-176: Indexing into body_q by env_id assumes exactly one body per env

This holds today (bodies_per_env == 1), but if the drone gains more links/bodies, these kernels will under-address. Document or generalize to env_id * bodies_per_env + body_idx.

62-84: Gaussian sampling with rejection is fine; you can avoid the loop by sampling truncated normal

Optional: replace rejection with a closed-form truncated normal sampler if bounds get tight.

771-785: Copying best rollout back to reference: verify intended index [-1] is reserved for nominal

You sample only rollouts.shape[0] - 1 envs and then assign [-1] from the reference. That matches the “reserve last rollout for nominal” pattern; just confirm tests cover this.

I can add a minimal test to assert rollouts.trajectories[-1] == drone.trajectories[0] post-step.

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 64c346a921acd861152dd39010c1dd5925fced32 and 6d21e98.

📒 Files selected for processing (6)

newton/examples/diffsim/example_diffsim_ball.py (1 hunks)
newton/examples/diffsim/example_diffsim_bear.py (1 hunks)
newton/examples/diffsim/example_diffsim_cloth.py (1 hunks)
newton/examples/diffsim/example_diffsim_drone.py (1 hunks)
newton/examples/diffsim/example_diffsim_soft_body.py (1 hunks)
newton/examples/diffsim/example_diffsim_spring_cage.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

newton/examples/diffsim/example_diffsim_soft_body.py
newton/examples/diffsim/example_diffsim_spring_cage.py
newton/examples/diffsim/example_diffsim_cloth.py
newton/examples/diffsim/example_diffsim_ball.py

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

🧬 Code Graph Analysis (2)

newton/examples/diffsim/example_diffsim_bear.py (6)

newton/examples/__init__.py (1)

get_asset_directory (27-28)

newton/_src/sim/builder.py (5)

add_soft_mesh (3488-3574)

add_ground_plane (2153-2173)

ShapeConfig (142-194)

particle_count (506-507)

tet_count (514-515)

newton/_src/sim/model.py (1)

collide (510-564)

newton/_src/viewer/viewer_usd.py (1)

ViewerUSD (14-321)

newton/_src/viewer/viewer_rerun.py (1)

ViewerRerun (30-247)

newton/_src/viewer/viewer_gl.py (1)

ViewerGL (33-1061)

newton/examples/diffsim/example_diffsim_drone.py (4)

newton/_src/geometry/kernels.py (7)

box_sdf (188-196)

capsule_sdf (230-237)

cone_sdf (268-272)

cylinder_sdf (252-255)

mesh_sdf (523-533)

plane_sdf (295-300)

sphere_sdf (178-179)

newton/examples/__init__.py (1)

get_asset_directory (27-28)

newton/_src/sim/builder.py (4)

ModelBuilder (67-4049)

add_shape_box (2208-2245)

ShapeConfig (142-194)

add_shape_capsule (2247-2287)

newton/_src/geometry/types.py (2)

GeoType (25-36)

SDF (43-70)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-newton-tests / newton-unittests (windows-latest)

🔇 Additional comments (7)

newton/examples/diffsim/example_diffsim_bear.py (2)

1-15: SPDX header looks correct

License block is complete and compliant.

206-207: Optimizer wiring: verify param/grads shapes and zeroing

Using Adam([self.weights.flatten()]) and passing self.weights.grad.flatten() is consistent, but please verify that tape.zero() resets self.weights.grad between iterations and no stale grads accumulate.

Minimal sanity check during development: print max-norm of grad before/after tape.zero() and ensure it resets to 0.

Also applies to: 296-298

newton/examples/diffsim/example_diffsim_drone.py (5)

412-438: Good separation between reference drone and rollout ensemble

Use of requires_grad on rollouts, per-env replication, and SGD on flattened trajectories is clean and idiomatic for Warp differentiable examples.

Also applies to: 522-548

335-363: ModelBuilder usage LGTM; densities and collision groups wired correctly

Two box shapes for crossbars with realistic density and per-env collision grouping look good.

Also applies to: 394-406

539-548: Tape/optimizer loop is correctly structured

Forward -> set cost grad -> backward -> optimizer.step -> clamp controls -> tape.zero() reads well.

Also applies to: 698-717

281-314: Propeller struct fields look complete and are used correctly in compute_prop_wrenches

No issues spotted.

220-228: Control limits enforcement is correct and efficient

In-place clamp over [env, t, control] is straightforward.

coderabbitai · 2025-08-20T22:20:18Z

+
+        self.control = self.model.control()
+
+        self.contacts = self.model.collide(self.states[0], soft_contact_margin=40.0)


⚠️ Potential issue

Unrealistic soft_contact_margin=40.0 will over-broaden contacts

A 40.0 margin effectively makes everything contact everything; it destabilizes gradients and dynamics. Use a small margin (e.g., 0.01–0.05).

Apply this diff:

- self.contacts = self.model.collide(self.states[0], soft_contact_margin=40.0) + self.contacts = self.model.collide(self.states[0], soft_contact_margin=0.05)

Note: With the per-substep recompute (see Lines 262-270 fix), the initial value matters less but should still be reasonable.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

self.contacts = self.model.collide(self.states[0], soft_contact_margin=40.0)

self.contacts = self.model.collide(self.states[0], soft_contact_margin=0.05)

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_bear.py around line 192, the soft_contact_margin is set to an unrealistic 40.0 which effectively makes everything contact everything and will destabilize gradients and dynamics; change it to a small reasonable value (e.g., 0.01–0.05) such as 0.02 to avoid over-broadening contacts, and ensure this initial margin is consistent with the per-substep recompute behavior mentioned later so gradients and dynamics remain stable.

coderabbitai · 2025-08-20T22:20:18Z

+        if isinstance(self.viewer, ViewerGL):
+            pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0)
+            self.viewer.camera.pos = pos
+            self.viewer.camera.yaw = 90.0
+


⚠️ Potential issue

Fix NameError: ViewerGL is undefined when viewer_type != "gl"

ViewerGL is imported only in the else-branch (Line 218). Referencing it here will raise NameError for "usd"/"rerun". Check viewer_type instead.

Apply this diff:

- if isinstance(self.viewer, ViewerGL): + if viewer_type == "gl": pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0) self.viewer.camera.pos = pos self.viewer.camera.yaw = 90.0

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if isinstance(self.viewer, ViewerGL):

pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0)

self.viewer.camera.pos = pos

self.viewer.camera.yaw = 90.0

if viewer_type == "gl":

pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0)

self.viewer.camera.pos = pos

self.viewer.camera.yaw = 90.0

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_bear.py around lines 222 to 226, the code checks isinstance(self.viewer, ViewerGL) but ViewerGL is only imported in the else-branch (line ~218), causing a NameError when viewer_type != "gl"; replace the isinstance check with a check on viewer_type (e.g., if viewer_type == "gl":) or otherwise ensure ViewerGL is imported unconditionally before this block so the conditional does not reference an undefined name.

coderabbitai · 2025-08-20T22:20:18Z

+    rps = max_rpm / fps
+    max_speed = rps * wp.TAU  # radians / sec
+    rps_square = rps**2
+
+    prop = Propeller()
+    prop.body = drone
+    prop.pos = pos
+    prop.dir = wp.vec3(0.0, 0.0, 1.0)
+    prop.thrust = thrust
+    prop.power = power
+    prop.diameter = diameter
+    prop.height = height
+    prop.max_rpm = max_rpm
+    prop.max_thrust = thrust * air_density * rps_square * diameter**4
+    prop.max_torque = power * air_density * rps_square * diameter**5 / wp.TAU
+    prop.turning_direction = turning_direction
+    prop.max_speed_square = max_speed**2
+


💡 Verification agent

🧩 Analysis chain

Propeller speed/units are inconsistent (rpm divided by fps); use rad/s derived from rpm

Current code computes “rps” as rpm/fps but comments and downstream formulas assume rad/s. This distorts thrust/torque scaling by ~fps^2.

Apply:

- rps = max_rpm / fps - max_speed = rps * wp.TAU # radians / sec - rps_square = rps**2 + rev_per_sec = max_rpm / 60.0 + omega = rev_per_sec * wp.TAU # radians / sec + omega2 = omega * omega @@ - prop.max_thrust = thrust * air_density * rps_square * diameter**4 - prop.max_torque = power * air_density * rps_square * diameter**5 / wp.TAU + prop.max_thrust = thrust * air_density * omega2 * diameter**4 + prop.max_torque = power * air_density * omega2 * diameter**5 / wp.TAU @@ - prop.max_speed_square = max_speed**2 + prop.max_speed_square = omega2

If the intention was “per-step” quantities, rename variables and consistently use rad/step (omega / fps) everywhere.

🏁 Script executed:

#!/bin/bash # Display context around fps and rps definitions to verify unit usage sed -n '250,350p' newton/examples/diffsim/example_diffsim_drone.py

Length of output: 3252

🏁 Script executed:

#!/bin/bash # Search for calls to define_propeller to see how fps is passed through grep -R "define_propeller" -n newton/examples/diffsim/

Length of output: 493

Convert propeller speed from rpm to radians/sec

The current implementation in newton/examples/diffsim/example_diffsim_drone.py uses

rps = max_rpm / fps # rev per frame, not rev per second max_speed = rps * wp.TAU # incorrectly labeled “radians / sec” rps_square = rps**2

but the standard thrust/torque formulas expect ω in rad/s. Using revs per frame underestimates both thrust and torque by a factor of fps².

Please update the block in define_propeller (around lines 295–312) as follows:

- rps = max_rpm / fps - max_speed = rps * wp.TAU # radians / sec - rps_square = rps**2 + # convert rpm → rev/s, then to angular speed ω (rad/s) + rev_per_sec = max_rpm / 60.0 + omega = rev_per_sec * wp.TAU # rad/s + omega2 = omega * omega @@ - prop.max_thrust = thrust * air_density * rps_square * diameter**4 - prop.max_torque = power * air_density * rps_square * diameter**5 / wp.TAU + prop.max_thrust = thrust * air_density * omega2 * diameter**4 + prop.max_torque = power * air_density * omega2 * diameter**5 / wp.TAU @@ - prop.max_speed_square = max_speed**2 + prop.max_speed_square = omega2

• File: newton/examples/diffsim/example_diffsim_drone.py
• Function: define_propeller (lines ≈295–312)

This ensures all downstream calculations use ω in rad/s, matching the documented units.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (6)

newton/examples/diffsim/example_diffsim_cloth.py (2)

131-134: Fix potential NameError when viewer_type != "gl"

ViewerGL is referenced in isinstance(self.viewer, ViewerGL) even when it hasn't been imported (USD/Rerun cases), causing a NameError.

Apply this diff to avoid the NameError:

-        if isinstance(self.viewer, ViewerGL):
+        if hasattr(self.viewer, "camera"):
             pos = type(self.viewer.camera.pos)(12.5, 0.0, 2.0)
             self.viewer.camera.pos = pos

168-187: Reset trajectory to initial conditions after optimization step

The optimization updates initial velocity states[0].particle_qd but does not reset the trajectory buffers to the same initial conditions before re-simulating. This accumulates state across iterations and invalidates gradients for "initial velocity" optimization. With CUDA graph capture, you must reset in-place to preserve captured pointers.

Apply this diff to reset the trajectory without breaking CUDA graph capture:

     def step(self):
         with wp.ScopedTimer("step"):
             if self.graph:
                 wp.capture_launch(self.graph)
             else:
                 self.forward_backward()

             # gradient descent step
             x = self.states[0].particle_qd
             if self.verbose:
                 print(f"Train iter: {self.train_iter} Loss: {self.loss}")
             wp.launch(step_kernel, dim=len(x), inputs=[x, x.grad, self.train_rate])

+            # Reset trajectory to initial conditions without reallocating buffers
+            # Preserve the updated initial velocities (already modified in-place above)
+            init = self.model.state(requires_grad=True)
+            wp.copy(init.particle_q, self.states[0].particle_q)
+            wp.copy(init.body_q, self.states[0].body_q)  
+            wp.copy(init.body_qd, self.states[0].body_qd)
+            # Clear downstream states to ensure clean forward pass
+            for i in range(1, self.sim_steps + 1):
+                self.states[i].clear_forces()
+                wp.copy(init.particle_q, self.states[i].particle_q)
+                wp.copy(init.body_q, self.states[i].body_q)
+                wp.copy(init.body_qd, self.states[i].body_qd)
+                # particle_qd will be overwritten by solver.step

             # clear grads for next iteration
             self.tape.zero()
             self.train_iter = self.train_iter + 1

newton/examples/diffsim/example_diffsim_bear.py (1)

303-305: CUDA graph compatibility: reset state in-place instead of reassigning

After capture, the graph holds pointers to the original state arrays. Reassigning self.states[0] creates a new object not used by the captured graph; replay will still write into the old buffers.

Apply this diff to reset in-place:
         # reset sim
         self.sim_time = 0.0
-        self.states[0] = self.model.state(requires_grad=True)
+        init = self.model.state(requires_grad=True)
+        wp.copy(init.particle_q, self.states[0].particle_q)
+        wp.copy(init.particle_qd, self.states[0].particle_qd)
+        wp.copy(init.body_q, self.states[0].body_q)
+        wp.copy(init.body_qd, self.states[0].body_qd)

newton/examples/diffsim/example_diffsim_soft_body.py (2)

35-41: Consider validating the assign_param kernel for isotropic mode

The assign_param kernel uses modulo indexing with params.shape[0], which works for anisotropic mode where params has 2*tet_count entries. However, for isotropic mode, params only has 2 entries total, so the modulo operation will incorrectly assign the same two parameters repeatedly.

For isotropic mode, create a separate kernel that properly broadcasts the same parameters to all tets:

+@wp.kernel
+def assign_param_isotropic(params: wp.array(dtype=wp.float32), tet_materials: wp.array2d(dtype=wp.float32)):
+    tid = wp.tid()
+    tet_materials[tid, 0] = params[0]
+    tet_materials[tid, 1] = params[1]

Then update the forward method to use the appropriate kernel:

     def forward(self):
-        wp.launch(
-            kernel=assign_param,
-            dim=self.model.tet_count,
-            inputs=(self.material_params,),
-            outputs=(self.model.tet_materials,),
-        )
+        if self.material_behavior == "isotropic":
+            wp.launch(
+                kernel=assign_param_isotropic,
+                dim=self.model.tet_count,
+                inputs=(self.material_params,),
+                outputs=(self.model.tet_materials,),
+            )
+        else:
+            wp.launch(
+                kernel=assign_param,
+                dim=self.model.tet_count,
+                inputs=(self.material_params,),
+                outputs=(self.model.tet_materials,),
+            )

308-313: Reset trajectory states between training iterations

Currently, subsequent iterations start from the last trajectory's terminal state instead of the initial conditions. For consistent optimization, each iteration should replay from the same initial conditions.

Apply this diff to reset states without breaking CUDA graph capture:

             # clear grads for next iteration
             self.tape.zero()
             self.loss.zero_()
             self.com.zero_()
             self.pos_error.zero_()
+            
+            # Reset trajectory to initial conditions without reallocating
+            init = self.model.state()
+            for i in range(self.sim_steps + 1):
+                wp.copy(init.particle_q, self.states[i].particle_q)
+                wp.copy(init.particle_qd, self.states[i].particle_qd)
+                wp.copy(init.body_q, self.states[i].body_q)
+                wp.copy(init.body_qd, self.states[i].body_qd)
+                self.states[i].clear_forces()

             self.train_iter = self.train_iter + 1

newton/examples/diffsim/example_diffsim_ball.py (1)

164-164: Fix incorrect gradient access when using CUDA graphs

When using CUDA graphs, the tape is created inside the ScopedCapture context, but you're accessing self.tape.gradients outside that context. After graph replay, the tape doesn't contain valid gradient references.

Apply this diff to properly access gradients:
             # gradient descent step
             x = self.states[0].particle_qd
             wp.launch(step_kernel, dim=len(x), inputs=[x, x.grad, self.train_rate])

-            x_grad = self.tape.gradients[self.states[0].particle_qd]
+            # Access the gradient directly from the array
+            x_grad = x.grad

             if self.verbose:
                 print(f"Train iter: {self.train_iter} Loss: {self.loss}")
                 print(f"    x: {x} g: {x_grad}")

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 64c346a921acd861152dd39010c1dd5925fced32 and 6d21e98.

📒 Files selected for processing (6)

newton/examples/diffsim/example_diffsim_ball.py (1 hunks)
newton/examples/diffsim/example_diffsim_bear.py (1 hunks)
newton/examples/diffsim/example_diffsim_cloth.py (1 hunks)
newton/examples/diffsim/example_diffsim_drone.py (1 hunks)
newton/examples/diffsim/example_diffsim_soft_body.py (1 hunks)
newton/examples/diffsim/example_diffsim_spring_cage.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

newton/examples/diffsim/example_diffsim_drone.py
newton/examples/diffsim/example_diffsim_spring_cage.py

🧰 Additional context used

🧠 Learnings (3)

📓 Common learnings

Learnt from: shi-eric
PR: newton-physics/newton#521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

📚 Learning: 2025-08-12T18:07:17.267Z

Learnt from: nvlukasz
PR: newton-physics/newton#519
File: newton/utils.py:30-33
Timestamp: 2025-08-12T18:07:17.267Z
Learning: In newton/_src/utils/import_usd.py, the PXR/USD modules are lazily loaded using try-except blocks within the functions (lines 109-111 and 1289-1291) rather than at module import time, making it safe to import parse_usd at the module level without causing import-time failures for users without USD dependencies.

Applied to files:

newton/examples/diffsim/example_diffsim_soft_body.py
newton/examples/diffsim/example_diffsim_ball.py

📚 Learning: 2025-08-12T18:07:17.267Z

Learnt from: nvlukasz
PR: newton-physics/newton#519
File: newton/utils.py:30-33
Timestamp: 2025-08-12T18:07:17.267Z
Learning: In newton/_src/utils/import_usd.py, the PXR/USD modules are lazily loaded using try-except blocks within the functions rather than at module import time, making it safe to import parse_usd at the module level without causing import-time failures for users without USD dependencies.

Applied to files:

newton/examples/diffsim/example_diffsim_soft_body.py
newton/examples/diffsim/example_diffsim_ball.py

🧬 Code Graph Analysis (4)

newton/examples/diffsim/example_diffsim_soft_body.py (4)

newton/_src/sim/builder.py (5)

ModelBuilder (67-4049)

add_soft_grid (3354-3486)

add_shape_box (2208-2245)

ShapeConfig (142-194)

add_ground_plane (2153-2173)

newton/_src/sim/model.py (1)

collide (510-564)

newton/_src/viewer/viewer_usd.py (1)

ViewerUSD (14-321)

newton/_src/viewer/viewer.py (1)

log_shapes (155-236)

newton/examples/diffsim/example_diffsim_ball.py (3)

newton/_src/sim/builder.py (5)

ModelBuilder (67-4049)

add_particle (2647-2681)

add_shape_box (2208-2245)

ShapeConfig (142-194)

add_ground_plane (2153-2173)

newton/_src/sim/model.py (1)

collide (510-564)

newton/_src/viewer/viewer.py (1)

log_shapes (155-236)

newton/examples/diffsim/example_diffsim_bear.py (2)

newton/examples/__init__.py (1)

get_asset_directory (27-28)

newton/_src/sim/builder.py (6)

ModelBuilder (67-4049)

add_soft_mesh (3488-3574)

add_ground_plane (2153-2173)

ShapeConfig (142-194)

particle_count (506-507)

tet_count (514-515)

newton/examples/diffsim/example_diffsim_cloth.py (5)

newton/_src/sim/builder.py (2)

add_cloth_grid (3081-3195)

particle_count (506-507)

newton/_src/viewer/viewer_usd.py (1)

ViewerUSD (14-321)

newton/_src/viewer/viewer_rerun.py (1)

ViewerRerun (30-247)

newton/_src/viewer/viewer.py (1)

log_shapes (155-236)

newton/_src/geometry/types.py (1)

GeoType (25-36)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-newton-tests / newton-unittests (windows-latest)

coderabbitai · 2025-08-20T22:28:49Z

+    def step(self):
+        with wp.ScopedTimer("step"):
+            if self.graph:
+                wp.capture_launch(self.graph)
+            else:
+                self.forward_backward()
+
+            # gradient descent step
+            x = self.states[0].particle_qd
+            wp.launch(step_kernel, dim=len(x), inputs=[x, x.grad, self.train_rate])
+
+            x_grad = self.tape.gradients[self.states[0].particle_qd]
+
+            if self.verbose:
+                print(f"Train iter: {self.train_iter} Loss: {self.loss}")
+                print(f"    x: {x} g: {x_grad}")
+
+            # clear grads for next iteration
+            self.tape.zero()
+
+            self.train_iter = self.train_iter + 1
+


🛠️ Refactor suggestion

Reset simulation state between training iterations for consistent optimization

Each training iteration should start from the same initial conditions to ensure proper gradient-based optimization. Currently, the simulation states accumulate across iterations.

Apply this diff to reset states after each step:

def step(self): with wp.ScopedTimer("step"): if self.graph: wp.capture_launch(self.graph) else: self.forward_backward() # gradient descent step x = self.states[0].particle_qd wp.launch(step_kernel, dim=len(x), inputs=[x, x.grad, self.train_rate]) x_grad = x.grad if self.verbose: print(f"Train iter: {self.train_iter} Loss: {self.loss}") print(f" x: {x} g: {x_grad}") + # Reset trajectory to initial conditions (keeping updated velocity) + init = self.model.state() + wp.copy(init.particle_q, self.states[0].particle_q) + for i in range(1, self.sim_steps + 1): + self.states[i].clear_forces() + wp.copy(init.particle_q, self.states[i].particle_q) + # particle_qd will be overwritten by solver # clear grads for next iteration self.tape.zero() self.train_iter = self.train_iter + 1

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_ball.py around lines 153-174, training iterations currently accumulate simulation state; to fix, capture a clean copy of the initial simulation state during setup (e.g., self.initial_states = deepcopy(self.states>) and after the gradient step (after self.tape.zero()) restore the runtime state from that copy so each iteration starts from identical initial conditions; ensure any derived tensors/arrays (like self.states[0].particle_qd and related buffers) are reinitialized or rebound to the restored state so gradients and launches use the reset state.

coderabbitai · 2025-08-20T22:28:49Z

+        if isinstance(self.viewer, ViewerGL):
+            pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0)
+            self.viewer.camera.pos = pos
+            self.viewer.camera.yaw = 90.0
+


⚠️ Potential issue

Fix potential NameError when viewer_type != "gl"

ViewerGL is referenced in isinstance(self.viewer, ViewerGL) even when it hasn't been imported (USD/Rerun cases), causing a NameError.

Apply this diff to avoid the NameError:

- if isinstance(self.viewer, ViewerGL): + if hasattr(self.viewer, "camera"): pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0) self.viewer.camera.pos = pos self.viewer.camera.yaw = 90.0

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if isinstance(self.viewer, ViewerGL):

pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0)

self.viewer.camera.pos = pos

self.viewer.camera.yaw = 90.0

if hasattr(self.viewer, "camera"):

pos = type(self.viewer.camera.pos)(0.0, -25.0, 2.0)

self.viewer.camera.pos = pos

self.viewer.camera.yaw = 90.0

🤖 Prompt for AI Agents

In newton/examples/diffsim/example_diffsim_bear.py around lines 222-226, replace the direct isinstance check that references ViewerGL (which can be undefined) with a safe runtime check that doesn't require the class to be imported; for example, use type(self.viewer).__name__ == "ViewerGL" (or getattr(type(self.viewer), "__name__", "") == "ViewerGL") before accessing .camera and setting pos/yaw so you avoid a NameError in non-GL viewers.

mzamoramora-nvidia · 2025-08-22T09:27:19Z

This PR is replaced by #616

…ysics#560) # Description In conda environments, import torch before importing AppLauncher triggers `version `GLIBCXX_3.4.30' not found` error. This change moves torch import to be after AppLauncher, which resolves the issue. ## Type of change  - Bug fix (non-breaking change which fixes an issue) ## Checklist - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [x] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [ ] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

mzamoramora-nvidia and others added 3 commits August 15, 2025 14:24

Add copy of warp/warp/examples/optim/example_bounce.py into example_d…

b2fae80

…iffsim_ball.py Signed-off-by: Miguel Angel Zamora Mora <mzamoramora@nvidia.com>

Update example_diffsim_ball.py to use Newton API

5238826

Signed-off-by: Miguel Angel Zamora Mora <mzamoramora@nvidia.com>

Remove comments

ce5dce1

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

mzamoramora-nvidia had a problem deploying to external-pr-approval August 15, 2025 13:36 — with GitHub Actions Error

coderabbitai Bot reviewed Aug 15, 2025

View reviewed changes

mzamoramora-nvidia temporarily deployed to external-pr-approval August 15, 2025 16:06 — with GitHub Actions Inactive

mzamoramora-nvidia had a problem deploying to external-pr-approval August 15, 2025 16:06 — with GitHub Actions Error

coderabbitai Bot reviewed Aug 15, 2025

View reviewed changes

mzamoramora-nvidia had a problem deploying to external-pr-approval August 18, 2025 07:59 — with GitHub Actions Error

coderabbitai Bot reviewed Aug 18, 2025

View reviewed changes

mzamoramora-nvidia had a problem deploying to external-pr-approval August 19, 2025 07:25 — with GitHub Actions Error

coderabbitai Bot reviewed Aug 19, 2025

View reviewed changes

Comment thread newton/examples/diffsim/example_diffsim_bear.py Outdated

Comment thread newton/examples/diffsim/example_diffsim_soft_body.py Outdated

mzamoramora-nvidia mentioned this pull request Aug 19, 2025

Add flag to enable/disable the evaluation of triangle contact forces for the SemiImplicit solver. #576

Closed

mzamoramora-nvidia marked this pull request as draft August 19, 2025 11:27

mzamoramora-nvidia had a problem deploying to external-pr-approval August 20, 2025 15:44 — with GitHub Actions Error

coderabbitai Bot reviewed Aug 20, 2025

View reviewed changes

mzamoramora-nvidia had a problem deploying to external-pr-approval August 20, 2025 16:01 — with GitHub Actions Error

Miguel Zamora added 7 commits August 20, 2025 23:28

Run: pre-commit run -a

45dd297

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Add a copy of warp/warp/examples/optim/example_softbody_properties.py…

f48c8d7

… into example_example_diffsim_soft_body.py Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Update example_diffsim_soft_body.py to use Newton API

0e1b592

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Add copy of warp/warp/examples/optim/example_cloth_throw.py into exam…

295c0b9

…ple_diffsim_cloth.py Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Update example_diffsim_cloth.py to use Newton API

0187038

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Add copy of warp/warp/examples/optim/example_drone.py into example_di…

88e6b85

…ffsim_drone.py and add its corresponding asset Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Update example_diffsim_drone.py to use Newton API

5853ebe

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Miguel Zamora and others added 11 commits August 20, 2025 23:28

Refactor example_diffsim_bear to follow the examples template closer

f1040b3

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Fix quaternion inversion in USD importer (newton-physics#559)

f04c6db

Signed-off-by: Alain Denzler <adenzler@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Parse joint drive force limits in parse_usd (newton-physics#512)

55815a7

Signed-off-by: Antoine Richard <antoiner@nvidia.com> Signed-off-by: camevor <camevor@nvidia.com> Co-authored-by: camevor <camevor@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Fixes for incorrect inertia values in MuJoCoSolver (newton-physics#552)

078586e

Signed-off-by: Alain Denzler <adenzler@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Rename SolverMuJoCo use_mujoco to use_mujoco_cpu (newton-physics#579)

8bed934

Signed-off-by: Philipp Reist <preist@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Remove axis parameter from ModelBuilder.add_shape_*() functions newto…

c810641

…n-physics#493 (newton-physics#584) Signed-off-by: Viktor Reutskyy <vreutskyy@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Add usd examples for h1 and anymal (newton-physics#575)

1bbf986

Signed-off-by: Julia von Muralt <jvonmuralt@nvidia.com> Co-authored-by: Eric Heiden <eheiden@nvidia.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Fix for contact rendering in example_quadruped.py (newton-physics#578)

010cde1

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

Improve robustness of collision respond. (newton-physics#502)

e531d61

Signed-off-by: Wenchao Huang <huangwenchao@linctex.com> Co-authored-by: Eric Shi <97630937+shi-eric@users.noreply.github.com> Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

mzamoramora-nvidia force-pushed the feature/migration-diffsim-examples branch from 230be75 to e531d61 Compare August 20, 2025 21:30

mzamoramora-nvidia had a problem deploying to external-pr-approval August 20, 2025 21:30 — with GitHub Actions Error

Merge branch 'main' into feature/migration-diffsim-examples

531c891

Signed-off-by: mzamoramora-nvidia <mzamoramora@nvidia.com>

mzamoramora-nvidia had a problem deploying to external-pr-approval August 20, 2025 21:40 — with GitHub Actions Error

Run pre-commit

caa7c3e

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

mzamoramora-nvidia had a problem deploying to external-pr-approval August 20, 2025 21:45 — with GitHub Actions Error

Undo changes to cartpole.usda

6d21e98

Signed-off-by: Miguel Zamora <mazamoramora@nvidia.com>

mzamoramora-nvidia had a problem deploying to external-pr-approval August 20, 2025 21:47 — with GitHub Actions Failure

coderabbitai Bot reviewed Aug 20, 2025

View reviewed changes

mzamoramora-nvidia mentioned this pull request Aug 22, 2025

Migrate diffsim examples #616

Merged

5 tasks

mzamoramora-nvidia closed this Aug 22, 2025

mzamoramora-nvidia deleted the feature/migration-diffsim-examples branch September 4, 2025 09:24

-                        m_mesh = Mesh(
-                            m_vertices,
-                            m_faces,
-                            m.vertex_normals,
-                            color=np.array(m.visual.main_color) / 255.0,
-                            maxhullvert=maxhullvert,
-                        )
+                        # Use per-geometry attributes; guard optional color
+                        color = None
+                        if getattr(m_geom, "visual", None) is not None and getattr(m_geom.visual, "main_color", None) is not None:
+                            color = np.array(m_geom.visual.main_color, dtype=np.float32) / 255.0
+                        m_mesh = Mesh(
+                            m_vertices,
+                            m_faces,
+                            m_geom.vertex_normals if hasattr(m_geom, "vertex_normals") else None,
+                            color=color,
+                            maxhullvert=maxhullvert,
+                        )

-        direction = front + u * alpha * aspect_ratio * right + v * alpha * up
-        direction = direction / float(np.linalg.norm(direction))
+        direction = front + u * alpha * aspect_ratio * right + v * alpha * up
+        direction = direction.normalize()


		self.control = self.model.control()

		self.contacts = self.model.collide(self.states[0], soft_contact_margin=40.0)

Conversation

mzamoramora-nvidia commented Aug 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Newton Migration Guide

Before your PR is "Ready for review"

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Aug 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

Status, Documentation and Community

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Aug 15, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Aug 15, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Aug 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

mzamoramora-nvidia commented Aug 15, 2025 •

edited

Loading

coderabbitai Bot commented Aug 15, 2025 •

edited

Loading