Add inverse kinematics module with interactive and benchmark examples by dylanturpin · Pull Request #394 · newton-physics/newton

dylanturpin · 2025-07-13T21:14:01Z

Description

Adds a modular inverse kinematics (IK) system to Newton, inspired by PyRoki.

Core features:

Levenberg-Marquardt solver leveraging Warp's tile API for cholesky linear solve
(with adaptive damping, step accept/reject logic)
Analytic Jacobians for speed (option to use autodiff or mix instead)
Modular objectives: position, rotation, and joint limit objectives included

API:

import newton.core.ik as ik

ik_solver = ik.IKSolver(
    model=model,
    joint_q=joint_q, # (n_problems, n_coords) 
    objectives=[
        ik.PositionObjective(link_index=16, target_positions=targets, ...),
        ik.RotationObjective(link_index=16, target_rotations=rotations, ...),
        ik.JointLimitObjective(weight=0.1, ...)
    ],
    jacobian_mode=JacobianMode.ANALYTIC # or MIXED/AUTODIFF
)
ik_solver.solve(iterations=10)

Interactive example:

newton/examples/example_ik_interactive.py - drag gizmos to control H1 robot end-effectors
Supports "tied controls" mode where dragging one robot controls all environments
GUI utilities in newton/examples/gizmo_utils.py

Benchmark example:

newton/examples/example_ik_benchmark.py --robot [franka/h1]
Runs many seeds and selects best
Generates benchmark tables like below

Benchmarks

Accuracy stats averaged over 2M samples.

Franka (Simple Arm): 3x Faster, 99.994% Success Rate

Results (98th percentile errors for successful solves)

Solver	Pos Err (mm)	Ori Err (rad)	Success (%)	time (ms) for batch size
Solver	Pos Err (mm)	Ori Err (rad)	Success (%)	b = 1	b = 10	b = 100	b = 1 k	b = 2 k
cuRobo	0.004482	6.96e-6	99.931	7.952	7.761	11.811	53.965	85.420
PyroKI	0.000624	6.21e-7	99.865	4.634	5.275	7.337	34.851	78.240
Newton	0.000217	2.60e-7	99.994	2.404	0.977	2.242	13.588	26.193

H1 (Humanoid): 10x Faster, 99.996% Success Rate

Note: Success requires ALL 4 end effectors to reach their targets
Results (98th percentile errors for successful solves)

Solver	Pos Err (mm)	Ori Err (rad)	Success (%)	time (ms) for batch size
Solver	Pos Err (mm)	Ori Err (rad)	Success (%)	b = 1	b = 10	b = 100	b = 1 k
PyroKI	0.014312	7.26e-6	98.583	11.599	20.615	108.481	1525.460
Newton	0.004949	4.43e-7	99.996	1.809	3.306	23.344	151.591

Limitations

Analytic jacobians support only FIXED, REVOLUTE and PRISMATIC joints (more to be added soon)

Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this MR.

The migration guide in docs/migration.rst is up-to date

Before your PR is "Ready for review"

All commits are signed-off to indicate that your contribution adheres to the Developer Certificate of Origin requirements
I understand that GitHub does not perform any GPU testing of this pull request
Necessary tests have been added
Documentation is up-to-date
Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

New Features
- Introduced a high-performance, GPU-accelerated inverse kinematics (IK) solver supporting multiple Jacobian modes and batch problem solving.
- Added example scripts for benchmarking IK performance and interactive IK manipulation with 3D visualization and multi-environment batching.
- Provided configurable robot models, multi-seed initialization, and detailed IK error evaluation and reporting.
- Enabled user-friendly 3D manipulation of robot end-effectors with optional tied or independent target control across environments.
Tests
- Added comprehensive unit tests verifying IK solver convergence, Jacobian correctness, and forward kinematics parity across joint types and devices.

coderabbitai · 2025-07-13T21:14:08Z

📝 Walkthrough

Walkthrough

New modules and example scripts are introduced to provide a GPU-accelerated inverse kinematics (IK) solver framework with batched solving, analytic and autodiff Jacobian options, and comprehensive testing. Example scripts demonstrate benchmarking and interactive IK manipulation. Extensive unit tests validate solver convergence, Jacobian correctness, and forward kinematics parity across joint types and devices.

Changes

File(s)	Change Summary
newton/examples/example_ik_benchmark.py	New script for benchmarking batched, multi-seed, GPU-accelerated IK solving with reporting for multiple robots.
newton/examples/example_ik_interactive.py	New interactive IK playground with 3D gizmo manipulation and multi-environment batching for H1 robots.
newton/sim/ik.py	New modular, extensible, batched IK solver with analytic/autodiff Jacobians, LM optimization, and core objectives.
newton/tests/test_ik.py	New unit tests for IK solver: convergence, Jacobian correctness, and multi-device coverage.
newton/tests/test_ik_fk_kernels.py	New tests for FK parity between two-pass and reference FK methods across joint types and devices.

Sequence Diagram(s)

IK Benchmarking and Solving (example_ik_benchmark.py)

sequenceDiagram
    participant User
    participant Example
    participant IKSolver
    participant WarpGPU

    User->>Example: Select robot, batch sizes, repeats
    Example->>Example: Initialize robot model and IK system
    loop For each batch size and repeat
        Example->>Example: Generate target poses and IK seeds
        Example->>IKSolver: Solve batched IK problems (multi-seed)
        IKSolver->>WarpGPU: Run LM optimization (analytic/autodiff Jacobian)
        WarpGPU-->>IKSolver: Return optimized joint configs
        IKSolver-->>Example: Return best solutions and errors
        Example->>Example: Evaluate and record results
    end
    Example->>User: Print formatted benchmark results

Interactive IK Playground (example_ik_interactive.py)

sequenceDiagram
    participant User
    participant Example
    participant Renderer
    participant Gizmo
    participant IKSolver

    User->>Renderer: Manipulate 3D gizmo (mouse/keyboard)
    Renderer->>Gizmo: Detect drag event
    Gizmo->>Example: Notify position/rotation drag
    Example->>Example: Update IK targets
    Example->>IKSolver: Solve IK for updated targets
    IKSolver->>Example: Return new joint configs
    Example->>Renderer: Render updated robot pose
    loop Until window closed
        Renderer->>User: Display frame
    end

IK Solver Internal Flow (newton/sim/ik.py)

sequenceDiagram
    participant IKSolver
    participant Objectives
    participant WarpGPU

    IKSolver->>Objectives: Compute residuals and Jacobians
    Objectives-->>IKSolver: Return residuals/Jacobians
    IKSolver->>WarpGPU: Run LM optimization step
    WarpGPU-->>IKSolver: Update joint coordinates
    IKSolver->>Objectives: Check convergence
    Objectives-->>IKSolver: Return status
    alt Not converged
        IKSolver->>Objectives: Repeat residual/Jacobian computation
    end
    IKSolver->>Caller: Return optimized joint configs

Test Execution (newton/tests/test_ik.py & test_ik_fk_kernels.py)

sequenceDiagram
    participant TestRunner
    participant IKTest
    participant IKSolver
    participant Model

    TestRunner->>IKTest: Start test (for each device/joint type)
    IKTest->>Model: Build test model
    IKTest->>IKSolver: Configure objectives and solve
    IKSolver->>Model: Run FK/IQ as needed
    Model-->>IKSolver: Return transforms/solutions
    IKSolver-->>IKTest: Return results
    IKTest->>TestRunner: Assert correctness and report

✨ Finishing Touches

📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (10)

newton/sim/ik.py (2)
29-31: Address the TODO comment about enable_backward

This TODO suggests a temporary workaround. Please clarify why enable_backward needs to be enabled globally and consider a more localized solution if possible.

Would you like me to investigate alternative approaches to avoid setting this globally, or open an issue to track this technical debt?

454-454: Remove unused noqa directive

The noqa: PLC0415 directive appears to be unnecessary in the current linting configuration.
-        from newton.solvers.featherstone.kernels import (  # noqa: PLC0415
+        from newton.solvers.featherstone.kernels import (
newton/examples/example_ik_benchmark.py (4)
76-77: Consider renaming df to derivative for clarity.

The variable name df could be misleading as it commonly refers to DataFrames in data science contexts.
-        df = (dim + 1) * x**dim - 1.0
-        x_next = x - f / df
+        derivative = (dim + 1) * x**dim - 1.0
+        x_next = x - f / derivative
89-89: Use float('inf') instead of magic number for clarity.

Using float('inf') is more idiomatic and self-documenting than a large magic number.
-    best_cost = float(1.0e30)
+    best_cost = float('inf')
246-257: Extract quaternion multiplication to a utility function.

The inline quaternion multiplication implementation could be moved to a reusable utility function for better modularity.

Consider extracting this to a utility function in the newton.utils module or at the module level:
def quaternion_multiply(a, b):
    """Multiply two quaternions represented as [w, x, y, z] arrays."""
    w1, x1, y1, z1 = np.moveaxis(a, -1, 0)
    w2, x2, y2, z2 = np.moveaxis(b, -1, 0)
    return np.stack(
        (
            w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        ),
        axis=-1,
    )
261-261: Add explanatory comment for rotation error calculation.

The rotation error calculation using arctan2 is mathematically correct but could benefit from an explanation.
+        # Calculate rotation error as the angle of the relative rotation quaternion
+        # Using 2*atan2(||v||, |w|) formula for quaternion angle extraction
         rot_err = (2 * np.arctan2(np.linalg.norm(rel[..., 1:], axis=-1), np.abs(rel[..., 0]))).max(axis=-1)
newton/examples/gizmo_utils.py (4)
268-269: Maintain consistency in sentinel values.

The code uses 1e9 as a threshold here but initializes with 1e10 elsewhere. These values should be consistent.
-    if min_capsule_dist < 1e9:
+    if min_capsule_dist < 1e10:
373-379: Move function definition outside the loop for better performance.

Defining a function inside a loop creates a new function object on each iteration, which can impact performance.

Move the arc_point function outside the loop and pass the required parameters:
def _arc_point(self, t, from_world, to_world, visual_pos, radius):
    sin_t1 = math.sin((1 - t) * (math.pi / 2))
    sin_t2 = math.sin(t * (math.pi / 2))
    direction = sin_t1 * from_world + sin_t2 * to_world
    return visual_pos + direction * radius
643-644: Consider making array size limits configurable.

The hardcoded limits could restrict usage in larger scenes. Consider making these configurable.
-        self.MAX_GIZMOS = 800
-        self.MAX_CAPSULES = self.MAX_GIZMOS * 27
+        self.MAX_GIZMOS = max_gizmos if hasattr(self, 'max_gizmos') else 800
+        self.MAX_CAPSULES = self.MAX_GIZMOS * 27
Or add parameters to the constructor:
def __init__(self, renderer, scale_factor=1.0, rotation_sensitivity=0.01, max_gizmos=800):
333-333: Add comment explaining the underscore convention.

The fill_ method uses an underscore suffix, which is a PyTorch/Warp convention for in-place operations but may be unfamiliar to some Python developers.
+                    # fill_ is an in-place operation (PyTorch/Warp convention)
                     off_d.fill_(sb_start)

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f26eb37 and 37128f18333d3e407a7d14a08e036b895b09a1b7.

📒 Files selected for processing (7)

newton/examples/example_ik_benchmark.py (1 hunks)
newton/examples/example_ik_interactive.py (1 hunks)
newton/examples/gizmo_utils.py (1 hunks)
newton/sim/__init__.py (3 hunks)
newton/sim/ik.py (1 hunks)
newton/tests/test_ik.py (1 hunks)
newton/tests/test_ik_fk_kernels.py (1 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (1)

newton/sim/__init__.py (1)

newton/sim/ik.py (6)

IK (39-606)

IKObjective (857-874)

JacobianMode (33-36)

JointLimitObjective (1222-1317)

PositionObjective (993-1148)

RotationObjective (1453-1615)

🪛 Ruff (0.11.9)

newton/sim/ik.py

454-454: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: run-newton-tests / newton-unittests (windows-latest)
GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-asv-benchmarks

🔇 Additional comments (5)

newton/sim/ik.py (1)

39-606: Excellent IK solver implementation!

The modular design with dynamic specialization, CUDA stream parallelization, and support for multiple Jacobian backends is well-architected. The comprehensive documentation and error handling make this production-ready.

newton/sim/__init__.py (1)

30-37: Clean API integration!

The IK module entities are properly imported and exposed in the package API with correct alphabetical ordering maintained.

Also applies to: 57-95

newton/tests/test_ik_fk_kernels.py (1)

1-246: Comprehensive FK parity testing!

Well-structured tests that validate FK consistency across joint types and computation methods. The helper functions for model building and joint randomization are clean and reusable.

newton/tests/test_ik.py (1)

1-267: Thorough IK solver testing!

Excellent test coverage for convergence and Jacobian correctness across all modes. The two-link planar robot provides a simple but effective test case.

newton/examples/example_ik_interactive.py (1)

60-530: Well-crafted interactive IK example!

The implementation demonstrates best practices with clear separation of concerns, efficient CUDA graph usage, and intuitive gizmo-based interaction. The tied/untied target modes provide flexible control options.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

newton/examples/gizmo_utils.py (1)

197-200: Good variable naming improvement!

The variable names axis_length_sq, axis_ray_dot, and axis_origin_dot are clear and descriptive, properly addressing the previous review feedback about cryptic names.

🧹 Nitpick comments (5)

newton/examples/gizmo_utils.py (1)

584-584: Clarify the double scaling factor for rotation sensitivity.

The rotation delta is computed as -tangent_movement * rotation_sensitivity * 0.01, which applies two separate scaling factors. Consider either documenting why both factors are necessary or consolidating them into a single, well-documented parameter.
newton/sim/ik.py (2)
455-455: Remove unused noqa directive.

The noqa: PLC0415 directive appears to be unnecessary as indicated by static analysis. If the import-outside-toplevel rule is not enabled in your linter configuration, this directive can be safely removed.
-        from newton.solvers.featherstone.kernels import (  # noqa: PLC0415
+        from newton.solvers.featherstone.kernels import (
29-30: Address the TODO about backward mode configuration.

The TODO comment indicates uncertainty about why wp.config.enable_backward = True is necessary. This should be investigated and either:

Document why backward mode is required (e.g., for autodiff Jacobian computation)

Remove it if not needed, as it may impact performance

Enable it only when needed (e.g., when jacobian_mode includes AUTODIFF)

Would you like me to help investigate whether backward mode is required for all use cases or only for specific Jacobian modes?
newton/examples/example_ik_benchmark.py (2)
141-146: Consider using pathlib for more robust path handling.

The current check for path separators using "/" in self.cfg.asset or "\\" in self.cfg.asset is platform-specific. Consider using pathlib for more robust cross-platform path handling.
-        asset_path = (
-            Path(self.cfg.asset) / self.cfg.file
-            if ("/" in self.cfg.asset or "\\" in self.cfg.asset)
-            else newton.utils.download_asset(self.cfg.asset) / self.cfg.file
-        )
+        asset_path_obj = Path(self.cfg.asset)
+        if asset_path_obj.parts and len(asset_path_obj.parts) > 1:
+            # It's a path
+            asset_path = asset_path_obj / self.cfg.file
+        else:
+            # It's an asset name
+            asset_path = newton.utils.download_asset(self.cfg.asset) / self.cfg.file
367-407: Consider using a table formatting library for cleaner code.

The manual ASCII table construction works but could be simplified significantly using a library like tabulate or rich. This would improve maintainability and reduce code complexity.

Example with tabulate:
from tabulate import tabulate

def print_results(self):
    headers = ["", "robot", "batch", "newton-time (ms)", "newton-success (%)", 
               "newton-pos-err (mm)", "newton-ori-err (rad)"]
    
    table_data = [
        [i, r, b, f"{t:.6g}", f"{s:.3f}", 
         "nan" if np.isnan(pe) else f"{pe:.6g}",
         "nan" if np.isnan(oe) else f"{oe:.6g}"]
        for i, (r, b, t, s, pe, oe) in enumerate(self.results)
    ]
    
    print("\nReported errors are 98-percentile of successful solves\n")
    print(tabulate(table_data, headers=headers, tablefmt="grid"))

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37128f18333d3e407a7d14a08e036b895b09a1b7 and 2b794b399247975708d6a5528b84699b25e657f3.

📒 Files selected for processing (7)

newton/examples/example_ik_benchmark.py (1 hunks)
newton/examples/example_ik_interactive.py (1 hunks)
newton/examples/gizmo_utils.py (1 hunks)
newton/sim/__init__.py (3 hunks)
newton/sim/ik.py (1 hunks)
newton/tests/test_ik.py (1 hunks)
newton/tests/test_ik_fk_kernels.py (1 hunks)

✅ Files skipped from review due to trivial changes (1)

newton/examples/example_ik_interactive.py

🚧 Files skipped from review as they are similar to previous changes (3)

newton/sim/init.py
newton/tests/test_ik.py
newton/tests/test_ik_fk_kernels.py

🧰 Additional context used

🪛 Ruff (0.11.9)

newton/sim/ik.py

454-454: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-newton-tests / newton-unittests (windows-latest)
GitHub Check: run-asv-benchmarks

🔇 Additional comments (1)

newton/sim/ik.py (1)

77-92: Excellent design pattern for performance optimization!

The specialization mechanism that creates optimized IK subclasses based on problem dimensions is a clever approach. The caching prevents redundant specializations and the tile-based solver can leverage compile-time constants for better GPU performance.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

newton/sim/ik.py (2)
29-30: Address the TODO comment and configuration setting.

The TODO comment indicates uncertainty about why wp.config.enable_backward = True is necessary. This suggests a potential workaround that should be investigated and documented properly.

Do you want me to help investigate why this configuration is needed and provide proper documentation, or create an issue to track this technical debt?

454-454: Remove unused noqa directive.

The static analysis tool correctly identifies that the noqa directive for PLC0415 is unused and should be removed.
-        from newton.solvers.featherstone.kernels import (  # noqa: PLC0415
+        from newton.solvers.featherstone.kernels import (

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b794b399247975708d6a5528b84699b25e657f3 and 26a519a7bd18a8d238a0a012e7a95cdf7bd628b1.

📒 Files selected for processing (7)

newton/examples/example_ik_benchmark.py (1 hunks)
newton/examples/example_ik_interactive.py (1 hunks)
newton/examples/gizmo_utils.py (1 hunks)
newton/sim/__init__.py (3 hunks)
newton/sim/ik.py (1 hunks)
newton/tests/test_ik.py (1 hunks)
newton/tests/test_ik_fk_kernels.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

newton/sim/init.py
newton/tests/test_ik_fk_kernels.py
newton/tests/test_ik.py
newton/examples/example_ik_interactive.py
newton/examples/example_ik_benchmark.py

🧰 Additional context used

🧠 Learnings (1)

newton/examples/gizmo_utils.py (1)

Learnt from: dylanturpin
PR: newton-physics/newton#394
File: newton/examples/gizmo_utils.py:889-889
Timestamp: 2025-07-13T21:36:00.273Z
Learning: In Newton's gizmo system (newton/examples/gizmo_utils.py), the return value `return self.drag_state.mode == "rotate"` in the mouse drag handler is intentional design. It returns False for translation mode to allow the underlying pan/view callback to fire (so the view follows gizmo movement), and True for rotation mode to prevent the pan callback from firing (avoiding awkward view changes during rotation).

🧬 Code Graph Analysis (1)

newton/sim/ik.py (8)

newton/sim/state.py (3)

joint_coord_count (98-102)

joint_dof_count (105-109)

requires_grad (75-81)

newton/tests/test_ik_fk_kernels.py (5)

residual_dim (46-47)

supports_analytic (58-59)

compute_residuals (49-52)

compute_jacobian_autodiff (54-55)

compute_jacobian_analytic (61-62)

newton/utils/selection.py (1)

get (120-121)

newton/sim/builder.py (2)

joint_count (441-442)

articulation_count (469-470)

newton/examples/example_ik_interactive.py (1)

main (501-525)

newton/examples/example_ik_benchmark.py (1)

main (409-422)

newton/solvers/featherstone/kernels.py (2)

jcalc_motion (228-327)

jcalc_transform (128-223)

newton/tests/unittest_utils.py (1)

end (171-193)

🪛 Ruff (0.11.9)

newton/sim/ik.py

454-454: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: run-unit-tests-on-aws
GitHub Check: run-newton-tests / newton-unittests (windows-latest)
GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
GitHub Check: run-asv-benchmarks

🔇 Additional comments (6)

newton/sim/ik.py (2)

122-130: Verify analytic Jacobian joint type validation.

The joint type validation for analytic/mixed modes looks correct, but ensure this validation covers all edge cases and provides clear guidance to users.

The validation logic correctly restricts analytic Jacobians to supported joint types and provides a clear error message with workaround guidance.

81-91: Approve the dynamic class specialization pattern.

The __new__ method implementing dynamic class specialization based on problem dimensions is an elegant solution for compile-time optimization of tiled linear algebra operations.

This pattern allows the solver to generate specialized kernels at runtime based on the specific problem dimensions, which should provide significant performance benefits for the tiled Cholesky operations.

newton/examples/gizmo_utils.py (4)

53-88: Approve the quaternion rotation computation.

The get_rotation_quaternion function correctly handles edge cases including nearly parallel vectors, anti-parallel vectors, and zero-length vectors. The implementation follows standard quaternion mathematics.

The function properly normalizes input vectors, handles the anti-parallel case by finding an orthogonal axis, and correctly computes the quaternion from axis-angle representation.

166-243: Approve the ray-capsule intersection algorithm.

The test_all_capsule_hits kernel implements a mathematically sound ray-capsule intersection test, correctly handling both the cylindrical body and spherical caps of the capsule.

The implementation properly transforms rays to local space, computes discriminants for intersection testing, and handles edge cases for cap intersection when the ray misses the cylindrical portion.

549-607: Approve the rotation drag implementation.

The rotation update logic correctly projects the gizmo to screen space, computes tangential mouse movement, and applies view-dependent rotation corrections. The accumulated angle approach prevents rotation snapping.

The implementation handles screen-space projection, tangent computation, and view direction consideration to provide intuitive rotation behavior regardless of camera angle.

894-895: Approve the intentional return value behavior.

The conditional return self.drag_state.mode == "rotate" is intentional design to control event propagation - allowing view following during translation but preventing it during rotation.

Based on the retrieved learnings, this return value correctly implements the desired UX where translation allows view following while rotation blocks it to prevent awkward view changes.

shi-eric · 2025-07-14T04:27:42Z

A little surprised that the time to run unit tests increased by 33 20-30% from this pull request. Is this expected?

dylanturpin · 2025-07-14T16:03:44Z

@shi-eric
Hmm, good catch on test speed. I would not expect that big an increase.

For (CPU only) CI, I see:

newton-unittests (ubuntu-latest) 13m 7s
newton-unittests (windows-latest) 13m 37s

Comparing to a run on a different PR... I see...

newton-unittests (ubuntu-latest) 14m 33s
newton-unittests (windows-latest) 13m 47s

Locally I see...

CPU only, I see Ran 153 tests in 60.002s --> Ran 160 tests in 66.747s
CPU + GPU, I see Ran 238 tests in 439.554s --> Ran 252 tests in 469.712s

So the worst of those is ... local CPU+GPU which is up... ~7% (which could still be seen as a lot for a single feature).

Let me know

Where do you see the 20-30% number?
What's a good target? I could definitely scale back in a few different places.

shi-eric · 2025-07-14T16:41:40Z

This concerns the GPU tests: https://github.com/newton-physics/newton/actions/runs/16253729343/job/45886926861 compared with the latest main https://github.com/newton-physics/newton/actions/runs/16271215514/job/45939069497

2139.204s vs 1715.865s = 24.7% increase, 423 seconds

I'll re-run the AWS EC2 job in your pull request to check if it was a temporary fluke. The main difference is because when we run on CI/CD systems, there is an outstanding bug that prevents us from running with parallelization, so we have to serialize the test execution (in a single process, too!), which results in some differences.

It's hard to say what a good target should be. Looking at the rspec.xml file that was generated (todo action for me is to make these available as artifacts on GitHub) the timings are:

<testcase classname="TestIKModes" name="test_convergence_analytic_cpu" time="5.785"/>
<testcase classname="TestIKModes" name="test_convergence_analytic_cuda_0" time="43.399"/>
<testcase classname="TestIKModes" name="test_convergence_autodiff_cpu" time="0.071"/>
<testcase classname="TestIKModes" name="test_convergence_autodiff_cuda_0" time="0.093"/>
<testcase classname="TestIKModes" name="test_convergence_mixed_cpu" time="0.034"/>
<testcase classname="TestIKModes" name="test_convergence_mixed_cuda_0" time="0.043"/>
<testcase classname="TestIKModes" name="test_joint_limit_jacobian_compare_cpu" time="5.859"/>
<testcase classname="TestIKModes" name="test_joint_limit_jacobian_compare_cuda_0" time="21.665"/>
<testcase classname="TestIKModes" name="test_position_jacobian_compare_cpu" time="0.013"/>
<testcase classname="TestIKModes" name="test_position_jacobian_compare_cuda_0" time="0.015"/>
<testcase classname="TestIKModes" name="test_rotation_jacobian_compare_cpu" time="0.013"/>
<testcase classname="TestIKModes" name="test_rotation_jacobian_compare_cuda_0" time="0.015"/>
<testcase classname="TestIKFKKernels" name="test_fk_two_pass_parity_cpu" time="25.419"/>
<testcase classname="TestIKFKKernels" name="test_fk_two_pass_parity_cuda_0" time="56.250"/>

This only accounts for ~160 seconds, which doesn't seem so bad. But it seemed to have shifted test times around...

test_box_box_contact went from 23 seconds on main to 360 seconds on the PR despite your PR having nothing to do with that, while test_shapes_on_plane_mujoco_warp_cuda_0 went from 496 seconds on main to 328 seconds here. This isn't an apples-to-apples comparison since main is using newer versions of MJWarp/Warp. So it is probably not your pull request changes per se, but due to the single-process execution of the tests... I can also confirm by checking how long it takes to run TestIKModes or TestIKFKKernels as standlone tests takes.

shi-eric · 2025-07-14T16:52:25Z

Running the tests individually on my system, test_ik_fk_kernels.py took 137 secs and test_ik.py took 128.4 secs (using the times reported by unittest). The module load times and recompilations are a bit diabolical:

Module newton.sim.ik 54ecea2 load on device 'cpu' took 5404.95 ms  (compiled)
Module newton.sim.ik 44a6e7e load on device 'cpu' took 5191.23 ms  (compiled)
Module newton.sim.ik d7f2af7 load on device 'cpu' took 5328.65 ms  (compiled)
Module newton.sim.ik aef92ac load on device 'cpu' took 5585.66 ms  (compiled)

Module newton.sim.ik 183cd3f load on device 'cuda:0' took 105012.24 ms  (compiled)

Module newton.sim.ik fee9900 load on device 'cuda:0' took 38290.72 ms  (compiled)
Module newton.sim.ik d8bf09c load on device 'cuda:0' took 38255.77 ms  (compiled)

Module newton.sim.ik 2524012 load on device 'cuda:0' took 41002.01 ms  (compiled)

It would be helpful if you could take a look at newton.sim.ik and see what's causing these recompilations and how they can be eliminated.

dylanturpin · 2025-07-14T17:10:07Z

That's super helpful! The linear solve kernel I wrote (that calls tile_cholesky) is specialized to a given problem size and needs to recompile each time you hit a new size. I need to make sure it's caching properly and also... whether I'm using reasonable sizes in those tests.

shi-eric · 2025-07-14T17:21:07Z

That's super helpful! The linear solve kernel I wrote (that calls tile_cholesky) is specialized to a given problem size and needs to recompile each time you hit a new size. I need to make sure it's caching properly and also... whether I'm using reasonable sizes in those tests.

Okay, maybe some @wp.kernel(module="unique") might help if it's just a single kernel being recompiled? I'm sure we're not being efficient about recompiling elsewhere in the code so your pull request became the first one where these things become noticeable.

dylanturpin · 2025-07-14T17:36:08Z

So, my mistake appears to be that I keyed the cache on problem dimension only. It also needs to take CPU / CUDA into account.

So I'd compile once on CPU (for a given problem size), later run on GPU, see a (spurious) cache hit, lazy recompile to CUDA when the kernel is actually called (maybe replacing the CPU cached version?) and so on and so on.

Should be an easy fix. Can I just call wp.get_device().is_cuda?

This all assumes that a given IK object is only used on one of CPU or CUDA throughout it's lifetime. I think that's pretty reasonable.

shi-eric · 2025-07-14T17:38:54Z

So, my mistake appears to be that I keyed the cache on problem dimension only. It also needs to take CPU / CUDA into account.

So I'd compile once on CPU (for a given problem size), later run on GPU, see a (spurious) cache hit, lazy recompile to CUDA when the kernel is actually called (maybe replacing the CPU cached version?) and so on and so on.

Should be an easy fix. Can I just call wp.get_device().is_cuda?

Where's the part of the code doing this? wp.get_device().is_cuda should be the recommended way to figure it out.

dylanturpin · 2025-07-16T11:13:49Z

@mzamoramora-nvidia Thank you, good catch! Fixed in #423

Signed-off-by: Dylan Turpin <dylanturpin@gmail.com>

…newton-physics#394) Signed-off-by: Dylan Turpin <dylanturpin@gmail.com> Co-authored-by: Eric Heiden <eheiden@nvidia.com>

Signed-off-by: Dylan Turpin <dylanturpin@gmail.com>

…newton-physics#394) Signed-off-by: Dylan Turpin <dylanturpin@gmail.com> Co-authored-by: Eric Heiden <eheiden@nvidia.com>

Signed-off-by: Dylan Turpin <dylanturpin@gmail.com>

dylanturpin had a problem deploying to external-pr-approval July 13, 2025 21:14 — with GitHub Actions Error

dylanturpin mentioned this pull request Jul 13, 2025

[DRAFT] Add inverse kinematics module with interactive example #190

Closed

6 tasks

coderabbitai Bot reviewed Jul 13, 2025

View reviewed changes

Comment thread newton/sim/ik.py Outdated

Comment thread newton/examples/example_ik_interactive.py Outdated

Comment thread newton/examples/gizmo_utils.py Outdated

dylanturpin had a problem deploying to external-pr-approval July 13, 2025 21:27 — with GitHub Actions Error

coderabbitai Bot reviewed Jul 13, 2025

View reviewed changes

Comment thread newton/examples/gizmo_utils.py Outdated

dylanturpin temporarily deployed to external-pr-approval July 13, 2025 21:40 — with GitHub Actions Inactive

coderabbitai Bot reviewed Jul 13, 2025

View reviewed changes

dylanturpin temporarily deployed to external-pr-approval July 14, 2025 16:24 — with GitHub Actions Inactive

dylanturpin commented Jul 14, 2025

View reviewed changes

Comment thread newton/sim/ik.py Outdated

dylanturpin had a problem deploying to external-pr-approval July 14, 2025 19:44 — with GitHub Actions Error

dylanturpin temporarily deployed to external-pr-approval July 14, 2025 19:45 — with GitHub Actions Inactive