Updated ASV benchmark to include all P0 envs #461

Closed
Kenny-Vilella wants to merge 8 commits into newton-physics:main from Kenny-Vilella:dev/kvilella/update_benchmark_scripts

Conversation

@Kenny-Vilella Kenny-Vilella commented Jul 23, 2025

Member

Description

  • Updated the existing humanoid and cartpole ASV benchmarks to use the newly added example script.
  • Added ASV benchmarks for the H1, G1, and Ant environments.
  • Added an InitializeModel benchmark for the humanoid.

Newton Migration Guide

Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this MR.

  • The migration guide in docs/migration.rst is up-to-date

Before your PR is "Ready for review"

  • All commits are signed-off to indicate that your contribution adheres to the Developer Certificate of Origin requirements
  • Necessary tests have been added and new examples are tested (see newton/tests/test_examples.py)
  • Documentation is up-to-date
  • Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

  • New Features
    • Added new performance benchmarks for Ant, Cartpole, G1, H1, and Humanoid robot simulation environments, including model initialization and simulation step timing.
    • Benchmarks support varying environment counts and include both CPU and GPU (CUDA) configurations where applicable.
    • Simulation benchmarks are automatically skipped if no CUDA device is detected.
    • Introduced KPI-level benchmarks for all environments with fixed large-scale parameters (8192 environments) to measure initialization and simulation performance at scale.

Signed-off-by: Kenny Vilella <kvilella@nvidia.com>
Signed-off-by: Kenny Vilella <kvilella@nvidia.com>

coderabbitai Bot commented Jul 23, 2025

Contributor

📥 Commits

Reviewing files that changed from the base of the PR and between 012411c and b4707cf.

📒 Files selected for processing (3)
  • asv/benchmarks/KPI/example_cartpole.py (1 hunks)
  • asv/benchmarks/KPI/example_humanoid.py (1 hunks)
  • asv/benchmarks/envs/example_humanoid.py (1 hunks)

Walkthrough

Five new benchmark files are introduced under asv/benchmarks/envs/, one for each robot simulation environment (Ant, Cartpole, G1, H1, Humanoid). Each file defines two classes: one for benchmarking model initialization with varying environment counts, and another for benchmarking simulation step performance over multiple frames. The benchmarks use the Newton framework and the Warp library, and are integrated with the ASV benchmarking framework.

Additionally, corresponding KPI benchmark files are added under asv/benchmarks/KPI/ for these environments, focusing on large-scale fixed environment counts (8192) and extended simulation frames, with detailed parameters for benchmarking runs and repeats.
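
As a rough illustration of the class layout described in the walkthrough, the sketch below shows how an ASV-style initialization benchmark can be parameterized by environment count. It is not the code from this PR: StubExample is a hypothetical stand-in for the Newton Example class, the environment counts are made up, and run_once is only a simplified imitation of what the ASV runner does.

```python
import time


class StubExample:
    """Hypothetical stand-in for the Newton Example class (not the real API)."""

    def __init__(self, robot, num_envs, use_cuda_graph=False):
        self.robot = robot
        self.num_envs = num_envs
        self.use_cuda_graph = use_cuda_graph
        # Pretend to replicate one environment num_envs times.
        self.env_states = [0.0] * num_envs


class InitializeModel:
    # ASV calls each time_* method once per entry in params.
    params = [64, 256]  # illustrative counts, not the values used in the PR
    param_names = ["num_envs"]
    repeat = 2
    number = 1

    def setup(self, num_envs):
        # Runs before timing starts; the real benchmarks initialize Warp here.
        pass

    def time_initialize_model(self, num_envs):
        # Only the body of a time_* method is measured.
        StubExample(robot="ant", num_envs=num_envs, use_cuda_graph=False)


def run_once(bench_cls):
    """Simplified imitation of the ASV runner, for local experimentation only."""
    results = {}
    for value in bench_cls.params:
        bench = bench_cls()
        bench.setup(value)
        start = time.perf_counter()
        bench.time_initialize_model(value)
        results[value] = time.perf_counter() - start
    return results


if __name__ == "__main__":
    for num_envs, seconds in run_once(InitializeModel).items():
        print(f"num_envs={num_envs}: {seconds:.6f} s")
```

A KPI variant under the same assumptions would keep the structure and replace the params list with a single fixed count such as 8192.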

Changes

| File(s) | Change Summary |
| --- | --- |
| asv/benchmarks/envs/example_ant.py | Added Ant environment benchmarks: model initialization and simulation steps using Newton/Warp. |
| asv/benchmarks/envs/example_cartpole.py | Added Cartpole environment benchmarks: model initialization and simulation steps using Newton/Warp. |
| asv/benchmarks/envs/example_g1.py | Added G1 environment benchmarks: model initialization and simulation steps using Newton/Warp. |
| asv/benchmarks/envs/example_h1.py | Added H1 environment benchmarks: model initialization and simulation steps using Newton/Warp. |
| asv/benchmarks/envs/example_humanoid.py | Added Humanoid environment benchmarks: model initialization and simulation steps using Newton/Warp. |
| asv/benchmarks/KPI/example_ant.py | Added KPI benchmarks for Ant environment with fixed large env count (8192) and extended simulation frames. |
| asv/benchmarks/KPI/example_cartpole.py | Added KPI benchmarks for Cartpole environment with fixed large env count (8192) and extended simulation frames. |
| asv/benchmarks/KPI/example_g1.py | Added KPI benchmarks for G1 environment with fixed large env count (8192) and extended simulation frames. |
| asv/benchmarks/KPI/example_h1.py | Added KPI benchmarks for H1 environment with fixed large env count (8192) and extended simulation frames. |
| asv/benchmarks/KPI/example_humanoid.py | Added KPI benchmarks for Humanoid environment with fixed large env count (8192) and extended simulation frames. |

Sequence Diagram(s)

sequenceDiagram
    participant ASV as ASV Benchmark Framework
    participant Warp as Warp Library
    participant Example as Example Simulation (Newton)
    participant Device as Device (CUDA/CPU)
    
    ASV->>Warp: wp.init()
    ASV->>Example: Initialize Example(robot, num_envs, ...)
    Note right of Example: For simulation benchmarks
    loop For each simulation frame (e.g., 50 or 100)
        ASV->>Example: step()
    end
    ASV->>Device: synchronize()

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

Comment thread asv/benchmarks/examples/example_cartpole.py
@Kenny-Vilella Kenny-Vilella requested a review from shi-eric July 23, 2025 06:23
@Kenny-Vilella

Member Author

It looks like the time_load benchmarks are taking a lot of time.
It is great to track the compile time of mujoco_warp, but it may not be necessary to run all of them for every PR.
Should I remove some of them? Or should we run only a subset of the benchmarks in CI?

@shi-eric

Member

As I hinted at over Slack, please don't modify the files I have in asv/benchmarks/examples. Instead, put the benchmarks for tracking "P0 envs" in a different folder, which makes them easier to filter.
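
To make the filtering argument concrete: ASV's run command accepts a regular expression through its --bench option, and benchmark names are derived from the module path, class, and method. The sketch below imitates that selection in plain Python. The benchmark names are hypothetical (modeled on the class and method names mentioned in this PR, with envs standing in for the separate folder), and the use of re.search is an assumption about how the pattern is matched.

```python
import re

# Hypothetical benchmark names, modeled on the folder layout discussed here.
BENCHMARK_NAMES = [
    "examples.example_cartpole.MuJoCoSolverSimulate.time_simulate",
    "examples.example_humanoid.MuJoCoSolverLoad.time_load",
    "envs.example_ant.MuJoCoSolverSimulate.time_simulate",
    "envs.example_h1.InitializeModel.time_initialize_model",
]


def select(names, pattern):
    """Return the names that match the regular expression anywhere in the name."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


if __name__ == "__main__":
    # A separate folder lets one short pattern pick out the P0 env benchmarks.
    for name in select(BENCHMARK_NAMES, r"^envs\."):
        print(name)
```

With the two groups in separate folders, a CI job could pass a folder prefix as the pattern instead of listing individual benchmarks.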

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d452571 and 6633de5.

📒 Files selected for processing (5)
  • asv/benchmarks/examples/example_ant.py (1 hunks)
  • asv/benchmarks/examples/example_cartpole.py (5 hunks)
  • asv/benchmarks/examples/example_g1.py (1 hunks)
  • asv/benchmarks/examples/example_h1.py (1 hunks)
  • asv/benchmarks/examples/example_humanoid.py (3 hunks)
🧠 Learnings (2)
asv/benchmarks/examples/example_cartpole.py (1)

Learnt from: Kenny-Vilella
PR: #398
File: newton/examples/example_mujoco.py:352-352
Timestamp: 2025-07-14T03:57:29.670Z
Learning: The use_mujoco option in newton/examples/example_mujoco.py is currently unsupported and causes crashes. The code automatically disables this option with a warning message when users attempt to enable it. This is intentionally kept as a placeholder for future implementation.

asv/benchmarks/examples/example_humanoid.py (1)

Learnt from: Kenny-Vilella
PR: #398
File: newton/examples/example_mujoco.py:352-352
Timestamp: 2025-07-14T03:57:29.670Z
Learning: The use_mujoco option in newton/examples/example_mujoco.py is currently unsupported and causes crashes. The code automatically disables this option with a warning message when users attempt to enable it. This is intentionally kept as a placeholder for future implementation.

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
  • GitHub Check: run-asv-benchmarks
🔇 Additional comments (15)
asv/benchmarks/examples/example_humanoid.py (4)

22-22: LGTM: Clean migration to unified module approach.

The import change from robot-specific module to the unified example_mujoco module with explicit robot parameter is a good architectural improvement for maintainability.


25-36: Well-implemented initialization benchmark.

The InitializeModel class correctly:

  • Uses parameterized environment counts for comprehensive testing
  • Disables CUDA graph to exclude kernel compilation overhead from timing measurements
  • Properly initializes Warp in setup method

55-64: Proper subprocess configuration for load benchmarking.

The command line arguments correctly specify the robot type and required parameters for consistent benchmarking across different robot models.


77-86: Comprehensive simulation benchmark setup.

The configuration includes all necessary parameters for realistic simulation benchmarking:

  • Increased frames to 200 for better measurement accuracy
  • Randomized actuation for varied workload testing
  • CUDA graph enabled for performance optimization
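
The simulation benchmark pattern described in these comments can be sketched as follows, under stated assumptions. StubSim and cuda_device_count are hypothetical placeholders rather than Newton or Warp APIs, and the repeat value and environment count are illustrative. The sketch relies on two ASV behaviors: work done in setup is not timed, and a setup that raises NotImplementedError causes the benchmark to be skipped.

```python
class StubSim:
    """Hypothetical stand-in for a simulation example; not Newton's API."""

    def __init__(self, num_envs):
        self.positions = [0.0] * num_envs
        self.frames_stepped = 0

    def step(self):
        # Advance every environment by one fake time step.
        self.positions = [p + 1.0 for p in self.positions]
        self.frames_stepped += 1


def cuda_device_count():
    """Placeholder probe; a real benchmark would query Warp for CUDA devices."""
    return 0


class SimulateBenchmark:
    repeat = 5    # illustrative value
    number = 1
    num_envs = 8  # illustrative value
    device_probe = staticmethod(cuda_device_count)

    def setup(self):
        # ASV skips a benchmark whose setup raises NotImplementedError.
        if self.device_probe() == 0:
            raise NotImplementedError("no CUDA device detected; skipping")
        self.num_frames = 200
        # Construction happens here so that it is excluded from the timing.
        self.example = StubSim(self.num_envs)

    def time_simulate(self):
        for _ in range(self.num_frames):
            self.example.step()
        # A GPU benchmark would synchronize the device here (for example with
        # wp.synchronize_device()) so that queued kernels are included.


if __name__ == "__main__":
    bench = SimulateBenchmark()
    try:
        bench.setup()
        bench.time_simulate()
        print("frames stepped:", bench.example.frames_stepped)
    except NotImplementedError as exc:
        print("skipped:", exc)
```

Keeping the device probe as an overridable attribute is only a convenience for trying the sketch on a machine without a GPU.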
asv/benchmarks/examples/example_h1.py (2)

25-36: Consistent benchmark implementation.

The InitializeModel class follows the established pattern correctly with appropriate parameter ranges and CUDA graph disabled to exclude compilation overhead.


52-64: Proper command line argument structure.

The subprocess command correctly specifies the H1 robot and all necessary parameters for load benchmarking.

asv/benchmarks/examples/example_g1.py (3)

25-36: Excellent consistency with established benchmark pattern.

The implementation perfectly matches the pattern established in other robot benchmark files, ensuring consistent measurement methodology across different robot models.


52-64: Correct subprocess configuration for G1 robot.

The command line arguments properly specify the G1 robot while maintaining consistency with other benchmark files.


76-87: Well-configured simulation benchmark.

The setup includes all necessary parameters for comprehensive performance testing with the G1 robot model.

asv/benchmarks/examples/example_ant.py (3)

25-36: Consistent and well-structured benchmark implementation.

The InitializeModel class correctly follows the established pattern with appropriate environment count parameters and CUDA graph disabled for accurate timing measurements.


48-69: Proper load benchmark configuration.

The MuJoCoSolverLoad class correctly implements the subprocess approach with appropriate cache clearing and CUDA device validation.


76-87: Comprehensive simulation benchmark setup.

The configuration includes all necessary parameters for realistic ant robot simulation benchmarking with proper Warp initialization and device synchronization.

asv/benchmarks/examples/example_cartpole.py (3)

22-22: LGTM: Successful migration to unified module approach.

The import change to example_mujoco is consistent with the overall architectural improvement across all benchmark files.


66-71: Proper subprocess command structure.

The command line arguments correctly specify the cartpole robot and required parameters for consistent benchmarking.


84-98: Good benchmark configuration updates.

The changes include:

  • Reduced repeat count to 5 (consistent with other files)
  • Added wp.init() call for proper initialization
  • Comprehensive parameter setup with randomization and CUDA graph enabled

Comment thread asv/benchmarks/examples/example_cartpole.py Outdated
Comment thread asv/benchmarks/examples/example_h1.py
@shi-eric

Member

> It looks like the time_load benchmarks are taking a lot of time. It is great to track the compile time of mujoco_warp, but it may not be necessary to run all of them for every PR. Should I remove some of them? Or should we run only a subset of the benchmarks in CI?

Yes, for the benchmarks in asv/benchmarks/envs or whatever, I wouldn't add the module load or memory tracking benchmarks yet, let's just get the time_simulate methods added.

@Kenny-Vilella

Member Author

> As I hinted at over Slack, please don't modify the files I have in asv/benchmarks/examples. Instead, put the benchmarks for tracking "P0 envs" in a different folder, which makes them easier to filter.

Ah, my apologies, I will change it.
Out of curiosity, why would we keep two almost identical benchmarks?

> Yes, for the benchmarks in asv/benchmarks/envs or whatever, I wouldn't add the module load or memory tracking benchmarks yet, let's just get the time_simulate methods added.

Got it!
The KPIs are simulation time and cloning time, so let's focus on those two test cases.

Signed-off-by: Kenny Vilella <kvilella@nvidia.com>
Signed-off-by: Kenny Vilella <kvilella@nvidia.com>
Signed-off-by: Kenny Vilella <kvilella@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6633de5 and 00a173a.

📒 Files selected for processing (5)
  • asv/benchmarks/envs/example_ant.py (1 hunks)
  • asv/benchmarks/envs/example_cartpole.py (1 hunks)
  • asv/benchmarks/envs/example_g1.py (1 hunks)
  • asv/benchmarks/envs/example_h1.py (1 hunks)
  • asv/benchmarks/envs/example_humanoid.py (1 hunks)
🧬 Code Graph Analysis (5)
asv/benchmarks/envs/example_cartpole.py (4)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (4)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (4)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_h1.py (4)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_ant.py (4)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
🔇 Additional comments (5)
asv/benchmarks/envs/example_cartpole.py (1)

40-42: Missing wp.init() call in setup method.

The MuJoCoSolverSimulate class is missing the wp.init() call in its setup method, which is present in other environment benchmarks (example_ant.py, example_g1.py, example_h1.py). This inconsistency could lead to initialization issues.

     def setup(self):
+        wp.init()
         self.num_frames = 200

Likely an incorrect or invalid review comment.

asv/benchmarks/envs/example_g1.py (1)

23-57: LGTM! Benchmark implementation follows established patterns.

The benchmark classes are correctly implemented with proper Warp initialization, consistent parameter configuration, and appropriate CUDA handling. The implementation aligns well with the other environment benchmarks in the suite.

asv/benchmarks/envs/example_humanoid.py (1)

31-33: Benchmark implementation is well-structured.

The initialization benchmark correctly uses use_cuda_graph=False to exclude kernel compilation overhead, and the simulation benchmark properly handles CUDA device availability checks and synchronization.

asv/benchmarks/envs/example_h1.py (1)

23-57: LGTM! Benchmark implementation is consistent and well-structured.

The benchmark classes correctly follow the established patterns with proper Warp initialization, appropriate parameter configuration, and correct CUDA device handling. The implementation aligns perfectly with the benchmarking framework requirements.

asv/benchmarks/envs/example_ant.py (1)

23-57: LGTM! Benchmark implementation follows established patterns correctly.

The benchmark classes are properly implemented with consistent Warp initialization, appropriate parameter settings, and correct CUDA device handling. The implementation aligns well with the benchmarking framework standards.

Comment thread asv/benchmarks/envs/example_cartpole.py Outdated
Comment thread asv/benchmarks/envs/example_humanoid.py Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🧹 Nitpick comments (1)
asv/benchmarks/envs/example_cartpole.py (1)

31-33: Inconsistent initialization approach compared to other benchmark files.

This file uses wp.ScopedDevice("cpu") and omits the use_cuda_graph=False parameter, while all other benchmark files (example_ant.py, example_g1.py, example_h1.py, example_humanoid.py) explicitly set use_cuda_graph=False without the scoped device context. Consider aligning with the established pattern for consistency.

 def time_initialize_model(self, num_envs):
-    with wp.ScopedDevice("cpu"):
-        _example = Example(stage_path=None, robot="cartpole", headless=True, num_envs=num_envs)
+    # use_cuda_graph is False to exclude kernel compilation
+    _example = Example(stage_path=None, robot="cartpole", headless=True, num_envs=num_envs, use_cuda_graph=False)
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6633de5 and 00a173a.

📒 Files selected for processing (5)
  • asv/benchmarks/envs/example_ant.py (1 hunks)
  • asv/benchmarks/envs/example_cartpole.py (1 hunks)
  • asv/benchmarks/envs/example_g1.py (1 hunks)
  • asv/benchmarks/envs/example_h1.py (1 hunks)
  • asv/benchmarks/envs/example_humanoid.py (1 hunks)
🧬 Code Graph Analysis (4)
asv/benchmarks/envs/example_humanoid.py (4)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_ant.py (4)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
asv/benchmarks/envs/example_cartpole.py (4)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_h1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
asv/benchmarks/envs/example_h1.py (4)
asv/benchmarks/envs/example_cartpole.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_ant.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_g1.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-51)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-57)
  • time_simulate (54-57)
asv/benchmarks/envs/example_humanoid.py (6)
  • InitializeModel (23-33)
  • setup (28-29)
  • setup (40-50)
  • time_initialize_model (31-33)
  • MuJoCoSolverSimulate (36-56)
  • time_simulate (53-56)
🔇 Additional comments (6)
asv/benchmarks/envs/example_humanoid.py (2)

23-34: Consider adding consistency improvements for initialization benchmarking.

The initialization benchmark implementation is consistent with the pattern, though other files show minor variations in approach (e.g., example_cartpole.py uses wp.ScopedDevice("cpu")). The current implementation correctly excludes kernel compilation time by setting use_cuda_graph=False.


52-56: Simulation benchmark implementation follows established pattern.

The simulation timing methodology is correctly implemented with proper device synchronization and CUDA availability checking.

asv/benchmarks/envs/example_ant.py (1)

23-57: Benchmark implementation follows established pattern correctly.

The implementation is consistent with other benchmark files in the PR, properly handling Warp initialization, CUDA graph usage, and device synchronization. The ant robot benchmarking setup is correctly configured.

asv/benchmarks/envs/example_cartpole.py (1)

36-57: Simulation benchmark implementation is correct.

The MuJoCoSolverSimulate class follows the established pattern correctly with proper Warp initialization, CUDA graph usage, and device synchronization.

asv/benchmarks/envs/example_g1.py (1)

23-57: Well-implemented benchmark following established pattern.

The G1 robot benchmark implementation correctly follows the established pattern with proper Warp initialization, appropriate CUDA graph usage settings, and clear explanatory comments. The implementation is consistent with the other benchmark files.

asv/benchmarks/envs/example_h1.py (1)

23-57: Correct benchmark implementation following established conventions.

The H1 robot benchmark implementation properly follows the established pattern with appropriate Warp initialization, correct CUDA graph usage configuration, and proper device synchronization. The code is well-structured and consistent.

Comment thread asv/benchmarks/envs/example_humanoid.py Outdated
@coderabbitai coderabbitai Bot mentioned this pull request Jul 23, 2025
5 tasks
@shi-eric

Member

(just taking notes) currently on my computer, I got these times for how long it takes to run the new benchmarks:

[100.00%] ·· ============================================================= ================
                                       benchmark                            total duration 
             ------------------------------------------------------------- ----------------
                   envs.example_g1.MuJoCoSolverSimulate.time_simulate           7.74m      
                   envs.example_h1.MuJoCoSolverSimulate.time_simulate           6.89m      
                 envs.example_h1.InitializeModel.time_initialize_model          3.03m      
                 envs.example_g1.InitializeModel.time_initialize_model          2.98m      
                 envs.example_ant.InitializeModel.time_initialize_model         1.18m      
                envs.example_humanoid.MuJoCoSolverSimulate.time_simulate        1.10m      
              envs.example_cartpole.InitializeModel.time_initialize_model       1.09m      
              envs.example_humanoid.InitializeModel.time_initialize_model       1.07m      
                  envs.example_ant.MuJoCoSolverSimulate.time_simulate           1.06m      
                envs.example_cartpole.MuJoCoSolverSimulate.time_simulate        53.6s      
                                         total                                  27.0m      
             ============================================================= ================

@shi-eric

Member

Before I forget to provide an update @Kenny-Vilella, this is where I have things. I am focusing on example_g1 since the time_simulate is really long, and I want to figure out if we can take fewer samples and still measure the data that's important.

I changed the benchmark to be:

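import time

import warp as wp
from asv_runner.benchmarks.mark import skip_benchmark_if

# Example comes from the newly added example script (its import is not shown in this comment).


class MuJoCoSolverSimulate:
    params = [4, 8, 16]
    param_names = ["num_envs"]

    repeat = 3
    number = 1

    def setup(self, num_envs):
        wp.init()
        self.num_frames = 50
        self.example = Example(
            stage_path=None,
            robot="g1",
            randomize=True,
            headless=True,
            actuation="random",
            num_envs=num_envs,
            use_cuda_graph=True,
        )

    @skip_benchmark_if(wp.get_cuda_device_count() == 0)
    def track_simulate(self, num_envs):
        # Total number of env-steps across all frames, substeps, and envs.
        steps = self.num_frames * self.example.sim_substeps * self.example.num_envs
        start_time = time.time()
        for _ in range(self.num_frames):
            self.example.step()
        # Wait for queued GPU work to finish before stopping the timer.
        wp.synchronize_device()
        end_time = time.time()

        # Milliseconds per env-step (lower is better).
        return (end_time - start_time) * 1000 / steps

    track_simulate.unit = "ms/env-step"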

And I get reasonably stable results on my computer like:

[100.00%] ··· envs.example_g1.MuJoCoSolverSimulate.track_simulate                                                                 ok
[100.00%] ··· ========== ====================
               num_envs                      
              ---------- --------------------
                  4       3.4309967756271362 
                  8       1.8521174788475037 
                  16      0.9501362144947052 
              ========== ====================

[100.00%] ·· ======================================================= ================
                                    benchmark                         total duration 
             ------------------------------------------------------- ----------------
              envs.example_g1.InitializeModel.time_initialize_model       1.57m      
               envs.example_g1.MuJoCoSolverSimulate.track_simulate        51.0s      
                                      total                               2.42m      
             ======================================================= ================

For the track_ benchmark, we need a unit where lower is better for ASV, so that's why I'm using the less conventional time per env-step.
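
As a rough illustration of the metric, the conversion from wall-clock time to ms/env-step can be sketched as below. The function name `ms_per_env_step` and the input numbers are made up for illustration; they are not measurements from this thread.

```python
def ms_per_env_step(elapsed_s: float, num_frames: int, sim_substeps: int, num_envs: int) -> float:
    """Convert elapsed wall-clock seconds into milliseconds per env-step."""
    # One env-step is one substep of one environment.
    steps = num_frames * sim_substeps * num_envs
    return elapsed_s * 1000.0 / steps


# Hypothetical run: 2 s of wall-clock time for 50 frames, 10 substeps, 16 envs.
example_value = ms_per_env_step(2.0, num_frames=50, sim_substeps=10, num_envs=16)
print(f"{example_value:.4f} ms/env-step")
```

Because the step count is in the denominator, a faster simulation yields a smaller value, which is the lower-is-better direction mentioned above.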

@Kenny-Vilella

Member Author

@shi-eric Thanks for taking a look.
FYI, the numbers of envs are just temporary.
I asked in #301 which values we should investigate but have not yet received an answer from all stakeholders.
The minimum will probably be to run with 8192 envs.
On my workstation, example_g1 takes 15min with 8192 envs (repeat = 3, number = 1), but I think it timed out, so the real number may be even longer.
It seems very slow compared to my experience with mujoco_warp; I'm not sure whether it is asv overhead or newton is currently not optimized enough. Will take a quick look.

@Kenny-Vilella

Member Author

Some information about example_g1 with 8192 envs, using the script you provided earlier, on horde (L40).
For MuJoCoSolverSimulate:
The benchmark takes 6min to run.
I tried commenting out the simulation loop to measure the total overhead, and it took 5min to run.
So most of the time is spent in the setup phase.
For InitializeModel:
The benchmark takes 15min to run with repeat and number both equal to 1.
The benchmark itself reports ~3-4min.
There is something wrong here; we should not have such a long overhead. Will investigate where the time is spent.

@shi-eric

Member

Some information about example_g1 with 8192 envs, using the script you provided earlier, on horde (L40). For MuJoCoSolverSimulate: The benchmark takes 6min to run. I tried commenting out the simulation loop to measure the total overhead, and it took 5min to run. So most of the time is spent in the setup phase. For InitializeModel: The benchmark takes 15min to run with repeat and number both equal to 1. The benchmark itself reports ~3-4min. There is something wrong here; we should not have such a long overhead. Will investigate where the time is spent.

Okay, so it seems that we might not use 8192 envs initially for the benchmark. Instead we want to use a lower number that still exhibits the qualitative trends we want to use to measure progress so developers don't need to wait so long.

@Kenny-Vilella

Member Author

It looks like a nightly (or weekly?) run to follow KPI progress + a lighter run for PRs would be a good solution.

@shi-eric

Member

It looks like a nightly (or weekly?) run to follow KPI progress + a lighter run for PRs would be a good solution.

Okay, how about adding a class MuJoCoSolverSimulateKPI: that runs for the large env counts. We want a uniform name that can be used to exclude the tests from running in merge request pipelines.

The class InitializeModel: will be run at smaller values, and using two env values so we can get a sense of scaling as Alain said.

The current class MuJoCoSolverSimulate: will be used for CI/CD purposes, and the goal is to find values that get decent performance measurements within 2 minutes (excluding compilation time).

This shouldn't take a long time, let me know if you want to break this up into two pull requests (one set for KPI runs and one set for CI/CD runs) and if I can help with either.
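
A minimal sketch of how a uniform `KPI` suffix could be used to split benchmarks by name, assuming a regular expression applied with search semantics. The benchmark names below are hypothetical, and whether the negative lookahead behaves identically when passed to ASV's own benchmark selection was not checked here.

```python
import re

# Hypothetical benchmark names; only the "KPI" suffix convention matters.
benchmark_names = [
    "envs.example_g1.MuJoCoSolverSimulate.track_simulate",
    "envs.example_g1.InitializeModel.time_initialize_model",
    "envs.example_g1.MuJoCoSolverSimulateKPI.track_simulate",
]

# Matches only names that do not contain "KPI" anywhere.
non_kpi_pattern = re.compile(r"^(?!.*KPI)")
# Matches any name that contains "KPI".
kpi_pattern = re.compile(r"KPI")

ci_benchmarks = [name for name in benchmark_names if non_kpi_pattern.search(name)]
kpi_benchmarks = [name for name in benchmark_names if kpi_pattern.search(name)]

print(ci_benchmarks)
print(kpi_benchmarks)
```

With a single naming convention, one pattern selects the lightweight set for merge request pipelines and the other selects the large-scale runs.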

@Kenny-Vilella

Member Author

It makes sense, but we will probably need a class InitializeModelKPI as well, since the KPI mentioned both init time and sim time.

I can start doing it.
If you want to do some modifications, feel free to push to this branch.
The only question is whether we put the KPI runs in a separate folder or put all the classes in the same file.

Also, I noticed that there are rounds and min_run_count parameters that affect the number of runs.
If we only set number = 1 and repeat = 1, the benchmark actually runs 6 times.
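
For reference, a sketch of a timing benchmark class that pins the run-count attributes explicitly. The class name `TimingSketch` and its workload are hypothetical, and `warmup_time` is an extra ASV attribute not mentioned in this thread. Whether these settings bring the call count down to exactly one was not verified here.

```python
class TimingSketch:
    """Hypothetical ASV timing benchmark; only the class attributes matter."""

    # Run-count controls read by ASV for time_* benchmarks.
    rounds = 1
    repeat = 1
    number = 1
    min_run_count = 1
    # Assumption: disabling warm-up avoids extra untimed calls.
    warmup_time = 0.0

    def setup(self):
        # Cheap placeholder workload so the sketch runs without a GPU.
        self.data = list(range(1000))

    def time_sum(self):
        sum(self.data)
```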

Signed-off-by: Kenny Vilella <kvilella@nvidia.com>
Signed-off-by: Kenny Vilella <kvilella@nvidia.com>
@Kenny-Vilella

Member Author

OK so I created two folders:

  • KPI, with the KPI benchmarks, all of them using 8192 envs. On my workstation it takes 19min to run
  • envs, for the benchmarks to run with every PR. On my workstation it takes 5min to run

All of them follow the example you gave earlier in this thread.

I mainly changed the value of rounds and kept number and repeat at 1.
But I'm not sure what is best for getting consistent values, so feel free to change the parameters.

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

🧹 Nitpick comments (1)
asv/benchmarks/KPI/example_humanoid.py (1)

48-48: Consider standardizing frame count across KPI benchmarks.

This benchmark uses 50 frames while other KPI benchmarks (ant, cartpole, g1, h1) use 100 frames. Consider standardizing to ensure consistent measurement methodology across robot types.

-    self.num_frames = 50
+    self.num_frames = 100
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00a173a and 012411c.

📒 Files selected for processing (10)
  • asv/benchmarks/KPI/example_ant.py (1 hunks)
  • asv/benchmarks/KPI/example_cartpole.py (1 hunks)
  • asv/benchmarks/KPI/example_g1.py (1 hunks)
  • asv/benchmarks/KPI/example_h1.py (1 hunks)
  • asv/benchmarks/KPI/example_humanoid.py (1 hunks)
  • asv/benchmarks/envs/example_ant.py (1 hunks)
  • asv/benchmarks/envs/example_cartpole.py (1 hunks)
  • asv/benchmarks/envs/example_g1.py (1 hunks)
  • asv/benchmarks/envs/example_h1.py (1 hunks)
  • asv/benchmarks/envs/example_humanoid.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • asv/benchmarks/envs/example_humanoid.py
  • asv/benchmarks/envs/example_g1.py
  • asv/benchmarks/envs/example_cartpole.py
  • asv/benchmarks/envs/example_h1.py
  • asv/benchmarks/envs/example_ant.py
🧰 Additional context used
🧠 Learnings (5)
asv/benchmarks/KPI/example_cartpole.py (1)
asv/benchmarks/KPI/example_ant.py (1)
asv/benchmarks/KPI/example_g1.py (1)
asv/benchmarks/KPI/example_humanoid.py (1)
asv/benchmarks/KPI/example_h1.py (1)

The same learning applies to each file above:

Learnt from: shi-eric
PR: #461
File: asv/benchmarks/envs/example_humanoid.py:40-41
Timestamp: 2025-07-23T14:36:42.135Z
Learning: In Warp benchmarks, explicit wp.init() calls are not needed in most circumstances since the first Warp API call that requires initialization will automatically call wp.init(). Explicit wp.init() in setup() methods is helpful when the ASV benchmark is measuring a Warp API call, as wp.init() has non-trivial overhead that should be excluded from the benchmark timing.

🧬 Code Graph Analysis (4)
asv/benchmarks/KPI/example_cartpole.py (3)
asv/benchmarks/KPI/example_humanoid.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-57)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-70)
  • track_simulate (60-68)
asv/benchmarks/KPI/example_h1.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-58)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-71)
  • track_simulate (61-69)
asv/benchmarks/envs/example_cartpole.py (4)
  • setup (33-34)
  • setup (50-61)
  • time_initialize_model (36-38)
  • track_simulate (64-72)
asv/benchmarks/KPI/example_ant.py (3)
asv/benchmarks/KPI/example_cartpole.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-58)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-71)
  • track_simulate (61-69)
asv/benchmarks/KPI/example_humanoid.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-57)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-70)
  • track_simulate (60-68)
asv/benchmarks/envs/example_ant.py (4)
  • setup (33-34)
  • setup (50-61)
  • time_initialize_model (36-38)
  • track_simulate (64-72)
asv/benchmarks/KPI/example_g1.py (5)
asv/benchmarks/KPI/example_cartpole.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-58)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-71)
  • track_simulate (61-69)
asv/benchmarks/KPI/example_ant.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-58)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-71)
  • track_simulate (61-69)
asv/benchmarks/KPI/example_humanoid.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-57)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-70)
  • track_simulate (60-68)
asv/benchmarks/KPI/example_h1.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-58)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-71)
  • track_simulate (61-69)
asv/benchmarks/envs/example_g1.py (4)
  • setup (33-34)
  • setup (50-61)
  • time_initialize_model (36-38)
  • track_simulate (64-72)
asv/benchmarks/KPI/example_h1.py (2)
asv/benchmarks/KPI/example_cartpole.py (6)
  • InitializeModelKPI (25-38)
  • setup (33-34)
  • setup (47-58)
  • time_initialize_model (36-38)
  • MuJoCoSolverSimulateKPI (41-71)
  • track_simulate (61-69)
asv/benchmarks/envs/example_h1.py (4)
  • setup (33-34)
  • setup (50-61)
  • time_initialize_model (36-38)
  • track_simulate (64-72)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
🔇 Additional comments (7)
asv/benchmarks/KPI/example_ant.py (2)

25-71: LGTM: Consistent KPI benchmark implementation.

The benchmark classes follow the established pattern with appropriate ASV parameters, proper CUDA graph usage, and correct timing methodology. The initialization benchmark excludes kernel compilation overhead and the simulation benchmark includes proper device synchronization.


47-58: Add missing wp.init() call in setup method.

The setup() method is missing the wp.init() call, which is inconsistent with other KPI benchmark files. Based on the retrieved learning, explicit wp.init() calls in setup methods help exclude Warp initialization overhead from benchmark timing.

 def setup(self):
+    wp.init()
     self.num_frames = 100
     self.example = Example(
⛔ Skipped due to learnings
Learnt from: shi-eric
PR: newton-physics/newton#461
File: asv/benchmarks/envs/example_humanoid.py:40-41
Timestamp: 2025-07-23T14:36:42.135Z
Learning: In Warp benchmarks, explicit wp.init() calls are not needed in most circumstances since the first Warp API call that requires initialization will automatically call wp.init(). Explicit wp.init() in setup() methods is helpful when the ASV benchmark is measuring a Warp API call, as wp.init() has non-trivial overhead that should be excluded from the benchmark timing.
asv/benchmarks/KPI/example_cartpole.py (2)

37-37: Consider consistency with other KPI benchmarks regarding CPU scoping.

This file uses wp.ScopedDevice("cpu") in the initialization benchmark, while other KPI benchmark files don't use this scoping. Verify if this CPU scoping is intentionally specific to cartpole or should be standardized across all KPI benchmarks.


41-71: LGTM: Proper simulation benchmark implementation.

The simulation benchmark correctly initializes Warp, uses appropriate CUDA graph settings, includes device synchronization, and follows the established timing methodology.

asv/benchmarks/KPI/example_humanoid.py (1)

25-38: LGTM: Correct initialization benchmark implementation.

The initialization benchmark properly initializes Warp in setup, uses use_cuda_graph=False to exclude kernel compilation overhead, and follows the established pattern.

asv/benchmarks/KPI/example_g1.py (1)

25-71: LGTM: Well-implemented KPI benchmark.

This benchmark file correctly follows the established pattern with proper Warp initialization, appropriate CUDA graph usage, consistent frame count (100), and correct timing methodology including device synchronization.

asv/benchmarks/KPI/example_h1.py (1)

25-71: LGTM: Consistent and correct KPI benchmark implementation.

This benchmark file properly implements the KPI pattern with correct Warp initialization, appropriate CUDA graph settings, standard frame count (100), and proper timing methodology with device synchronization.

Comment thread asv/benchmarks/KPI/example_cartpole.py Outdated
Comment thread asv/benchmarks/KPI/example_humanoid.py
@adenzler-nvidia

Member

Just following along - @Kenny-Vilella I think it makes sense to report these initialization findings/problematic benchmarks in #55 so that they can be looked at.

Signed-off-by: Kenny Vilella <kvilella@nvidia.com>
@shi-eric

Member

OK so I created two folders:

  • KPI, with the KPI benchmarks, all of them using 8192 envs. On my workstation it takes 19min to run
  • envs, for the benchmarks to run with every PR. On my workstation it takes 5min to run

All of them follow the example you gave earlier in this thread.

I mainly changed the value of rounds and kept number and repeat at 1. But I'm not sure what is best for getting consistent values, so feel free to change the parameters.

Thanks! I did some restructuring today of the KPI folder. Now I need to figure out how to push my commits... Maybe it'll be easier to create a new pull request with your commits? I basically refactored the model builder creation out of the Example.__init__() so we can create and reuse it. This allows us to get more samples without paying the crazy cost for creating the builder each time, but this means we get randomization in each run (if we adjust the seed) but not across samples. I think it's a decent tradeoff...

import time

import warp as wp
from asv_runner.benchmarks.mark import skip_benchmark_if

# Example comes from the example script (its import is not shown in this comment).


class G1:
    params = [4096, 8192]
    param_names = ["num_envs"]
    num_frames = 50
    robot = "g1"
    timeout = 1200

    @skip_benchmark_if(wp.get_cuda_device_count() == 0)
    def track_simulate(self, num_envs):
        samples = 4
        # Create the model builder once and reuse it for every sample.
        builder = Example.create_model_builder(self.robot, num_envs, randomize=True, seed=123)

        total_time = 0.0
        for _iter in range(samples):
            example = Example(
                stage_path=None,
                robot=self.robot,
                randomize=True,
                headless=True,
                actuation="random",
                num_envs=num_envs,
                use_cuda_graph=True,
                builder=builder,
            )

            # Exclude Example construction from the timed region.
            wp.synchronize_device()
            start_time = time.time()
            for _ in range(self.num_frames):
                example.step()
            wp.synchronize_device()
            total_time += time.time() - start_time

        # Average over all samples, in milliseconds per env-step.
        return total_time * 1000 / (self.num_frames * example.sim_substeps * num_envs * samples)

    track_simulate.unit = "ms/env-step"

@Kenny-Vilella

Member Author

I guess you can either create a new pull request with my commits or create a pull request to this branch.
Either is totally fine with me.

@shi-eric

Member

I think we can get better statistics by taking the median of all Example.step() times instead of an average, but if you like this idea, we should probably defer it for a future pull request.

@Kenny-Vilella

Member Author

I think we can get better statistics by taking the median of all Example.step() times instead of an average, but if you like this idea, we should probably defer it for a future pull request.

Median vs average seems to be a complicated choice.
When optimizing mujoco_warp, I did notice that some changes were making performance much more unstable.
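
To make the trade-off concrete, here is a small sketch on synthetic per-step times (made-up values, not measurements from Newton or mujoco_warp). It shows how a single slow step moves the mean far more than the median.

```python
import statistics

# Synthetic per-step times in milliseconds; the last step is an outlier.
step_times_ms = [1.0, 1.1, 0.9, 1.0, 1.0, 9.0]

mean_ms = statistics.mean(step_times_ms)
median_ms = statistics.median(step_times_ms)

print(f"mean:   {mean_ms:.3f} ms")
print(f"median: {median_ms:.3f} ms")
```

The median hides the outlier, which gives steadier numbers but could also mask the kind of instability described above. Timing individual steps on the GPU would also need a device synchronization after each step, which itself changes what is measured.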

@shi-eric

Member

Merged (with changes) in #475

@shi-eric shi-eric closed this Jul 26, 2025
@Kenny-Vilella Kenny-Vilella deleted the dev/kvilella/update_benchmark_scripts branch September 5, 2025 03:53
vidurv-nvidia pushed a commit to vidurv-nvidia/newton that referenced this pull request Mar 6, 2026
# Description

Fixes a bug where the XR kit file settings were split by app.python kit
settings which caused the AVP retargeting to perform incorrectly.

The new XR settings have been shifted to not split the settings.

Fixes # (issue)

## Type of change

- Bug fix (non-breaking change which fixes an issue)


## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
