Improve SolverMuJoCo setup performance#778
Conversation
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
…/solver-creation-benchmarks
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
… setup_cache Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
newton/examples/example_mujoco.py (1)
339-345: Consider using a dictionary for parameter defaults to reduce redundancy.The current approach of checking and setting each parameter individually is correct but verbose. A more maintainable approach would consolidate these into a single pattern.
- solver_iteration = solver_iteration if solver_iteration is not None else 100 - ls_iteration = ls_iteration if ls_iteration is not None else 50 - solver = solver if solver is not None else ROBOT_CONFIGS[robot]["solver"] - integrator = integrator if integrator is not None else ROBOT_CONFIGS[robot]["integrator"] - njmax = njmax if njmax is not None else ROBOT_CONFIGS[robot]["njmax"] - nconmax = nconmax if nconmax is not None else ROBOT_CONFIGS[robot]["nconmax"] - ls_parallel = ls_parallel if ls_parallel is not None else ROBOT_CONFIGS[robot]["ls_parallel"] + defaults = { + 'solver_iteration': 100, + 'ls_iteration': 50, + 'solver': ROBOT_CONFIGS[robot]["solver"], + 'integrator': ROBOT_CONFIGS[robot]["integrator"], + 'njmax': ROBOT_CONFIGS[robot]["njmax"], + 'nconmax': ROBOT_CONFIGS[robot]["nconmax"], + 'ls_parallel': ROBOT_CONFIGS[robot]["ls_parallel"], + } + + params = { + 'iterations': solver_iteration if solver_iteration is not None else defaults['solver_iteration'], + 'ls_iterations': ls_iteration if ls_iteration is not None else defaults['ls_iteration'], + 'solver': solver if solver is not None else defaults['solver'], + 'integrator': integrator if integrator is not None else defaults['integrator'], + 'njmax': njmax if njmax is not None else defaults['njmax'], + 'ncon_per_env': nconmax if nconmax is not None else defaults['nconmax'], + 'ls_parallel': ls_parallel if ls_parallel is not None else defaults['ls_parallel'], + }asv/benchmarks/setup/bench_model.py (1)
68-73: Consider renamingsetup_cachetosetupfor clarity.The
setup_cachemethod in theFastInitializeModelclass seems to function identically to thesetupmethod inKpiInitializeModel. Using consistent naming would improve readability.Consider using
setupinstead ofsetup_cachefor consistency withKpiInitializeModel, unless there's a specific ASV requirement for caching behavior that I'm not aware of.newton/_src/solvers/mujoco/solver_mujoco.py (2)
1196-1197: Remove unusednoqadirectives.The
noqa: PLC0415directives are marked as unused by the linter, likely because the PLC0415 rule isn't enabled in your configuration.- import mujoco # noqa: PLC0415 - import mujoco_warp # noqa: PLC0415 + import mujoco + import mujoco_warp
1647-1647: Remove unnecessary int() cast.The value is already an integer from
len(to_color_shape_index).- num_nodes=int(len(to_color_shape_index)), + num_nodes=len(to_color_shape_index),
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.gitignore(2 hunks)asv/benchmarks/setup/bench_model.py(4 hunks)newton/_src/solvers/mujoco/solver_mujoco.py(10 hunks)newton/examples/example_mujoco.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .gitignore
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-19T12:27:30.629Z
Learnt from: preist-nvidia
PR: newton-physics/newton#579
File: newton/examples/example_mujoco.py:350-354
Timestamp: 2025-08-19T12:27:30.629Z
Learning: In Newton examples, there's a distinction between solver parameters and Example class attributes. The Example class can have its own use_mujoco attribute for controlling example-level behavior (like CUDA graphs, rendering logic), while the solver uses use_mujoco_cpu for backend selection. These serve different purposes and should not be conflated during API renames.
Applied to files:
newton/examples/example_mujoco.pynewton/_src/solvers/mujoco/solver_mujoco.py
📚 Learning: 2025-08-19T12:27:30.629Z
Learnt from: preist-nvidia
PR: newton-physics/newton#579
File: newton/examples/example_mujoco.py:350-354
Timestamp: 2025-08-19T12:27:30.629Z
Learning: In Newton examples, Example.use_mujoco is a high-level attribute that controls whether to use MuJoCo solver vs other solvers (like XPBD), while SolverMuJoCo.use_mujoco_cpu controls the backend within MuJoCo (CPU vs Warp). These are separate concepts serving different purposes - the PR rename only applies to the solver parameter, not the Example class attributes.
Applied to files:
newton/examples/example_mujoco.pynewton/_src/solvers/mujoco/solver_mujoco.py
🧬 Code graph analysis (2)
newton/examples/example_mujoco.py (1)
newton/_src/solvers/mujoco/solver_mujoco.py (1)
SolverMuJoCo(1135-2748)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (2)
create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
54-54: Unused method argument: num_envs
(ARG002)
95-95: Unused method argument: num_envs
(ARG002)
newton/_src/solvers/mujoco/solver_mujoco.py
1196-1196: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
1197-1197: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
1202-1204: Avoid specifying long messages outside the exception class
(TRY003)
1647-1647: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: Run GPU Benchmarks (Pull Request)
- GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
- GitHub Check: run-newton-tests / newton-unittests (windows-latest)
🔇 Additional comments (8)
newton/examples/example_mujoco.py (1)
235-246: Extract the new factory method reduces coupling and improves reusability.The new
create_solverstatic factory method nicely separates solver creation logic from initialization, enabling external benchmarks and tests to reuse the solver setup without creating an entire Example instance. This aligns well with the performance benchmarking needs.asv/benchmarks/setup/bench_model.py (2)
40-43: Good approach to prebuild the model for benchmark accuracy.Prebuilding the model in
setupand reusing it intime_initialize_solverensures the solver initialization timing is isolated from model building costs.
54-56: The unusednum_envsparameter is acceptable for benchmarking consistency.The
num_envsparameter is part of the ASV benchmark parameter matrix. While it's not directly used intime_initialize_solver, it determines which prebuilt model is used (viaself._modelfromsetup), so it's correctly structured for the benchmarking framework.newton/_src/solvers/mujoco/solver_mujoco.py (5)
1187-1205: Class-level caching for MuJoCo modules is a solid optimization.The change from module-level to class-level caching with
@classmethodprovides better encapsulation while maintaining the performance benefit of one-time import. The error handling with descriptive message is appropriate.
1625-1656: Excellent optimization using dense indexing for graph coloring!The introduction of
to_color_shape_indexmapping brilliantly reduces the graph size from potentiallymax(selected_shapes) + 1nodes to exactlylen(selected_shapes)nodes. This is particularly effective when shape indices have gaps, significantly improving the performance for complex models like the G1 humanoid (32.3s → 2.6s).The implementation correctly:
- Maps selected shapes to dense indices for graph construction
- Uses dense indices for edge creation
- Maps colors back to original shape indices
This is a key contributor to the reported performance improvements.
1262-1263: Good defensive access pattern for cached modules.Using
self.import_mujoco()to get the cachedmujocomodule and ignoring themujoco_warpreturn when not needed is clean.
1338-1350: Module references properly updated throughout the step method.All references to MuJoCo modules have been correctly updated to use the cached class variables (
self._mujocoandself._mujoco_warp).
1688-1688: Incorrect review comment — access pattern is valid. import_mujoco caches the MuJoCo module on cls._mujoco (def at ~1192–1200) and some code binds the returned module to a local name (e.g., line 1755); update_contacts uses self._mujoco at line 1688 accordingly — no change required.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (6)
newton/examples/example_mujoco.py (1)
326-358: Harden defaults and error handling in create_solver
- Guard unknown robots to raise a clear error.
- Minor: keep color/flag defaults sourced from ROBOT_CONFIGS unless overridden.
Apply:
@staticmethod def create_solver( model, robot, use_mujoco_cpu, solver=None, integrator=None, solver_iteration=None, ls_iteration=None, njmax=None, nconmax=None, ls_parallel=None, ): + if robot not in ROBOT_CONFIGS: + raise ValueError(f"Unknown robot '{robot}'. Valid: {', '.join(sorted(ROBOT_CONFIGS))}")Optional (outside this hunk): let CLI defer to per-robot default by using None.
- parser.add_argument( - "--ls-parallel", default=True, action=argparse.BooleanOptionalAction, help="Use parallel line search." - ) + parser.add_argument( + "--ls-parallel", default=None, action=argparse.BooleanOptionalAction, help="Use parallel line search." + )asv/benchmarks/setup/bench_model.py (2)
53-57: Ensure solver object is released; silence linter paramKeep the solver alive until sync, then drop it; also mark num_envs unused to quiet ARG002.
Apply:
@skip_benchmark_if(wp.get_cuda_device_count() == 0) def time_initialize_solver(self, robot, num_envs): - Example.create_solver(self._model, robot, use_mujoco_cpu=False) + del num_envs # unused; parametrization via setup() + s = Example.create_solver(self._model, robot, use_mujoco_cpu=False) wp.synchronize_device() + del s
94-98: Mirror fix: release solver; silence linter paramSame concerns as KPI method.
Apply:
@skip_benchmark_if(wp.get_cuda_device_count() == 0) def time_initialize_solver(self, robot, num_envs): - Example.create_solver(self._model, robot, use_mujoco_cpu=False) + del num_envs + s = Example.create_solver(self._model, robot, use_mujoco_cpu=False) wp.synchronize_device() + del sOptional (outside this hunk): free temporary model in time_initialize_model to reduce peak mem.
_model = builder.finalize() wp.synchronize_device() +del _modelnewton/_src/solvers/mujoco/solver_mujoco.py (3)
1187-1206: Drop unused noqa; tighten ImportError messageRuff flags PLC0415 noqa here; also keep message concise.
Apply:
def import_mujoco(cls): """Import the MuJoCo Warp dependencies and cache them as class variables.""" if cls._mujoco is None or cls._mujoco_warp is None: try: - import mujoco # noqa: PLC0415 - import mujoco_warp # noqa: PLC0415 + import mujoco + import mujoco_warp cls._mujoco = mujoco cls._mujoco_warp = mujoco_warp except ImportError as e: - raise ImportError( - "MuJoCo backend not installed. Please refer to https://github.com/google-deepmind/mujoco_warp for installation instructions." - ) from e + raise ImportError("MuJoCo backend not installed; see mujoco_warp installation instructions.") from e return cls._mujoco, cls._mujoco_warp
1625-1656: Fix graph visualization nodes; remove unnecessary int(); 0‑based colors
- plot_graph should receive node ids consistent with graph_edges (dense 0..N-1).
- int(len(...)) is redundant.
- Color indexing should start at 0 to preserve 32 usable bits (0..31) for contype.
Apply:
if len(graph_edges) > 0: if visualize_graph: - plot_graph(selected_shapes, graph_edges) + plot_graph(list(range(len(to_color_shape_index))), graph_edges) color_groups = color_graph( - num_nodes=int(len(to_color_shape_index)), + num_nodes=len(to_color_shape_index), graph_edge_indices=wp.array(graph_edges, dtype=wp.int32), balance_colors=False, ) shape_color = np.zeros(model.shape_count, dtype=np.int32) - num_colors = 0 - for group in color_groups: - num_colors += 1 - shape_color[selected_shapes[group]] = num_colors + for color_idx, group in enumerate(color_groups): + shape_color[selected_shapes[group]] = color_idx else: - # no edges in the graph, all shapes can collide with each other + # no edges in the graph; assign a single color shape_color = np.zeros(model.shape_count, dtype=np.int32)
1688-1691: Nit: drop unnecessary int() castThe enum is already an int.
Apply:
- self.mjw_model.opt.cone == int(self._mujoco.mjtCone.mjCONE_PYRAMIDAL), + self.mjw_model.opt.cone == self._mujoco.mjtCone.mjCONE_PYRAMIDAL,
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.gitignore(2 hunks)asv/benchmarks/setup/bench_model.py(4 hunks)newton/_src/solvers/mujoco/solver_mujoco.py(10 hunks)newton/examples/example_mujoco.py(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-19T12:27:30.629Z
Learnt from: preist-nvidia
PR: newton-physics/newton#579
File: newton/examples/example_mujoco.py:350-354
Timestamp: 2025-08-19T12:27:30.629Z
Learning: In Newton examples, there's a distinction between solver parameters and Example class attributes. The Example class can have its own use_mujoco attribute for controlling example-level behavior (like CUDA graphs, rendering logic), while the solver uses use_mujoco_cpu for backend selection. These serve different purposes and should not be conflated during API renames.
Applied to files:
newton/examples/example_mujoco.pynewton/_src/solvers/mujoco/solver_mujoco.py
📚 Learning: 2025-08-19T12:27:30.629Z
Learnt from: preist-nvidia
PR: newton-physics/newton#579
File: newton/examples/example_mujoco.py:350-354
Timestamp: 2025-08-19T12:27:30.629Z
Learning: In Newton examples, Example.use_mujoco is a high-level attribute that controls whether to use MuJoCo solver vs other solvers (like XPBD), while SolverMuJoCo.use_mujoco_cpu controls the backend within MuJoCo (CPU vs Warp). These are separate concepts serving different purposes - the PR rename only applies to the solver parameter, not the Example class attributes.
Applied to files:
newton/examples/example_mujoco.pynewton/_src/solvers/mujoco/solver_mujoco.py
🧬 Code graph analysis (1)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (3)
Example(191-357)create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
54-54: Unused method argument: num_envs
(ARG002)
95-95: Unused method argument: num_envs
(ARG002)
newton/_src/solvers/mujoco/solver_mujoco.py
1196-1196: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
1197-1197: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
1202-1204: Avoid specifying long messages outside the exception class
(TRY003)
1647-1647: Value being cast to int is already an integer
Remove unnecessary int call
(RUF046)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: Run GPU Benchmarks (Pull Request)
- GitHub Check: run-newton-tests / newton-unittests (windows-latest)
🔇 Additional comments (6)
.gitignore (2)
14-14: Add .thumbs/ ignore — LGTMNo concerns. Matches other tool-cache ignores.
41-45: Good: ignoring MuJoCo log and test outputsMUJOCO_LOG.TXT and newton/tests/outputs/ are safe to ignore; header rename is clearer.
newton/examples/example_mujoco.py (1)
235-246: Centralizing solver creation — LGTMCleaner init; preserves separation between Example.use_mujoco and SolverMuJoCo.use_mujoco_cpu (per prior guidance).
asv/benchmarks/setup/bench_model.py (2)
40-44: Model caching in setup — LGTMPrebuilding the model for solver-time KPI is appropriate.
68-74: Fast path cache — LGTMMatches KPI setup; clear and symmetric.
newton/_src/solvers/mujoco/solver_mujoco.py (1)
1263-1264: Module caching usage — LGTMImport once; subsequent calls reuse cached modules. Good for startup time.
Signed-off-by: Eric Heiden <eric-heiden@outlook.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
asv/benchmarks/setup/bench_model.py (2)
40-43: Sync after finalize to avoid polluted solver timings; prep a solver stash.Without a device sync here, outstanding model-finalization work can leak into the first solver timing. Also, we’ll need a place to hold solver refs across repeats to keep destructor work out of the timed region.
Apply:
# finalize model - self._model = builder.finalize() + self._model = builder.finalize() + # ensure model build doesn't overlap with solver timing + wp.synchronize_device() + # keep solvers alive across repeats; freed in teardown + self._solvers = []
68-73: Mirror KPI fixes: sync after finalize and prep solver stash.Same concerns as above for the fast benchmark setup.
Apply:
- def setup(self, robot, num_envs): + def setup(self, robot, num_envs): wp.init() builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - self._model = builder.finalize() + self._model = builder.finalize() + wp.synchronize_device() + self._solvers = []And add:
def teardown(self, robot, num_envs): self._solvers.clear() wp.synchronize_device()
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
asv/benchmarks/setup/bench_model.py(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (3)
Example(191-357)create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
54-54: Unused method argument: num_envs
(ARG002)
95-95: Unused method argument: num_envs
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run GPU Benchmarks (Pull Request)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
- GitHub Check: run-newton-tests / newton-unittests (windows-latest)
|
Looking at the benchmark results from the GPU runner, the error bars are reasonable except for the H1. We might want to take more samples to reduce false positives, but we can also leave it as-is. |
Signed-off-by: Eric Heiden <eric-heiden@outlook.com>
Signed-off-by: Eric Heiden <eric-heiden@outlook.com>
Signed-off-by: Eric Heiden <eric-heiden@outlook.com>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
asv/benchmarks/setup/bench_model.py (2)
71-79: Keep solvers alive across repeats to avoid destructor skew; also silence ARG002.Overwriting
self._solvereach run risks freeing the previous solver inside the next timed call. Retain all created solvers and free them in teardown; mark unused params per Ruff.@skip_benchmark_if(wp.get_cuda_device_count() == 0) -def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() +def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + wp.synchronize_device() + self._solvers.append(solver) -def teardown(self, robot, num_envs): - del self._solver - del self._model +def teardown(self, _robot, _num_envs): # noqa: ARG002 + self._solvers.clear() + del self._model + wp.synchronize_device()
136-144: Retain solvers and cleanly teardown; silence ARG002.@skip_benchmark_if(wp.get_cuda_device_count() == 0) -def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() +def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + wp.synchronize_device() + self._solvers.append(solver) -def teardown(self, robot, num_envs): - del self._solver - del self._model +def teardown(self, _robot, _num_envs): # noqa: ARG002 + self._solvers.clear() + del self._model + wp.synchronize_device()
🧹 Nitpick comments (4)
asv/benchmarks/setup/bench_model.py (4)
50-59: Annotate ASV class attributes with ClassVar to silence RUF012 and clarify intent.These are framework-level class attributes, not per‑instance state. Add typing annotations.
class KpiInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192]) - param_names = ["robot", "num_envs"] - - rounds = 1 - number = 1 - repeat = 3 - min_run_count = 1 - timeout = 3600 + params: ClassVar[tuple[list[str], list[int]]] = ( + ["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], + [4096, 8192], + ) + param_names: ClassVar[list[str]] = ["robot", "num_envs"] + + rounds: ClassVar[int] = 1 + number: ClassVar[int] = 1 + repeat: ClassVar[int] = 3 + min_run_count: ClassVar[int] = 1 + timeout: ClassVar[int] = 3600Add the missing import outside this hunk:
from typing import ClassVar
60-70: Stabilize measurements: create a solver-retention list and sync after model finalize.Avoid overlap from asynchronous model finalize into the first timing iteration; also prep a list for keeping solvers alive across repeats (see next comment).
def setup(self, robot, num_envs): if robot == "h1": # use more samples for H1 to reduce variance self.repeat = 6 - wp.init() + self._solvers = [] + wp.init() builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - self._model = builder.finalize() + self._model = builder.finalize() + wp.synchronize_device()
116-124: Same ClassVar cleanup for FastInitializeSolver.class FastInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256]) - param_names = ["robot", "num_envs"] - - rounds = 1 - number = 1 - repeat = 3 - min_run_count = 1 + params: ClassVar[tuple[list[str], list[int]]] = ( + ["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], + [128, 256], + ) + param_names: ClassVar[list[str]] = ["robot", "num_envs"] + + rounds: ClassVar[int] = 1 + number: ClassVar[int] = 1 + repeat: ClassVar[int] = 3 + min_run_count: ClassVar[int] = 1
125-135: Mirror setup fixes: precreate retention list and sync after finalize.def setup(self, robot, num_envs): if robot == "h1": # use more samples for H1 to reduce variance self.repeat = 6 - wp.init() + self._solvers = [] + wp.init() builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - self._model = builder.finalize() + self._model = builder.finalize() + wp.synchronize_device()
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
asv/benchmarks/setup/bench_model.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (3)
Example(191-357)create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
52-52: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
72-72: Unused method argument: num_envs
(ARG002)
76-76: Unused method argument: robot
(ARG002)
76-76: Unused method argument: num_envs
(ARG002)
118-118: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
137-137: Unused method argument: num_envs
(ARG002)
141-141: Unused method argument: robot
(ARG002)
141-141: Unused method argument: num_envs
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: Run GPU Benchmarks (Pull Request)
- GitHub Check: run-newton-tests / newton-unittests (windows-latest)
- GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
🔇 Additional comments (1)
asv/benchmarks/setup/bench_model.py (1)
154-155: Registry additions look good.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (5)
asv/benchmarks/setup/bench_model.py (5)
116-124: Annotate params with ClassVar (same RUF012).-class FastInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256]) +class FastInitializeSolver: + params: ClassVar[tuple[list[str], list[int]]] = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256])
126-129: Same note on per-param repeat for H1; please verify ASV behavior.
141-144: Teardown: clear retained solvers, sync, and silence ARG002.- def teardown(self, robot, num_envs): - del self._solver - del self._model + def teardown(self, _robot, _num_envs): # noqa: ARG002 + if hasattr(self, "_solvers"): + self._solvers.clear() + del self._model + wp.synchronize_device() + gc.collect()
136-140: Same destructor/timing issue and unused arg nit.- def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() + def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + wp.synchronize_device() + self._solvers.append(solver)
71-75: Avoid destructor skew; keep solver alive through sync and silence ARG002.Overwriting
self._solvereach run can trigger the previous solver’s destructor during timing. Retain instances until teardown; also mark the unused arg.- def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() + def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + wp.synchronize_device() + # retain to keep destructor out of the timing window + self._solvers.append(solver)
🧹 Nitpick comments (4)
asv/benchmarks/setup/bench_model.py (4)
50-59: Annotate params with ClassVar to appease Ruff (RUF012).
paramscontains mutable lists; mark it as a class variable.-class KpiInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192]) +class KpiInitializeSolver: + params: ClassVar[tuple[list[str], list[int]]] = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192])Add the import outside this hunk:
from typing import ClassVar
65-70: Prep retention for solvers created during timing.Initialize a holder so previous solvers aren’t destroyed inside the timed region.
wp.init() + self._solvers = [] builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123)
76-79: Teardown: release out of band, avoid AttributeError, and sync.Ensure frees happen outside timing; also silence unused params.
- def teardown(self, robot, num_envs): - del self._solver - del self._model + def teardown(self, _robot, _num_envs): # noqa: ARG002 + if hasattr(self, "_solvers"): + self._solvers.clear() + del self._model + wp.synchronize_device() + gc.collect()
130-135: Initialize solver retention for Fast bench as well.wp.init() + self._solvers = [] builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123)
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
asv/benchmarks/setup/bench_model.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (3)
Example(191-357)create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
52-52: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
72-72: Unused method argument: num_envs
(ARG002)
76-76: Unused method argument: robot
(ARG002)
76-76: Unused method argument: num_envs
(ARG002)
118-118: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
137-137: Unused method argument: num_envs
(ARG002)
141-141: Unused method argument: robot
(ARG002)
141-141: Unused method argument: num_envs
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: Run GPU Benchmarks (Pull Request)
🔇 Additional comments (2)
asv/benchmarks/setup/bench_model.py (2)
61-64: Confirm ASV honors per-param repeat set in setup().ASV may read
repeatbefore callingsetup(). Please verify H1 actually runs with 6 repeats; if not, consider separate benches or a param-driven repeat.You can confirm by running the bench with verbose output and checking sample counts.
154-155: Registry updates look good.
Signed-off-by: Eric Heiden <eric-heiden@outlook.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
asv/benchmarks/setup/bench_model.py (2)
60-75: Keep solvers alive across iterations; otherwise destructor contaminates timings.Assigning to self._solver each call drops the previous instance during the next call, pulling free work into the timed region. Also, synchronize after model finalize in setup to avoid spillover into the first iteration. Rename unused arg to silence ARG002.
Apply:
@@ def setup(self, robot, num_envs): if robot == "h1": # use more samples for H1 to reduce variance self.number = 30 wp.init() builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model self._model = builder.finalize() + # ensure no pending GPU work from finalize spills into timed region + wp.synchronize_device() + # hold solvers to keep destructor out of timed region + self._solvers = [] @skip_benchmark_if(wp.get_cuda_device_count() == 0) - def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() + def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + wp.synchronize_device() + self._solvers.append(solver) @@ - def teardown(self, robot, num_envs): - del self._solver - del self._model + def teardown(self, _robot, _num_envs): # noqa: ARG002 + if hasattr(self, "_solvers"): + self._solvers.clear() + del self._model + # ensure destructor frees happen outside the timed region entirely + wp.synchronize_device()
125-144: Apply the same solver‑lifetime and sync fixes to FastInitializeSolver.Pre-sync after finalize; retain solvers; guard teardown; silence ARG002.
Apply:
@@ def setup(self, robot, num_envs): if robot == "h1": # use more samples for H1 to reduce variance self.number = 30 wp.init() builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - self._model = builder.finalize() + self._model = builder.finalize() + wp.synchronize_device() + self._solvers = [] @skip_benchmark_if(wp.get_cuda_device_count() == 0) - def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() + def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + wp.synchronize_device() + self._solvers.append(solver) - def teardown(self, robot, num_envs): - del self._solver - del self._model + def teardown(self, _robot, _num_envs): # noqa: ARG002 + if hasattr(self, "_solvers"): + self._solvers.clear() + del self._model + wp.synchronize_device()
🧹 Nitpick comments (4)
asv/benchmarks/setup/bench_model.py (4)
50-59: Annotate class params with ClassVar to satisfy Ruff RUF012.ASV requires class attributes; Ruff flags mutable ones. Add ClassVar and a typing import.
Apply:
@@ -import gc +import gc +from typing import ClassVar @@ -class KpiInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192]) - param_names = ["robot", "num_envs"] +class KpiInitializeSolver: + params: ClassVar = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192]) + param_names: ClassVar = ["robot", "num_envs"]Replicate the same ClassVar annotation for:
- KpiInitializeModel.params/param_names
- FastInitializeModel.params/param_names
- FastInitializeSolver.params/param_names
96-103: Optional: keep models alive during model-init timing to exclude destructor cost.Storing models until teardown avoids mixing teardown into the measured init time (same rationale as solver). Low effort, improves signal.
Apply:
@@ class FastInitializeModel: @@ def setup_cache(self): @@ del model @@ @skip_benchmark_if(wp.get_cuda_device_count() == 0) def time_initialize_model(self, robot, num_envs): builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - _model = builder.finalize() + _model = builder.finalize() wp.synchronize_device() + # retain to keep destructor out of timing window + if not hasattr(self, "_models"): + self._models = [] + self._models.append(_model) @@ def peakmem_initialize_model_cpu(self, robot, num_envs): @@ del model + + def teardown(self, _robot, _num_envs): # noqa: ARG002 + if hasattr(self, "_models"): + self._models.clear() + wp.synchronize_device()
117-124: Mirror ClassVar annotations here, too.Same RUF012 for FastInitializeSolver.
Apply:
-class FastInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256]) - param_names = ["robot", "num_envs"] +class FastInitializeSolver: + params: ClassVar = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256]) + param_names: ClassVar = ["robot", "num_envs"]
41-47: Minor: exclude model destructor from KPI model-init timing (consistency).Optional parity with FastInitializeModel: retain models and free in teardown to avoid mixing teardown into timing.
Apply:
@@ def time_initialize_model(self, robot, num_envs): builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - _model = builder.finalize() + _model = builder.finalize() wp.synchronize_device() + if not hasattr(self, "_models"): + self._models = [] + self._models.append(_model) + + def teardown(self, _robot, _num_envs): # noqa: ARG002 + if hasattr(self, "_models"): + self._models.clear() + wp.synchronize_device()
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
asv/benchmarks/setup/bench_model.py(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (3)
Example(191-357)create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
52-52: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
72-72: Unused method argument: num_envs
(ARG002)
76-76: Unused method argument: robot
(ARG002)
76-76: Unused method argument: num_envs
(ARG002)
83-83: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
118-118: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
137-137: Unused method argument: num_envs
(ARG002)
141-141: Unused method argument: robot
(ARG002)
141-141: Unused method argument: num_envs
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run GPU Benchmarks (Pull Request)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
- GitHub Check: run-newton-tests / newton-unittests (windows-latest)
🔇 Additional comments (3)
asv/benchmarks/setup/bench_model.py (3)
33-35: Sampling knob tweaks look good.repeat=1 and number=3 align with shorter wall time in CI while retaining stability.
85-88: FastInitializeModel sampling changes are fine.Matches KPI classes and keeps runtime reasonable.
154-156: Registry updates look correct.New solver benchmarks are correctly wired.
Signed-off-by: Eric Heiden <eric-heiden@outlook.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
asv/benchmarks/setup/bench_model.py (4)
117-118: Silence RUF012 on mutable class attributes or annotate as ClassVar.Same nit as above for params.
-class FastInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256]) +class FastInitializeSolver: + params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [128, 256]) # noqa: RUF012
126-129: Verify per-param repeat override works here too.Ensure the runner respects the h1-specific repeat change.
#!/bin/bash # Cross-check FastInitializeSolver repeat override behavior alongside KPI class. set -euo pipefail rg -n -C3 'class FastInitializeSolver|repeat = 10' asv/benchmarks/setup/bench_model.py
60-75: Prevent finalize spillover into first timing and avoid destructor skew across iterations.
- Missing sync after builder.finalize() can charge finalize GPU work to the first iteration.
- Storing only self._solver lets previous solvers be destroyed during later timed calls, pulling destructor work into measurements.
- Unused-arg nits (ARG002) in the benchmark signatures.
Apply:
@@ def setup(self, robot, num_envs): if robot == "h1": # use more samples for H1 to reduce variance self.repeat = 10 wp.init() + self._solvers = [] builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - self._model = builder.finalize() + self._model = builder.finalize() + # ensure finalize() work is not charged to the first timed iteration + wp.synchronize_device() @skip_benchmark_if(wp.get_cuda_device_count() == 0) - def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() + def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + # retain to keep destructor out of this timing window + self._solvers.append(solver) + wp.synchronize_device() - def teardown(self, robot, num_envs): - del self._solver - del self._model + def teardown(self, _robot, _num_envs): # noqa: ARG002 + self._solvers.clear() + del self._model + # free outside timed regions + wp.synchronize_device() + gc.collect()Also applies to: 76-79
125-140: Same timing-correctness fixes needed as in KPI class: sync after finalize, retain solvers, fix ARG002.Mirror the corrections to keep measurements clean and reproducible.
@@ def setup(self, robot, num_envs): if robot == "h1": # use more samples for H1 to reduce variance self.repeat = 10 wp.init() + self._solvers = [] builder = Example.create_model_builder(robot, num_envs, randomize=True, seed=123) # finalize model - self._model = builder.finalize() + self._model = builder.finalize() + wp.synchronize_device() @skip_benchmark_if(wp.get_cuda_device_count() == 0) - def time_initialize_solver(self, robot, num_envs): - self._solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) - wp.synchronize_device() + def time_initialize_solver(self, robot, _num_envs): # noqa: ARG002 + solver = Example.create_solver(self._model, robot, use_mujoco_cpu=False) + self._solvers.append(solver) + wp.synchronize_device() - def teardown(self, robot, num_envs): - del self._solver - del self._model + def teardown(self, _robot, _num_envs): # noqa: ARG002 + self._solvers.clear() + del self._model + wp.synchronize_device() + gc.collect()Also applies to: 141-144
🧹 Nitpick comments (1)
asv/benchmarks/setup/bench_model.py (1)
51-52: Silence RUF012 on mutable class attributes or annotate as ClassVar.Ruff flags params as a mutable class attribute. Either annotate with typing.ClassVar or add a per-line noqa.
Apply one of:
-class KpiInitializeSolver: - params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192]) +class KpiInitializeSolver: + params = (["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192]) # noqa: RUF012Or (preferred if you want types):
from typing import ClassVar class KpiInitializeSolver: params: ClassVar[tuple[list[str], list[int]]] = ( ["humanoid", "g1", "h1", "cartpole", "ant", "quadruped"], [4096, 8192], )
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
asv/benchmarks/setup/bench_model.py(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
asv/benchmarks/setup/bench_model.py (1)
newton/examples/example_mujoco.py (3)
Example(191-357)create_model_builder(295-324)create_solver(327-357)
🪛 Ruff (0.12.2)
asv/benchmarks/setup/bench_model.py
52-52: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
72-72: Unused method argument: num_envs
(ARG002)
76-76: Unused method argument: robot
(ARG002)
76-76: Unused method argument: num_envs
(ARG002)
83-83: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
118-118: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
137-137: Unused method argument: num_envs
(ARG002)
141-141: Unused method argument: robot
(ARG002)
141-141: Unused method argument: num_envs
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run GPU Unit Tests on AWS EC2 (Pull Request)
- GitHub Check: Run GPU Benchmarks (Pull Request)
- GitHub Check: run-newton-tests / newton-unittests (windows-latest)
- GitHub Check: run-newton-tests / newton-unittests (ubuntu-latest)
🔇 Additional comments (3)
asv/benchmarks/setup/bench_model.py (3)
34-34: LGTM: number=1 is appropriate for stable per-call timing.
87-87: LGTM: number=1 keeps each timing to a single call (less cross-iteration interference).
154-155: LGTM: new benchmarks are registered.Matches class names; no issues.
Signed-off-by: Alain Denzler <adenzler@nvidia.com> Signed-off-by: Eric Heiden <eric-heiden@outlook.com> Co-authored-by: Alain Denzler <adenzler@nvidia.com> Co-authored-by: adenzler-nvidia <116633880+adenzler-nvidia@users.noreply.github.com>
Signed-off-by: Alain Denzler <adenzler@nvidia.com> Signed-off-by: Eric Heiden <eric-heiden@outlook.com> Co-authored-by: Alain Denzler <adenzler@nvidia.com> Co-authored-by: adenzler-nvidia <116633880+adenzler-nvidia@users.noreply.github.com>
Description
Warm-start timings:
SolverMuJoCo .__init__SolverMuJoCo .__init__SolverMuJoCo .__init__Newton Migration Guide
Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this PR.
docs/migration.rstis up-to dateBefore your PR is "Ready for review"
newton/tests/test_examples.py)pre-commit run -aSummary by CodeRabbit
New Features
Performance
Bug Fixes
Tests
Chores