refactor solver by thowell · Pull Request #1395 · google-deepmind/mujoco_warp

thowell · 2026-06-01T17:34:49Z

small refactor to update solver.py and start unifying the monolithic and island solvers a bit

1. Shared `_eval_constraint` function

The inline constraint-evaluation logic that was duplicated in both the monolithic (_update_constraint_efc) and island (_update_constraint_efc_island) kernels is extracted into a single @wp.func called _eval_constraint. It handles all constraint types (equality, friction, elliptic cone, inequality) and returns a vec3 of (force, state, cost). Both kernels now call this shared function instead of reimplementing the logic.

2. Unify `_update_constraint_init_cost` across monolithic and island paths

The island-specific update_constraint_init_cost_island kernel is removed. Both paths now use the same _update_constraint_init_cost kernel — the island path calls it with flattened (.reshape(-1)) arrays and dim=d.nworld * m.ntree.

3. Unify `_solve_zero_search_dot` across monolithic and island paths

Same pattern as above: the island path previously used ctx.search_dot.zero_(), which unconditionally zeroed all islands including converged ones. Now it reuses the monolithic _solve_zero_search_dot kernel (which skips done islands) via .reshape(-1).

4. Reuse `_eval_elliptic` in island linesearch

_eval_elliptic_cost_island was inlining the elliptic cone cost evaluation. It now delegates to the existing _eval_elliptic helper, and similarly uses _eval_frictionloss_pt_3alphas / _eval_pt_direct_3alphas instead of three separate calls.

5. Prefix all non-public kernels/functions with `_`

Every kernel and host-level function that isn't part of the public API is renamed with a leading underscore (e.g., solve_init_search → _solve_init_search, linesearch_island → _linesearch_kernel_island, _solve_islands → _solve_island, etc.). ~30 kernels and functions are affected.

this pr

mjwarp-testspeed benchmarks/humanoid/humanoid.xml --nworld=8192 --nconmax=24 --njmax=64
Loading model from: benchmarks/humanoid/humanoid.xml...

Model
  nq: 28 nv: 27 nu: 21 nbody: 17 ngeom: 20
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: False
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 196608 njmax: 64
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 4.68 s
Total simulation time: 1.69 s
Total steps per second: 4,836,274
Total realtime factor: 24,181.37 x
Total time per step: 206.77 ns
Total converged worlds: 8192 / 8192

mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192
Loading model from: benchmarks/humanoid/three_humanoids.xml...

Model
  nq: 84 nv: 81 nu: 63 nbody: 49 ngeom: 58
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: True
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 819200 njmax: 192
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 0.67 s
Total simulation time: 11.04 s
Total steps per second: 742,041
Total realtime factor: 3,710.20 x
Total time per step: 1347.63 ns
Total converged worlds: 8192 / 8192

193404a

mjwarp-testspeed benchmarks/humanoid/humanoid.xml --nworld=8192 --nconmax=24 --njmax=64
Loading model from: benchmarks/humanoid/humanoid.xml...

Model
  nq: 28 nv: 27 nu: 21 nbody: 17 ngeom: 20
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: False
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 196608 njmax: 64
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 0.44 s
Total simulation time: 1.70 s
Total steps per second: 4,819,429
Total realtime factor: 24,097.15 x
Total time per step: 207.49 ns
Total converged worlds: 8192 / 8192

mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192
Loading model from: benchmarks/humanoid/three_humanoids.xml...

Model
  nq: 84 nv: 81 nu: 63 nbody: 49 ngeom: 58
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: True
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 819200 njmax: 192
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 0.67 s
Total simulation time: 11.10 s
Total steps per second: 737,937
Total realtime factor: 3,689.68 x
Total time per step: 1355.13 ns
Total converged worlds: 8192 / 8192

adenzler-nvidia

Nice refactor — the _eval_constraint extraction and the monolithic/island unification look correct to me (traced the per-branch force/state/cost equivalence, the if cost != 0.0 gating, and the padding-island done=True change; solver_test.py passes locally). Two minor non-blocking nits in _eval_elliptic_cost_island, otherwise LGTM.

adenzler-nvidia

Both nits from the previous review are addressed:

_eval_elliptic now takes mu as an argument, removing the double-compute.
_eval_elliptic_cost_island reuses the u0/v0 locals when building quad1.

The _eval_constraint extraction and monolithic/island unification look correct. LGTM — please resolve the conflicts with main before merging.

thowell requested a review from adenzler-nvidia June 1, 2026 17:35

adenzler-nvidia reviewed Jun 3, 2026

View reviewed changes

Comment thread mujoco_warp/_src/solver.py Outdated

Comment thread mujoco_warp/_src/solver.py

thowell force-pushed the solver_refactor branch 2 times, most recently from 15f5394 to c8631a5 Compare June 3, 2026 10:39

thowell requested a review from adenzler-nvidia June 3, 2026 10:40

adenzler-nvidia approved these changes Jun 3, 2026

View reviewed changes

thowell force-pushed the solver_refactor branch from c8631a5 to adc097a Compare June 3, 2026 14:52

refactor solver

8bdf6dd

thowell force-pushed the solver_refactor branch from adc097a to 8bdf6dd Compare June 3, 2026 15:00

thowell merged commit 2da62c5 into google-deepmind:main Jun 3, 2026
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor solver#1395

refactor solver#1395
thowell merged 1 commit into
google-deepmind:mainfrom
thowell:solver_refactor

thowell commented Jun 1, 2026

Uh oh!

adenzler-nvidia left a comment

Uh oh!

Uh oh!

Uh oh!

adenzler-nvidia left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

thowell commented Jun 1, 2026

1. Shared _eval_constraint function

2. Unify _update_constraint_init_cost across monolithic and island paths

3. Unify _solve_zero_search_dot across monolithic and island paths

4. Reuse _eval_elliptic in island linesearch

5. Prefix all non-public kernels/functions with _

Uh oh!

adenzler-nvidia left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

adenzler-nvidia left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Shared `_eval_constraint` function

2. Unify `_update_constraint_init_cost` across monolithic and island paths

3. Unify `_solve_zero_search_dot` across monolithic and island paths

4. Reuse `_eval_elliptic` in island linesearch

5. Prefix all non-public kernels/functions with `_`