Skip to content

refactor solver#1395

Merged
thowell merged 1 commit into
google-deepmind:mainfrom
thowell:solver_refactor
Jun 3, 2026
Merged

refactor solver#1395
thowell merged 1 commit into
google-deepmind:mainfrom
thowell:solver_refactor

Conversation

@thowell

@thowell thowell commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

small refactor to update solver.py and start unifying the monolithic and island solvers a bit

1. Shared _eval_constraint function

The inline constraint-evaluation logic that was duplicated in both the monolithic (_update_constraint_efc) and island (_update_constraint_efc_island) kernels is extracted into a single @wp.func called _eval_constraint. It handles all constraint types (equality, friction, elliptic cone, inequality) and returns a vec3 of (force, state, cost). Both kernels now call this shared function instead of reimplementing the logic.

2. Unify _update_constraint_init_cost across monolithic and island paths

The island-specific update_constraint_init_cost_island kernel is removed. Both paths now use the same _update_constraint_init_cost kernel — the island path calls it with flattened (.reshape(-1)) arrays and dim=d.nworld * m.ntree.

3. Unify _solve_zero_search_dot across monolithic and island paths

Same pattern as above: the island path previously used ctx.search_dot.zero_(), which unconditionally zeroed all islands including converged ones. Now it reuses the monolithic _solve_zero_search_dot kernel (which skips done islands) via .reshape(-1).

4. Reuse _eval_elliptic in island linesearch

_eval_elliptic_cost_island was inlining the elliptic cone cost evaluation. It now delegates to the existing _eval_elliptic helper, and similarly uses _eval_frictionloss_pt_3alphas / _eval_pt_direct_3alphas instead of three separate calls.

5. Prefix all non-public kernels/functions with _

Every kernel and host-level function that isn't part of the public API is renamed with a leading underscore (e.g., solve_init_search_solve_init_search, linesearch_island_linesearch_kernel_island, _solve_islands_solve_island, etc.). ~30 kernels and functions are affected.


this pr

mjwarp-testspeed benchmarks/humanoid/humanoid.xml --nworld=8192 --nconmax=24 --njmax=64
Loading model from: benchmarks/humanoid/humanoid.xml...

Model
  nq: 28 nv: 27 nu: 21 nbody: 17 ngeom: 20
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: False
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 196608 njmax: 64
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 4.68 s
Total simulation time: 1.69 s
Total steps per second: 4,836,274
Total realtime factor: 24,181.37 x
Total time per step: 206.77 ns
Total converged worlds: 8192 / 8192
mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192
Loading model from: benchmarks/humanoid/three_humanoids.xml...

Model
  nq: 84 nv: 81 nu: 63 nbody: 49 ngeom: 58
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: True
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 819200 njmax: 192
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 0.67 s
Total simulation time: 11.04 s
Total steps per second: 742,041
Total realtime factor: 3,710.20 x
Total time per step: 1347.63 ns
Total converged worlds: 8192 / 8192

193404a

mjwarp-testspeed benchmarks/humanoid/humanoid.xml --nworld=8192 --nconmax=24 --njmax=64
Loading model from: benchmarks/humanoid/humanoid.xml...

Model
  nq: 28 nv: 27 nu: 21 nbody: 17 ngeom: 20
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: False
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 196608 njmax: 64
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 0.44 s
Total simulation time: 1.70 s
Total steps per second: 4,819,429
Total realtime factor: 24,097.15 x
Total time per step: 207.49 ns
Total converged worlds: 8192 / 8192
mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192
Loading model from: benchmarks/humanoid/three_humanoids.xml...

Model
  nq: 84 nv: 81 nu: 63 nbody: 49 ngeom: 58
Option
  integrator: EULER
  cone: PYRAMIDAL
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: True
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 819200 njmax: 192
Rolling out 1000 steps at dt = 0.005...

Summary for 8192 parallel rollouts

Total JIT time: 0.67 s
Total simulation time: 11.10 s
Total steps per second: 737,937
Total realtime factor: 3,689.68 x
Total time per step: 1355.13 ns
Total converged worlds: 8192 / 8192

@thowell thowell requested a review from adenzler-nvidia June 1, 2026 17:35

@adenzler-nvidia adenzler-nvidia left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactor — the _eval_constraint extraction and the monolithic/island unification look correct to me (traced the per-branch force/state/cost equivalence, the if cost != 0.0 gating, and the padding-island done=True change; solver_test.py passes locally). Two minor non-blocking nits in _eval_elliptic_cost_island, otherwise LGTM.

Comment thread mujoco_warp/_src/solver.py Outdated
Comment thread mujoco_warp/_src/solver.py
@thowell thowell force-pushed the solver_refactor branch 2 times, most recently from 15f5394 to c8631a5 Compare June 3, 2026 10:39
@thowell thowell requested a review from adenzler-nvidia June 3, 2026 10:40

@adenzler-nvidia adenzler-nvidia left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both nits from the previous review are addressed:

  • _eval_elliptic now takes mu as an argument, removing the double-compute.
  • _eval_elliptic_cost_island reuses the u0/v0 locals when building quad1.

The _eval_constraint extraction and monolithic/island unification look correct. LGTM — please resolve the conflicts with main before merging.

@thowell thowell force-pushed the solver_refactor branch from c8631a5 to adc097a Compare June 3, 2026 14:52
@thowell thowell force-pushed the solver_refactor branch from adc097a to 8bdf6dd Compare June 3, 2026 15:00
@thowell thowell merged commit 2da62c5 into google-deepmind:main Jun 3, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants