Skip to content

Fix JTCJ_island missing off-diagonal cone terms#1407

Merged
adenzler-nvidia merged 1 commit into
google-deepmind:mainfrom
adenzler-nvidia:adenzler/fix-jtcj-island-cone
Jun 4, 2026
Merged

Fix JTCJ_island missing off-diagonal cone terms#1407
adenzler-nvidia merged 1 commit into
google-deepmind:mainfrom
adenzler-nvidia:adenzler/fix-jtcj-island-cone

Conversation

@adenzler-nvidia

Copy link
Copy Markdown
Collaborator

Summary

  • The dense-Jacobian branch of _update_gradient_JTCJ_island drops the swap-pair cone term hcone * J[ic2, i] * J[ic1, j] for the dim1id != dim2id case whenever any of the four J entries used in the inner block is zero. Per-cell H is then short the J[ic2, a] * J[ic1, b] half of the off-diagonal cone contribution.
  • Replace the conditional swap-pair block with a separate full loop for the second rank-1 term, gated on dim1id != dim2id. Both loops keep the original aggressive J==0 early-skip.
  • Matches the value computed by monolithic _update_gradient_JTCJ_dense (solver.py:2700-2704).

Why no existing test catches this

The scalar _cholesky_factorize_solve_island clamps mid-factorization (s <= 1e-6 -> s = 1e-6), absorbing the small indefiniteness the bug leaves in per-island H. It becomes visible the moment the per-island solve uses any Cholesky variant without diagonal clamping (e.g. wp.tile_cholesky on the per-island H sub-block), where it surfaces as NaN qacc — reproducible with constraints.xml keyframe 2 + elliptic cone + Newton + dense Jacobian.

Performance (RTX PRO 6000, 3-run averages)

  • three_humanoids @ nworld=8192, islands ON: 394k → 392k sps (noise; no elliptic islands)
  • aloha_clutter @ nworld=512, islands ON: 20.6k → 20.9k sps (+1.2%)

Test plan

  • solver_test.py (196 passed)

The dense-Jacobian branch of _update_gradient_JTCJ_island skips an
iteration when J[ic1, idof_i] is zero, but in the dim1 != dim2 case
the swap-pair contribution hcone * J[ic2, idof_i] * J[ic1, idof_j]
still needs to write — the swap-pair-only block under 'if dim1id !=
dim2id' only ran when J[ic1, i] AND J[ic2, j] were both nonzero. Cells
where exactly one of those four J entries was zero silently dropped
the corresponding cone contribution.

Per cell H[a, b] for an off-diagonal cone pair (dim1, dim2) the
contribution should be
  hcone * (J[ic1, a] * J[ic2, b] + J[ic2, a] * J[ic1, b])
which the monolithic _update_gradient_JTCJ_dense computes correctly
(solver.py:2700-2704). This change keeps the original loop (which
already handles the first term) and adds a second loop with the same
shape but ic1/ic2 swapped, gated on dim1id != dim2id, for the second
term. Both loops keep the aggressive J==0 early-skip from the
original.

The existing scalar _cholesky_factorize_solve_island absorbs the
resulting indefinite per-island H via its mid-factorization clamp
(s <= 1e-6 -> s = 1e-6), so no existing test fails. The bug becomes
visible when the per-island solve uses any Cholesky variant without
diagonal clamping (e.g. wp.tile_cholesky on the per-island H
sub-block), producing NaN qacc on configurations such as
constraints.xml keyframe 2 with elliptic cone + Newton + dense
Jacobian.
@adenzler-nvidia adenzler-nvidia requested a review from thowell June 4, 2026 06:59
@adenzler-nvidia adenzler-nvidia merged commit 26e3b00 into google-deepmind:main Jun 4, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants