Fix JTCJ_island missing off-diagonal cone terms#1407
Merged
adenzler-nvidia merged 1 commit intoJun 4, 2026
Merged
Conversation
The dense-Jacobian branch of _update_gradient_JTCJ_island skips an iteration when J[ic1, idof_i] is zero, but in the dim1 != dim2 case the swap-pair contribution hcone * J[ic2, idof_i] * J[ic1, idof_j] still needs to write — the swap-pair-only block under 'if dim1id != dim2id' only ran when J[ic1, i] AND J[ic2, j] were both nonzero. Cells where exactly one of those four J entries was zero silently dropped the corresponding cone contribution. Per cell H[a, b] for an off-diagonal cone pair (dim1, dim2) the contribution should be hcone * (J[ic1, a] * J[ic2, b] + J[ic2, a] * J[ic1, b]) which the monolithic _update_gradient_JTCJ_dense computes correctly (solver.py:2700-2704). This change keeps the original loop (which already handles the first term) and adds a second loop with the same shape but ic1/ic2 swapped, gated on dim1id != dim2id, for the second term. Both loops keep the aggressive J==0 early-skip from the original. The existing scalar _cholesky_factorize_solve_island absorbs the resulting indefinite per-island H via its mid-factorization clamp (s <= 1e-6 -> s = 1e-6), so no existing test fails. The bug becomes visible when the per-island solve uses any Cholesky variant without diagonal clamping (e.g. wp.tile_cholesky on the per-island H sub-block), producing NaN qacc on configurations such as constraints.xml keyframe 2 with elliptic cone + Newton + dense Jacobian.
thowell
approved these changes
Jun 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_update_gradient_JTCJ_islanddrops the swap-pair cone termhcone * J[ic2, i] * J[ic1, j]for thedim1id != dim2idcase whenever any of the four J entries used in the inner block is zero. Per-cell H is then short theJ[ic2, a] * J[ic1, b]half of the off-diagonal cone contribution.dim1id != dim2id. Both loops keep the original aggressiveJ==0early-skip._update_gradient_JTCJ_dense(solver.py:2700-2704).Why no existing test catches this
The scalar
_cholesky_factorize_solve_islandclamps mid-factorization (s <= 1e-6 -> s = 1e-6), absorbing the small indefiniteness the bug leaves in per-island H. It becomes visible the moment the per-island solve uses any Cholesky variant without diagonal clamping (e.g.wp.tile_choleskyon the per-island H sub-block), where it surfaces as NaN qacc — reproducible withconstraints.xmlkeyframe 2 + elliptic cone + Newton + dense Jacobian.Performance (RTX PRO 6000, 3-run averages)
three_humanoids@ nworld=8192, islands ON: 394k → 392k sps (noise; no elliptic islands)aloha_clutter@ nworld=512, islands ON: 20.6k → 20.9k sps (+1.2%)Test plan
solver_test.py(196 passed)