Validate pivot range in linalg.ldl_solve CPU kernel#181032

Closed
qflen wants to merge 1 commit into pytorch:main from qflen:fix-ldl-solve-pivot-validation-163450
Conversation

@qflen
Contributor

@qflen qflen commented Apr 21, 2026

Summary

torch.linalg.ldl_solve forwards user-provided pivots straight to LAPACK's
SYTRS, which writes into unrelated memory when |IPIV(k)| falls outside
[1, N]. The out-of-bounds writes corrupt the heap, so the crash manifests
much later, during tensor teardown, as an allocator abort (the reproducer
from #163450 shows malloc(): unsorted double linked list corrupted).
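
To make the failure mode concrete, here is a rough model of which buffer offsets a pivot-driven row swap would touch. It is only a sketch under stated assumptions: B is treated as an n x nrhs column-major buffer with leading dimension n, and the swap is assumed to touch row |IPIV(k)| in every column. The helper names are hypothetical and this is not the LAPACK or PyTorch code.

```python
def swap_offsets(kp, ldb, nrhs):
    """Flat column-major offsets of 1-based row kp across nrhs columns."""
    return [(kp - 1) + j * ldb for j in range(nrhs)]


def out_of_bounds_offsets(pivot, n, nrhs):
    """Offsets a swap with row |pivot| would touch outside an n x nrhs buffer.

    Assumption: the leading dimension equals n, so the buffer holds
    exactly n * nrhs elements.
    """
    kp = abs(pivot)
    size = n * nrhs
    return [off for off in swap_offsets(kp, n, nrhs) if off < 0 or off >= size]


# n = 3, nrhs = 2: the buffer has 6 elements (offsets 0..5).
print(out_of_bounds_offsets(3, 3, 2))   # in-range pivot
print(out_of_bounds_offsets(5, 3, 2))   # pivot too large
print(out_of_bounds_offsets(0, 3, 2))   # zero pivot
print(out_of_bounds_offsets(-7, 3, 2))  # negative (2x2 block) pivot, too large
```

In this model an oversized pivot can also land inside the buffer but in the wrong column (offset 4 for pivot 5 above), which would silently scramble the right-hand side rather than crash.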

The CPU kernel for linalg.lu_solve already guards against the same class
of bug; this PR ports that sanity check to ldl_solve_kernel. Negative
pivots are accepted since they legally encode 2x2 block pivots, but
|pivot| must still satisfy 1 <= |pivot| <= N.
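
For illustration, the range rule described above can be sketched in plain Python. This is not the C++ added to ldl_solve_kernel; the function name check_ldl_pivots and the error text are hypothetical, and pivots are assumed to be 1-based as in LAPACK.

```python
def check_ldl_pivots(pivots, n):
    """Raise RuntimeError unless every pivot satisfies 1 <= |pivot| <= n.

    Negative values are allowed because they encode 2x2 block pivots;
    only the absolute value is range-checked. Zero is always rejected.
    """
    for k, p in enumerate(pivots):
        if not 1 <= abs(p) <= n:
            raise RuntimeError(
                f"pivot {p} at position {k} violates 1 <= |pivot| <= {n}"
            )


check_ldl_pivots([1, -3, -3], 3)  # valid: includes a 2x2 block encoding

try:
    check_ldl_pivots([1, 4, 3], 3)  # 4 > N, rejected before any solve runs
except RuntimeError as err:
    print(err)
```

Checking before the LAPACK call turns the delayed heap corruption into an immediate, catchable error at the call site.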

Fixes #163450.

Test plan

LAPACK SYTRS writes past the matrix when |IPIV(k)| falls outside [1, N],
which surfaces as heap corruption during tensor teardown. Mirror the
lu_solve CPU kernel's sanity check so out-of-range pivots raise a clean
RuntimeError instead.

Fixes pytorch#163450
@pytorch-bot

pytorch-bot Bot commented Apr 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/181032

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 New Failure

As of commit d26c4f8 with merge base ba36784 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added the release notes: linalg_frontend release notes category label Apr 21, 2026
Collaborator

@albanD albanD left a comment

Thanks

@albanD
Collaborator

albanD commented Apr 22, 2026

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 22, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job has failed: trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (distributed, 1, 3, linux.g4dn.12xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@albanD
Collaborator

albanD commented Apr 23, 2026

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 checks: trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (distributed, 1, 3, linux.g4dn.12xlarge.nvidia.gpu)

@pytorchmergebot
Collaborator

Merge failed

Reason: HTTP Error 503: Service Unavailable

Details for Dev Infra team Raised by workflow job

@albanD
Collaborator

albanD commented Apr 23, 2026

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 checks: trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (distributed, 1, 3, linux.g4dn.12xlarge.nvidia.gpu)

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: linalg_frontend release notes category

Development

Successfully merging this pull request may close these issues.

torch.linalg.ldl_solve aborted heap corruption with wrong input

4 participants