[MPS] Better error checking for FFT ops by malfet · Pull Request #166272 · pytorch/pytorch

malfet · 2025-10-26T22:00:47Z

Stack from ghstack (oldest at bottom):

Namely:
- Error out rather than crash when the `out` tensor has an unexpected dtype.
- Resize the output tensor to the expected size in the `_out` operation, to prevent a crash when a tensor of an unexpected size is passed.
- Preserve symbolic shapes whenever possible.

Test plan: Run `python test_ops.py -v -k test_out_warning_fft_hfft_mps` for the MPS device; without this change it crashes with `Error: Invalid KernelDAG, equalShape for destination failed`. Run `python ../test/test_ops.py -v -k test_dtypes_stft_mps`; without this change it crashes with `A complex mlir::Type does not have a corresponding complex MPSDataType` when the input dtype is bfloat16.
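As a rough illustration of the `out`-resizing case, here is a minimal sketch that is not taken from the PR. It assumes PyTorch is installed and falls back to the CPU when MPS is unavailable. The helper name `hfft_into_mismatched_out` is hypothetical. On an MPS build without this change, the call is expected to abort the process rather than return.

```python
import warnings

try:
    import torch
except ImportError:  # let the sketch load even when PyTorch is absent
    torch = None


def hfft_into_mismatched_out():
    """Run torch.fft.hfft with a wrongly sized `out` tensor; return its final shape."""
    if torch is None:
        return None
    has_mps = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
    device = "mps" if has_mps else "cpu"
    # Complex input of length 10: hfft defaults to n = 2 * (10 - 1) = 18 real outputs.
    x = torch.rand(10, dtype=torch.cfloat, device=device)
    # Deliberately wrong size; the op is expected to resize it (with a warning).
    out = torch.empty(3, device=device)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence the resize warning for this demo
        torch.fft.hfft(x, out=out)
    return tuple(out.shape)


result = hfft_into_mismatched_out()
print(result)
```

The resize warning is suppressed here only to keep the output quiet; `test_out_warning_fft_hfft_mps` is the test that actually checks for it.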


pytorch-bot · 2025-10-26T22:00:50Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166272

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 52 Pending

As of commit 76c1bb5 with merge base d049ed2 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


malfet · 2025-10-28T01:30:03Z

@pytorchbot merge -f "Lint is green, I'm pretty sure MPS should be safe as well"

pytorchmergebot · 2025-10-28T01:31:36Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Raise an exception, as it's meaningless and results in segfault otherwise:

```
% python -c "import torch;torch.rand(10, dtype=torch.cfloat, device='mps').amax()"
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: error: 'mps.reduction_max' op operand #0 must be tensor of mps native type values, but got 'tensor<10xcomplex<f32>>'
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: note: see current operation: %2 = "mps.reduction_max"(%arg0, %1) <{keep_dims, propagate_nans}> : (tensor<10xcomplex<f32>>, tensor<1xsi32>) -> tensor<1xcomplex<f32>>
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: error: 'mps.reduction_max' op operand #0 must be tensor of mps native type values, but got 'tensor<10xcomplex<f32>>'
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: note: see current operation: %2 = "mps.reduction_max"(%arg0, %1) <{keep_dims, propagate_nans}> : (tensor<10xcomplex<f32>>, tensor<1xsi32>) -> tensor<1xcomplex<f32>>
/AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:1347: failed assertion `original module failed verification'
zsh: abort      python -c
```

To be tested by `test_ops.py`

Pull Request resolved: #166214
Approved by: https://github.com/dcci, https://github.com/kulinseth, https://github.com/Skylion007
ghstack dependencies: #166272
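One way to observe the intended behavior from Python, sketched under assumptions: the helper `complex_amax_raises` is a hypothetical name, and it defaults to the CPU so that it is safe to run anywhere. Pass `"mps"` only on a build that includes the fix, since an unfixed build aborts the interpreter instead of raising.

```python
try:
    import torch
except ImportError:  # let the sketch load even when PyTorch is absent
    torch = None


def complex_amax_raises(device="cpu"):
    """Return True if amax() on a complex tensor raises, False if it returns a value.

    Returns None when PyTorch is not installed.
    """
    if torch is None:
        return None
    x = torch.rand(10, dtype=torch.cfloat, device=device)
    try:
        x.amax()
    except (RuntimeError, TypeError):
        # A catchable Python exception is the behavior the commit message asks for.
        return True
    return False


outcome = complex_amax_raises("cpu")
print(outcome)
```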


Namely, error out rather than crash when out dtype is of an unexpected type

Also, resize output tensor to the expected size in `_out` operation, to prevent crash when tensor of an unexpected size is passed.

Test plan: Run `test_ops.py` for MPS device

ghstack-source-id: 2e3f925
Pull Request resolved: pytorch/pytorch#166272

Update

05787b0

[ghstack-poisoned]

malfet requested a review from kulinseth as a code owner October 26, 2025 22:00

pytorch-bot Bot added the ciflow/mps and release notes: mps labels Oct 26, 2025

Update

b3921d4

[ghstack-poisoned]

kulinseth approved these changes Oct 27, 2025

View reviewed changes

Skylion007 reviewed Oct 27, 2025

View reviewed changes

Comment thread aten/src/ATen/native/mps/operations/FastFourierTransform.mm Outdated

malfet added 2 commits October 27, 2025 18:02

Update

1ba7974

[ghstack-poisoned]

Update

76c1bb5

[ghstack-poisoned]

pytorchmergebot added the merging label Oct 28, 2025

pytorchmergebot closed this in add37ba Oct 28, 2025

pytorchmergebot added Merged and removed merging labels Oct 28, 2025

github-actions Bot deleted the gh/malfet/574/head branch November 29, 2025 02:18

malfet wants to merge 4 commits into gh/malfet/574/base from gh/malfet/574/head