Add stft option to align window for center = false#145324
Add stft option to align window for center = false#145324jackzhxng merged 5 commits intojz/stft-old-fcfrom
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/145324
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 80e461e with merge base b2c89bc ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Attention! native_functions.yaml was changedIf you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info. Caused by: |
254fdc5 to
e386693
Compare
e264167 to
4536cfd
Compare
e9c4dff to
aa60e40
Compare
aa60e40 to
2293218
Compare
2293218 to
324120b
Compare
|
cc @philgzl, I would highly value your opinion here! |
|
I think it looks fine. As discussed here it could be that padding after/before striding is equivalent after this fix. Although in the code snippet I wrote there the results are not equal (but again I didn't investigate much). |
|
|
||
| # Overload without center & pad mode, needed for forward-compatibility | ||
| - func: stft(Tensor self, int n_fft, int? hop_length=None, int? win_length=None, Tensor? window=None, bool normalized=False, bool? onesided=None, bool? return_complex=None) -> Tensor | ||
| - func: stft(Tensor self, int n_fft, int? hop_length=None, int? win_length=None, Tensor? window=None, bool normalized=False, bool? onesided=None, bool? return_complex=None, bool? align_to_window=None) -> Tensor |
There was a problem hiding this comment.
I checked in with @rgommers and his suggestion for another way to specify the mode is to instead do str? align_mode=None which will allow for different alignment modes in the future. This would then match the API proposal in librosa: librosa/librosa#596 (comment)
Ralf also said he'd ping a scipy expert to come take a look :)
There was a problem hiding this comment.
Note that I'm not super attached to my suggestion, but in general when one has multiple modes a string option is cleaner than multiple boolean options that interact.
Ralf also said he'd ping a scipy expert to come take a look :)
@DietBru I thought this may interest you. I looked at scipy.signal.ShortTimeFFT as a reference, and don't think we have this mode there - but I could be wrong, it's not trivial to be sure given that there are quite a few modes and they don't match between SciPy and PyTorch.
There was a problem hiding this comment.
Taking the liberty to chip in here.
don't think we have this mode there
Probably because align_to_window=True should be the default behavior. Without it, the first samples are discarded when center=False and n_fft > win_length. I don't see why anyone wouldn't use this other than preserving BC.
There was a problem hiding this comment.
In that case, it might not be horrible to have an awful name, with the intention of flipping the default after a deprecation cycle. IDK, @dvorjackz, I'll defer to your choice.
There was a problem hiding this comment.
Concerning ShortTimeFFT:
The default behavior is to center the first window slice over the first signal sample, as illustrated here. This also incorporates a phase shift, to ensure that the phase of each STFT slice is correct. Furthermore, ShortTimeFFT.stft() provides a parameter k_offset to shift the input signal.
In summary: The parameter k_offset and the phase shift allow to arbitrarily set window alignments on the signal, which makes the ShortTimeFFT much more flexible.
I am not sure I can be of much help here. The challenge with aligning the window is to ensure consistent behavior at the borders—especially when it is desired to recreate the original signal with the ISTFT (one of the main motivations for the ShortTimeFFT implementation) .
|
Just checking, is there a OpInfo for this op that needs to be updated? |
|
Yes please |
None yet it seems |
7051ec8 to
bd9c45b
Compare
iseeyuan
left a comment
There was a problem hiding this comment.
LGTM. Thanks @dvorjackz !
Long overdue follow-up on https://github.com/pytorch/pytorch/pull/73432/files#diff-5f3d4caa0693a716fc46fd7f6339312f1b5f0bf89e3a3ff58e9dc13a9486b17aR719 Onnx stft doesn't support centering, [and all of the existing tests are for center = False](https://github.com/pytorch/pytorch/blob/main/test/onnx/test_pytorch_onnx_onnxruntime.py#L8026). I will open a follow-up issue to address this, this is just a nice-to-have. Pr chain: - -> [Advance past fc window for stft center #145437](#145437) - [Add stft option to align window for center = false #145324](#145324) - [Add istft option to align window for center = false](#145510) Pull Request resolved: #145437 Approved by: https://github.com/justinchuby, https://github.com/iseeyuan
Adds a flag for aligning the start of the window to the start of the signal when center = false (aka window-based padding). The same flag was proposed a while ago for Librosa as well.
For internal reasons, we need to add this behavior to the op and this flag allows us to do so while preserving bc compatibility
Pr chain: