CUDA irfft may be doing unnecessary cloning of input #38413

@ssnl

Context:

  1. In cuFFT, "Out-of-place complex-to-real FFT will overwrite input buffer if custom strides are set by the user."

  2. We therefore had this check in CuFFTPlanCache.h:

    // Note that this is before the actual cloning. This is intentional so we can
    // check for advanced data layout with complex-to-real transform. cuFFT
    // out-of-place complex-to-real transforms with advanced layout may overwrite
    // input, and we need to clone the input.
    //
    // This just needs contiguity in cases except for twosided real-to-complex
    // transform where we won't have simple data layout as output is two sided.
    //
    // See NOTE [ cuFFT Embedded Strides ] in native/cuda/SpectralOps.cu.
    bool simple_layout = !(!complex_input && complex_output && !onesided) &&  // not twosided R2C
                         (clone_input || input.is_contiguous());              // contiguous
    if (!simple_layout && complex_input && !complex_output) {
      clone_input = true;
      simple_layout = true;
    }
  3. Users still reported irfft input being modified on T4 (Multidimensional CUDA irfft modifies input, #34551).
  4. An unconditional input cloning for irfft seemed to have fixed it (Fix input overwriting in irfft, #35219).
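
To make it easier to see which cases the check in item 2 covers, here is a minimal sketch that transcribes its boolean logic into Python and enumerates the complex-to-real cases. The function name `plan_layout` and the parameter `input_is_contiguous` are hypothetical stand-ins for illustration, not PyTorch API. The sketch models only the quoted condition, not cuFFT's actual behavior.

```python
from itertools import product


def plan_layout(complex_input, complex_output, onesided, clone_input, input_is_contiguous):
    """Python transcription of the quoted check (hypothetical helper, not PyTorch API).

    Returns the (clone_input, simple_layout) pair the check would end up with.
    """
    twosided_r2c = (not complex_input) and complex_output and (not onesided)
    simple_layout = (not twosided_r2c) and (clone_input or input_is_contiguous)
    if (not simple_layout) and complex_input and (not complex_output):
        # Complex-to-real with advanced layout: force a clone.
        clone_input = True
        simple_layout = True
    return clone_input, simple_layout


if __name__ == "__main__":
    # Enumerate the complex-to-real (irfft-like) cases only.
    for onesided, contiguous in product([True, False], repeat=2):
        clone, simple = plan_layout(True, False, onesided, False, contiguous)
        print(f"onesided={onesided!s:5} contiguous={contiguous!s:5} "
              f"-> clone_input={clone}, simple_layout={simple}")
```

In this transcription, the check forces a clone for a complex-to-real transform only when the input is non-contiguous. A contiguous input is never cloned by it. Whether the reports in #34551 involved contiguous input is not established here, so this narrows where to look rather than identifying the cause.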

We should figure out why the previous check doesn't detect all the cases, and whether the bug is in our check or in cuFFT. I don't have access to a T4, so I am writing this issue to document the situation in case anyone wants to take a look.

cc @ngimel @mruberry @peterbell10 @VitalyFedyunin

Labels: module: cuda, module: fft, module: performance, triaged
