[Codegen Assert] no broadcast allowed for output TensorView #95

@jjsjann123

❓ Questions and Help

Broadcasting is not allowed on an output TensorView. A check already detects this case and reports: [output_tv] cannot be registered as an output as it has a broadcast axis

Current concern on this:

The code snippet below causes a TORCH_CHECK to fail.

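// 1-D input tensor.
TensorView *t1 = makeDummyTensor(1);
// Broadcast t1 to 2-D; dimension 1 is the new broadcast axis.
TensorView *t2 = broadcast(t1, {false, true});
fusion.addInput(t1);
// Fails the check: t2 has a broadcast axis.
fusion.addOutput(t2);
// ...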

If we explicitly mark t2 as broadcast on dimension 1, we assume that its stride along that dimension is 0, because broadcast elements map to the same physical memory location.
However, since t2 is an output tensor, its strides are passed to the kernel as inputs at runtime. The generated kernel would look something like this:

void kernel(Tensor<float, 1> T1, Tensor<float, 2> T2) {
   // ...
}

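Hence, marking an I/O TensorView as broadcast violates the contract that I/O tensors are provided at runtime: the kernel would be generated assuming a stride of 0, while the tensor actually passed in may have any stride. We can't generate a safe kernel that behaves the way a user of the generated code would expect.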
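
To make the stride concern concrete, here is a minimal host-side sketch in plain C++. It is not the actual codegen output: `Tensor2D`, `fillAssumingBroadcastStride`, `fillUsingRuntimeStrides`, and `countUnwritten` are hypothetical names made up for illustration. It assumes the caller allocates the output as an ordinary contiguous buffer, and compares a loop that indexes dimension 1 with a hard-coded stride of 0 against one that honors the strides supplied at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D kernel argument: the data pointer, sizes
// and strides are all supplied by the caller at runtime.
struct Tensor2D {
  float* data;
  std::size_t size[2];
  std::size_t stride[2];
};

// Writes in[i] to every (i, j), but indexes the output as if dimension 1
// had stride 0, i.e. it trusts the broadcast annotation and ignores stride[1].
void fillAssumingBroadcastStride(const float* in, Tensor2D out) {
  for (std::size_t i = 0; i < out.size[0]; ++i) {
    for (std::size_t j = 0; j < out.size[1]; ++j) {
      out.data[i * out.stride[0] + j * 0] = in[i];
    }
  }
}

// Same computation, but honoring the strides provided at runtime.
void fillUsingRuntimeStrides(const float* in, Tensor2D out) {
  for (std::size_t i = 0; i < out.size[0]; ++i) {
    for (std::size_t j = 0; j < out.size[1]; ++j) {
      out.data[i * out.stride[0] + j * out.stride[1]] = in[i];
    }
  }
}

// Runs one of the two loops on a contiguous rows x cols output (strides
// {cols, 1}) pre-filled with a sentinel, and returns how many elements
// were never written.
std::size_t countUnwritten(bool assumeBroadcast, std::size_t rows,
                           std::size_t cols) {
  assert(rows > 0 && cols > 0);
  const float sentinel = -1.0f;

  // Input values are non-negative, so they never collide with the sentinel.
  std::vector<float> in(rows);
  for (std::size_t i = 0; i < rows; ++i) {
    in[i] = static_cast<float>(i);
  }

  std::vector<float> buffer(rows * cols, sentinel);
  Tensor2D out{buffer.data(), {rows, cols}, {cols, 1}};

  if (assumeBroadcast) {
    fillAssumingBroadcastStride(in.data(), out);
  } else {
    fillUsingRuntimeStrides(in.data(), out);
  }

  std::size_t unwritten = 0;
  for (float v : buffer) {
    if (v == sentinel) {
      ++unwritten;
    }
  }
  return unwritten;
}
```

In this sketch, a 3 x 4 contiguous output filled under the stride-0 assumption only has column 0 written, while the version that reads the runtime strides fills every element. Both would agree only if the caller happened to pass an output whose stride along dimension 1 really is 0, which the kernel cannot rely on.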

If we later find use cases where broadcasting on an output would save memory bandwidth in the generated kernel, we can revisit this topic.
