❓ Questions and Help
Broadcast is not allowed on an output TensorView. There is a check that detects this and reports: `[output_tv] cannot be registered as an output as it has a broadcast axis`.
The current concern:
The code snippet below triggers a `TORCH_CHECK` failure:

```cpp
TensorView* t1 = makeDummyTensor(1);
TensorView* t2 = broadcast(t1, {false, true});
fusion.addInput(t1);
fusion.addOutput(t2);
// ...
```
If we explicitly mark `t2` as broadcast on dimension 1, we assume its corresponding stride is 0 (because broadcast elements map to the same physical memory location).
However, since `t2` is an output tensor, its strides are passed to the kernel at runtime (the generated kernel would look something like this):

```cpp
void kernel(Tensor<float, 1> T1, Tensor<float, 2> T2) {
  // ...
}
```
Hence, marking an I/O TensorView as broadcast violates the contract that I/O tensor strides are provided at runtime. We cannot generate a safe kernel that behaves as a user of the generated code would expect.
If we later find use cases where broadcasting on an output saves memory bandwidth in the generated kernel, we can revisit this topic.