Fix vec validation on expanded broadcasting#2044
Conversation
| testValidate(&fusion, cg_outputs, {input1}, {input1}, __LINE__, __FILE__); | ||
| } | ||
|
|
||
| TEST_F(NVFuserTest, FusionVectorizeRepro_CUDA) { |
There was a problem hiding this comment.
Please add this:
// Repro of issue #1843
| auto tv7 = sum(tv0, {1}, true); | ||
| auto tv_exp = | ||
| expand(tv7, {tv0->axis(0)->extent(), IrBuilder::create<Int>(32128)}); | ||
| auto tv3 = exp(tv1); | ||
| auto tv8 = mul(tv3, tv_exp); | ||
| auto tv13 = sub(tv0, tv8); | ||
| fusion->addOutput(tv13); |
There was a problem hiding this comment.
Does this actually reproduce the issue? It seems this PR addresses an issue with validation vectorization of input and output tensors with an expanded broadcast IterDomain. This test has an expanded broadcast, but that isn't in any of input or output tensors. Am I missing anything?
There was a problem hiding this comment.
Haha, the test is tricky, there's a fusion segmentation happening right at the expand op.
I tried to get a repro without that segmentation, somehow that doesn't trigger the issue cold_sweat
There was a problem hiding this comment.
Hmm. Can you please add that as a comment to the test? It'd be great if we could understand why it wouldn't fail without segmentation.
| const auto size = aten_tensor.sizes().at(i); | ||
| auto root_id = tv->getMaybeRFactorDomain()[i]; | ||
| const auto is_expanded_broadcasting = | ||
| root_id->isBroadcast() && root_id->hasExpandedExtent(); |
There was a problem hiding this comment.
Do we need to check for expanded broadcast here? I was under the impression that even a normal broadcast with an arbitrary stride should also pass the test, since we never index into broadcast dimension and it shouldn't break vectorization.
There was a problem hiding this comment.
Non-expanded broadcasting is covered by the size == 1 check below. I keep "expanded broadcast" a separate case because I want to check that, if it is an expanded broadcast, then it needs to have stride 0. But for regular broadcast, there is no such limitation.
Maybe I did something stupid when I was trying to repro it, but having Hopefully Xiang can easily get a repro with a simplified definition. 🤞 |
- Make `TensorViewBuilder` able to generate expanded tensor. - Fixes a problem on the contiguity found while working on #2044. Note that an expanded dimension and the dimension before can never be contiguous. Without fixing this, the added test will generate silent wrong result. - Validate contiguity of expanded dimensions during lowering
|
Noticed that this one hasn't been merged yet. Would be nice to get it. 🙇 |
csarofeen
left a comment
There was a problem hiding this comment.
Test references the issue at hand mentioning segmentation requirement for reproduction. Merging as is.
|
@zasdfgbnm to rebase when you have a chance. |
- Make `TensorViewBuilder` able to generate expanded tensor. - Fixes a problem on the contiguity found while working on csarofeen/pytorch#2044. Note that an expanded dimension and the dimension before can never be contiguous. Without fixing this, the added test will generate silent wrong result. - Validate contiguity of expanded dimensions during lowering
Fixes #1843. Thanks, @jjsjann123 for the test and discussion.