Add expand and expandAs support in arith and fusion executor.#1739
csarofeen merged 4 commits into broadcast_cleanup
Conversation
```cpp
  iter_types[i] = dom[i]->getIterType();
}
extent_vals[i] = promoteSize(extent_vals[i], dom[i]->extent());
iter_types[i] = dom[i]->getIterType();
```
For extra safety, having something below would make me feel better:
```cpp
if (iter_types[i] == nullptr) {
  iter_types[i] = dom[i]->getIterType();
} else {
  TORCH_INTERNAL_ASSERT(iter_types[i] == dom[i]->getIterType());
}
```
I left a note, but we can't do this now because of gather. It seems to me gather can be promoted to IterType::Serial pretty aggressively, but one thing we should do is: if people gather with a window size of 1, shouldn't we just make that an explicit broadcast? Is this what convertToBroadcast was doing? If so, that should just be done in the gather definition.
I see. Propagation of IterDomain types seems to need to be done more carefully.
I think we can convert size-one gather to just a broadcast.
```cpp
IrBuilder::create<TensorDomain>(
    out_domain, std::vector<bool>(out_domain.size(), true)),
inp->getDataType().value());
IrBuilder::create<UnaryOp>(UnaryOpType::Set, out_tensor, inp);
```
Initially, I thought this would be a binary op, but there's no actual data dependency on `other`, so unary seems to make more sense.
Well, but what if a tensor were only used as an `other` argument? We would still need that tensor, but such tensors would not be included in, e.g., `getAllValsBetween`. I wonder if that could cause any problem in lowering.
Will make sure it works with real segmentation, but yes, it has been working: all we need is to be able to infer the size of the `other` tensor at runtime. I have an explicit test which works fine since the tensor is marked as an input. I will try to check in a case where there is actual segmentation and the second kernel is the one with the expand.
naoyam left a comment
LGTM.
As a follow-up, we should use the expanded-as size information to make broadcast concretization mappings more precise and less conservative.
Add expand and expandAs support in arith and fusion executor. (#1739)

* Make expandOp Expr.
Linking issue #1739
Add support for `expand` and `expandAs`. This currently only adds expand on output tensors. Support covers fully dynamic shapes (see the test case for multiple uses). The interface is a bit clunky, but works fine; it's hard to simplify since we may want to take in symbolic integers that could even be passed in directly as an input (shown in the test case).
I did not do the Python interface; I was going to leave that to @kevinstephano. Kevin, I would recommend just looking through the test and arith.cpp/h.
Also:

- Added `isConstInt` and `evaluateInt` on `Val`, which allow us to easily infer a compile-time constant value.
- Added `expandedExtent` through `IterDomain` to indicate, on a broadcast dimension, what the expanded size should be (only used if it's allocated as an output of a fusion).