[POC] Dynamic expand with SymInt implementation #3558
Conversation
expand with SymInt implementation
Does this PR build locally on your end? The CI build failed with conflicts.
for (int i = 0; i < sizes.size(); i++) {
  auto _symbolicIntNode = sizes[i].toSymbolicIntNode();
  auto _sizenode = MakeNode<SizeNode>(_symbolicIntNode);
  upper_bound.push_back(_sizenode.getStaticValue());
  dynamic_dims.push_back(_sizenode.isDynamic());
}
we can probably put these in a helper function for now. I felt like we should be able to codegen these in the future too.
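For illustration only, such a helper could look roughly like the sketch below. The name and signature are hypothetical (not from this PR); the body just mirrors the loop in the diff above.

// Hypothetical helper for illustration: extracts the static upper bound and
// the dynamic flag for each SymInt size, mirroring the loop above.
void GetSymIntUpperBoundsAndDynamicDims(c10::SymIntArrayRef sizes,
                                        std::vector<int64_t>* upper_bounds,
                                        std::vector<bool>* dynamic_dims) {
  for (size_t i = 0; i < sizes.size(); i++) {
    auto symbolic_int_node = sizes[i].toSymbolicIntNode();
    auto size_node = MakeNode<SizeNode>(symbolic_int_node);
    auto* dim_node = dynamic_cast<const DimensionNode*>(size_node.get());
    upper_bounds->push_back(dim_node->getStaticValue());
    dynamic_dims->push_back(dim_node->isDynamic());
  }
}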
/* Construct Upper Bound Tensor Shape */
xla::XlaOp upper_bound_size_input =
    xla::Parameter(loctx->builder(), 0, target_shape, "upper_bound_size");
Hmm, it is rare to use xla::Parameter in the lowering; you can use xla::Zero and xla::Broadcast to achieve the same result.
Why would I use two API calls to reach the same goal instead of one?
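For reference, the two-call version the reviewer is suggesting could look roughly like this. This is a minimal sketch reusing loctx and target_shape from the diff above, not the actual lowering in this PR.

// Sketch: build a zero-valued tensor with the target upper-bound dimensions
// instead of introducing an extra computation parameter.
xla::XlaOp zero = xla::Zero(loctx->builder(), target_shape.element_type());
xla::XlaOp upper_bound_size_input =
    xla::Broadcast(zero, target_shape.dimensions());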
This PR doesn't build at the moment because the upstream LTC layer doesn't yet have API support for expand with SymInt.
Update: The current unit test checks the
std::vector<torch::lazy::NodePtr> size_nodes;
std::vector<int64_t> upper_bounds;
std::vector<bool> dynamic_dims;
/* TODO: move this code to a helper function */
Considering moving this code to a helper function. My candidates are a new helper file, helpers.h, or tensor_util.h. @JackCaoG wdyt?
You could also consider putting a helper in dynamic_ir.h, and I can make a copy for the TS backend. Code duplication is the disadvantage there; hopefully we won't need to do it often.
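To make the dynamic_ir.h suggestion concrete, the declaration could be as small as the sketch below; the name and signature are hypothetical, covering the three vectors declared in the diff above.

// Hypothetical declaration for dynamic_ir.h (not part of this PR): fills the
// size nodes, upper bounds, and dynamic flags for a list of SymInt sizes.
void GetSymIntElements(c10::SymIntArrayRef sizes,
                       std::vector<torch::lazy::NodePtr>* size_nodes,
                       std::vector<int64_t>* upper_bounds,
                       std::vector<bool>* dynamic_dims);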
absl::Span<const xla::XlaOp> output_sizes) {
  for (int i = 0; i < output_sizes.size(); i++) {
    xla::Shape dim_shape = XlaHelpers::ShapeOfXlaOp(output_sizes[i]);
    if (dim_shape.is_dynamic()) {
Yes, the dimensions come from the shape set by InferShape. This means that when a dimension is static, the upper_bound values are expected to be equal to the "true" dimensions. Does this align with your understanding, @Krovatkin?
Can you elaborate on "we would like shapes to appear symbolically in a graph, and I'm not sure this would happen if dimensions are static"? @Krovatkin
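For discussion purposes, one common way to represent a dynamic output dimension in XLA is sketched below. This is an assumption, not this PR's exact code: it presumes a `result` op with the upper-bound shape and that each dynamic output_sizes[i] is a scalar S32 op holding the runtime size.

// Sketch: annotate dynamic output dimensions with the runtime size, while
// static dimensions keep the upper-bound value from the inferred shape.
for (int i = 0; i < output_sizes.size(); ++i) {
  xla::Shape dim_shape = XlaHelpers::ShapeOfXlaOp(output_sizes[i]);
  if (dim_shape.is_dynamic()) {
    result = xla::SetDimensionSize(result, output_sizes[i], i);
  }
}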
- erfinv
- exp
- expand
- expand.SymInt
I'm actually not seeing the exact place where you're hooking XLATensor::expand into the dispatcher. Is the plan to use codegen for that?
Our current codegen creates XLANativeFunctions.h for aten_xla_type.cpp. In it, you find the following declarations. Does that answer your question, @Krovatkin?
static at::Tensor expand(const at::Tensor & self, at::IntArrayRef size, bool implicit);
static at::Tensor expand(const at::Tensor & self, c10::SymIntArrayRef size, bool implicit);
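For context, these generated declarations are typically backed by hand-written definitions in aten_xla_type.cpp. A rough sketch of what the SymInt overload could look like is below; the XLATensor::expand_symint name is an assumption used for illustration, not necessarily what this PR does.

// Sketch of a hand-written SymInt overload in aten_xla_type.cpp.
// XLATensor::expand_symint is a hypothetical tensor method name.
at::Tensor XLANativeFunctions::expand(const at::Tensor& self,
                                      c10::SymIntArrayRef size, bool implicit) {
  return bridge::AtenFromXlaTensor(
      XLATensor::expand_symint(bridge::GetXlaTensor(self), size));
}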
-DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") \
-DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR') + '/' + sysconfig.get_config_var('LDLIBRARY'))")
-DPYTHON_INCLUDE_DIR=$(python3 -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") \
-DPYTHON_LIBRARY=$(python3 -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR') + '/' + sysconfig.get_config_var('LDLIBRARY'))")
There was an issue with forcing python to python3; @yeounoh might have more context. If this is for testing purposes, you can use
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.8 100
on your TPU VM.
std::string SizeNode::ToString() const { return "SizeNode"; }
SizeAdd::SizeAdd(XlaValue a, XlaValue b)
Should we check that XlaValue actually contains a DimensionNode and store them as DimensionNode directly? This could save us from doing multiple dynamic_casts in getStaticValue.
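One way to do this is to cast and validate once at construction time and cache the pointers. The sketch below is illustrative only: it assumes XlaValue exposes the underlying node as `node`, and dim_a_/dim_b_ are made-up member names, not from this PR.

// Sketch: validate and cache the operands as DimensionNode once, so
// getStaticValue() doesn't need to dynamic_cast on every call.
SizeAdd::SizeAdd(XlaValue a, XlaValue b)
    : DimensionNode(
          torch::lazy::OpKind{c10::Symbol::fromQualString("aten::add")}, {a, b},
          torch::lazy::MHash(1)) {
  dim_a_ = dynamic_cast<const DimensionNode*>(a.node.get());
  dim_b_ = dynamic_cast<const DimensionNode*>(b.node.get());
  XLA_CHECK(dim_a_ != nullptr && dim_b_ != nullptr)
      << "SizeAdd operands must be DimensionNodes";
}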
  SizeAdd(XlaValue a, XlaValue b);
  int64_t getStaticValue() const override;
  std::string ToString() const override;
};
Why don't these operators have Lower?
Not sure if we need to lower these ops, since they return upper-bound information, unless we directly pass SizeAdd to downstream ops like expand. @Krovatkin wdyt?
To help the discussion I will push a lowering implementation shortly. Let's continue discussing.
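For the sake of discussion, a lowering for SizeAdd could look roughly like the sketch below. This assumes the operands already lower to scalar size values and follows the GetOutputOp/ReturnOp pattern used by the other ops; it is not the implementation mentioned above.

// Sketch only: add the already-lowered scalar size operands.
XlaOpVector SizeAdd::Lower(LoweringContext* loctx) const {
  xla::XlaOp lhs = loctx->GetOutputOp(operand(0));
  xla::XlaOp rhs = loctx->GetOutputOp(operand(1));
  return ReturnOp(xla::Add(lhs, rhs), loctx);
}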
SizeDiv::SizeDiv(XlaValue a, XlaValue b)
    : DimensionNode(
          torch::lazy::OpKind{c10::Symbol::fromQualString("aten::div")}, {a, b},
          torch::lazy::MHash(1)){};
I take it this hash is a hack for now.
c10::SymInt xla_y0_size = xla_y.sym_sizes()[0];
torch::Tensor xla_a = CopyToDevice(a, device);
torch::Tensor xla_b = xla_a.expand(
    c10::SymIntArrayRef({xla_y0_size, c10::SymInt(3), c10::SymInt(4)}),
@Krovatkin I hope constructing c10::SymIntArrayRef makes sense (ref). Let me know if it should be done differently.
It looks good. I think plain xla_a.expand_symint({xla_y0_size, c10::SymInt(3), c10::SymInt(4)}, ...) should've worked. If not, we should look into that.
@Krovatkin as discussed this morning, looks like this line causes the failure raised here. What's your guidance?
…temp measure for POC
This is the POC implementation of the torch.Tensor.expand op based on the PyTorch SymInt POC implementation PR.

Action items to unblock:
- expand with SymInt input parameter as a POC. This won't compile until expand with SymInt is available in PyTorch
- expand.SymInt op (with c10::SymIntArrayRef input signature) is ready
- dynamic_ir as a subclass of XLANode
- SizeNode lowering using the xla::GetDimensionSize API
- expand.SymInt in PyTorch
- expand.SymInt in PyTorch
- expand in LazyTensor shape inference pytorch#77830
- DimensionNode::isDynamic() after PyTorch API support becomes available
- DimensionNode::isDynamic() API via lazy::Shape::is_symbolic() pytorch#77909
- SymInt-related helper functions to improve reuse and modularity
- DimensionNode class to LTC core and enable multiple inheritance for Size nodes
- expand.SymInt in PyTorch/XLA gtest needs to be updated #3589
- torch.nonzero in PyTorch
- torch.nonzero in PyTorch/XLA
- is_symbolic API issue
- TestExpandSymInt fails due to incorrect dynamism report by isDynamic() #3680