Full indexing rework#260
Conversation
…issue mentioned in comments of: getBCastMergedIndices in index_compute.cpp
… into contiguity_cherry_pick
… into contiguity_cherry_pick
…ons we will use for index_compute.
…nst/non-const tensor dims.
… into contiguity_v3
|
I'm merging #277. which temporarily hides all the problem this PR is supposed to fix. We need to revert/undo the changes when testing the correctness of this PR. Sorry for the inconvenience. |
… into contiguity_merged
|
|
||
| at::Tensor cg_output = at::empty({x, y, z}, options); | ||
| void testGPU_FusionComplexBCast() { | ||
| { |
There was a problem hiding this comment.
can we split this into separate test functions?
There was a problem hiding this comment.
Sure, would you please push a commit for it to this pr?
| // | ||
| // The reason we need both TensorView and TensorDomain is that we need to have a | ||
| // record of both what is being computed and how it is being computed. For | ||
| // Example we may have the operation: TV3[I, J, K] = TV2[I, J, K] + TV1[I, J, K] |
| const std::vector<IterDomain*>& getRFactorDomain() const { | ||
| return rfactor_domain_; | ||
| }; | ||
| // If rfactor domain exists in domain() return it, otherwise return root |
|
|
||
| void IRPrinter::handle(const kir::IterDomain*) { | ||
| TORCH_INTERNAL_ASSERT(false, "Unreachable"); | ||
| void IRPrinter::handle(const kir::IterDomain* id) { |
There was a problem hiding this comment.
where is this needed? I hit this a few times while working on the refactoring, but every time it indicated that something else needed to be updated
There was a problem hiding this comment.
Debugging matching between the IterDomain of kir and fusion was pretty challenging when I couldn't print what they were. No remaining code relies on it, but I rely on it for debugging. I had to update Type.cpp
There was a problem hiding this comment.
I think we should now be able to fill the rest out marked as Unreachable
| } | ||
|
|
||
| std::vector<Expr*> UnsortedExprs::getFrom(std::vector<Val*> outputs) { | ||
| if (outputs.empty()) |
| // the life of this context guard. | ||
| class TVDomainGuard { | ||
| public: | ||
| TensorView* tv_; |
There was a problem hiding this comment.
- if public, no need for _ suffix
- = nullptr (same for prev_domain)
There was a problem hiding this comment.
Should be private, thanks.
| return {}; | ||
| } | ||
|
|
||
| std::vector<kir::Bool*> preds(root.size(), new kir::Bool(true)); |
There was a problem hiding this comment.
I just wanted to pre-initialize before the loop below.
| const std::vector<kir::ForLoop*>& loops, | ||
| kir::Bool* thread_pred) { | ||
| if (loops.empty()) | ||
| return new kir::Bool(true); |
|
|
||
| void openLoop(kir::ForLoop*); | ||
|
|
||
| std::unordered_map<IterDomain*, kir::Bool*> predicates; |
There was a problem hiding this comment.
private: to separate private methods from private state
There was a problem hiding this comment.
They're ordered, that may be the best you get from me.
| return "Scalar"; | ||
| case ValType::NamedScalar: | ||
| return "NamedScalar"; | ||
| case ValType::KirIterDomain: |
There was a problem hiding this comment.
Yes, for ir_iostream, an IR that isn't printable isn't debug-able.
| // Simply grabs all exprs needed to produce provided outputs, only in dependency | ||
| // order, not computeAtOrder like ExprSort in fusion.h | ||
| class UnsortedExprs : public IterVisitor { |
There was a problem hiding this comment.
ExprSort doesn't do anything for computeAt anymore as the logic is moved to lower_loops.cpp. Perhaps, we don't need this class and can just use ExprSort.
| static std::vector<Expr*> runPass( | ||
| Fusion* fusion, | ||
| const std::vector<Expr*>& exprs, | ||
| const std::unordered_set<Expr*>& init_exprs, |
There was a problem hiding this comment.
We can re-enable that fix without carrying init_exprs. We can detect initialization expressions by the consumer pattern and the current loop nest.
There was a problem hiding this comment.
Could you open a new issue and I can do a follow up PR? I believe we still have correctness now.
There was a problem hiding this comment.
You're right. The previous failing test still works fine, but I'm not sure whether it's intended behavior. I re-opened the issue (#64).
naoyam
left a comment
There was a problem hiding this comment.
As @csarofeen mentioned, predicate generation for initializing reduction buffers seems broken again. The original fix consists of #246 and #255.
|
I did a quick review of all changes and left two comments. One is regarding the predicate generation for reduction buffers. Another is the newly added class, |
Doing a lot of indexing rework.
Right now we'd say this is contiguous, but linear indexing on this is not the same as:
The latter should be contiguous merges, the former not.