Going to merge in #1739 before merging in this. Leaving separate for ease of review.
naoyam left a comment:
LGTM.
Does this mean there is no longer automatic conversion of size-one IterDomains to broadcast domains?
- if (root_dom[i]->isReduction() ||
-     root_dom[i]->getIterType() == IterType::BroadcastWithoutStride ||
-     root_dom[i]->isStride()) {
+ if (root_dom[i]->isReduction() || root_dom[i]->isStride()) {
At line 1316, the stride of a broadcast domain is set as oneVal, which seems inconsistent compared to here.
Yeah, code is inconsistent, but the only thing that actually matters is the stride_i++ (if broadcasted dim) in either of these. Will try to make them consistent.
Since I effectively removed BroadcastWithoutStride, I think the change on 1316 is the inconsistent one.
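For readers following the stride discussion, here is a minimal sketch of why the `stride_i++` is the part that matters. It is a toy model, not the nvFuser code: `ToyIterType` and `assignStrideSlots` are hypothetical names, and the only behavior assumed is what the diff and the comments above state, namely that reduction and stride domains are skipped while every other domain, broadcast included, advances the stride counter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for nvFuser's IterType; only the cases discussed here.
enum class ToyIterType { Iteration, Broadcast, Reduction, Stride };

// For each root domain, return the index of the tensor stride entry it would
// read, or -1 if the domain is skipped. The conditional has the same shape as
// the one in the diff: reduction and stride domains are skipped, and every
// other domain (broadcast included) consumes one stride slot.
std::vector<int> assignStrideSlots(const std::vector<ToyIterType>& root_dom) {
  std::vector<int> slots(root_dom.size(), -1);
  int stride_i = 0;
  for (std::size_t i = 0; i < root_dom.size(); ++i) {
    if (root_dom[i] == ToyIterType::Reduction ||
        root_dom[i] == ToyIterType::Stride) {
      continue;  // no stride entry is consumed for these domains
    }
    slots[i] = stride_i++;  // broadcast domains advance the counter too
  }
  return slots;
}
```

With root domains `{Iteration, Broadcast, Reduction, Iteration}` this yields slots `{0, 1, -1, 2}`. If the broadcast domain were skipped as well, the last domain would read slot 1 instead of slot 2, which is the kind of mismatch the two code paths need to agree on.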
Why is there an isStride in this conditional but not the one in the global producer? Does it not follow the same pattern as isReduction in indexing?
Stride IDs are removed in rfactor domains, so no stride IDs should appear in producer rfactor domains. Having the same conditional in the producer indexing cases should not break, but it's just not necessary.
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Bug fixes and minor refactor

Squashed commits to WAR github API

Commits that are actually in this PR from the devel branch:
```
4c60e7d Add examples infrastructure for using nvFuser in a standalone program (#1725)
02a05d9 Fix issue #1751 (#1753)
8a69aa3 Refactor NvFuser transpose API to match eager mode behavior (#1746)
ffdf6b7 Remove BroadcastWithoutStride. (#1738)
02bab16 Fix flipping of a boolean flag (#1745)
465d668 cleanup (#1744)
26d354e fixing noncontig broadcast (#1742)
856b6b2 Add IterDomainBuilder (#1736)
1fd974f fixing warning for gcc7 (#1732)
de2740a disabling complex in python tests for #1730 (#1733)
fbbbe0a fixing MSVC build (#1728)
b5feee5 Fix the fused reduction runtime kernel (#1729)
5247682 Re-entrant GroupedGridReduction (#1727)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: pytorch#79147
Approved by: https://github.com/davidberard98
…h#79406)

Landing reverted PR pytorch#79147.

Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Bug fixes and minor refactor

Squashed commits to WAR github API

Commits that are actually in this PR from the devel branch:
```
4c60e7d Add examples infrastructure for using nvFuser in a standalone program (#1725)
02a05d9 Fix issue #1751 (#1753)
8a69aa3 Refactor NvFuser transpose API to match eager mode behavior (#1746)
ffdf6b7 Remove BroadcastWithoutStride. (#1738)
02bab16 Fix flipping of a boolean flag (#1745)
465d668 cleanup (#1744)
26d354e fixing noncontig broadcast (#1742)
856b6b2 Add IterDomainBuilder (#1736)
1fd974f fixing warning for gcc7 (#1732)
de2740a disabling complex in python tests for #1730 (#1733)
fbbbe0a fixing MSVC build (#1728)
b5feee5 Fix the fused reduction runtime kernel (#1729)
5247682 Re-entrant GroupedGridReduction (#1727)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: pytorch#79406
Approved by: https://github.com/davidberard98
Just some cleanup: when I started looking at expand, I couldn't figure out why we have BroadcastWithoutStride. I don't see any real disadvantages with having the extra dimension allocated. Maybe it could impact some contiguous indexing on intermediate global buffers, but that seems quite minor for this level of technical debt.
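As a rough illustration of why an allocated broadcast dimension is cheap, the sketch below compares linear offsets for a contiguous buffer with and without an extra size-one dimension. It assumes the "extra dimension" is a size-one broadcast axis in a row-major layout; `contiguousStrides` and `linearOffset` are hypothetical helpers written for this note, not nvFuser functions, and the sketch says nothing about the contiguity analysis mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Row-major (contiguous) strides for a shape, in elements.
std::vector<std::size_t> contiguousStrides(
    const std::vector<std::size_t>& shape) {
  std::vector<std::size_t> strides(shape.size(), 1);
  // Walk from the innermost dimension outwards.
  for (std::size_t i = shape.size(); i-- > 1;) {
    strides[i - 1] = strides[i] * shape[i];
  }
  return strides;
}

// Linear offset of a multi-dimensional index under the given strides.
std::size_t linearOffset(const std::vector<std::size_t>& index,
                         const std::vector<std::size_t>& strides) {
  std::size_t offset = 0;
  for (std::size_t i = 0; i < index.size(); ++i) {
    offset += index[i] * strides[i];
  }
  return offset;
}
```

For shape `{4, 1, 8}` the strides come out as `{8, 8, 1}`, so index `(2, 0, 5)` maps to offset 21, the same offset that index `(2, 5)` has in shape `{4, 8}`. The size-one axis adds a stride entry but no storage and no change in addressing, since its index is always 0.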