Skip to content

Remove BroadcastWithoutStride.#1738

Merged
csarofeen merged 4 commits intodevelfrom
broadcast_cleanup
Jun 3, 2022
Merged

Remove BroadcastWithoutStride.#1738
csarofeen merged 4 commits intodevelfrom
broadcast_cleanup

Conversation

@csarofeen
Copy link
Copy Markdown
Owner

Just some cleanup as when I started looking at expand I couldn't figure out why we have BroadcastWithoutStride I don't see any real disadvantages with having the extra dimension allocated. Maybe it could impact some contiguous indexing on intermediate global buffers, but that seems quite minor for this level of technical debt.

@csarofeen csarofeen requested a review from naoyam May 30, 2022 00:32
@csarofeen
Copy link
Copy Markdown
Owner Author

Going to merge in #1739 before merging in this. Leaving separate for ease of review.

Copy link
Copy Markdown
Collaborator

@naoyam naoyam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Does this mean there is no longer automatic conversion of size-one IterDomains to broadcast domains?

if (root_dom[i]->isReduction() ||
root_dom[i]->getIterType() == IterType::BroadcastWithoutStride ||
root_dom[i]->isStride()) {
if (root_dom[i]->isReduction() || root_dom[i]->isStride()) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At line 1316, the stride of a broadcast domain is set as oneVal, which seems inconsistent compared to here.

Copy link
Copy Markdown
Owner Author

@csarofeen csarofeen Jun 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, code is inconsistent, but the only thing that actually matters is the stride_i++ (if broadcasted dim) in either of these. Will try to make them consistent.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I effectively removed BoadcastWithoutStride I think the change on 1316 is the inconsistent one.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there an isStride in this conditional but not the one in the global producer? Does it not follow the same pattern as isReduction in indexing?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stride IDs are removed in rfactor domains, so no stride IDs should appear in producer rfactor domains. Having the same conditional in the producer indexing cases should not break, but it's just not necessary.

@csarofeen
Copy link
Copy Markdown
Owner Author

Does this mean there is no longer automatic conversion of size-one IterDomains to broadcast domains? No, as TensorView builder still does this, and you can explicitly send in the IterDomains with broadcast if you directly make them in integration (I'm now using the IterDomainBuilder). I just got rid of that IterDomain conversion in the TensorView constructor as I couldn't find anyplace it was reasonably used, and I didn't like the idea of TensorView modifying already constructed IterDomains. If we'd like we could make the IterDomain constructor to automatically promote it to a broadcast if extent is size 1, but I think just using TensorViewBuilder to take care of size-1 promotion to broadcast is generally better.

@csarofeen csarofeen merged commit ffdf6b7 into devel Jun 3, 2022
jjsjann123 added a commit that referenced this pull request Jun 22, 2022
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Bug fixes and minor refactor

Squashed commits to WAR github API
Commits that's actually in this PR from the devel branch:

```
4c60e7d Add examples infrastructure for using nvFuser in a standalone program (#1725)
02a05d9 Fix issue #1751 (#1753)
8a69aa3 Refactor NvFuser transpose API to match eager mode behavior (#1746)
ffdf6b7 Remove BroadcastWithoutStride. (#1738)
02bab16 Fix flipping of a boolean flag (#1745)
465d668 cleanup (#1744)
26d354e fixing noncontig broadcast (#1742)
856b6b2 Add IterDomainBuilder (#1736)
1fd974f fixing warning for gcc7 (#1732)
de2740a disabling complex in python tests for #1730 (#1733)
fbbbe0a fixing MSVC build (#1728)
b5feee5 Fix the fused reduction runtime kernel (#1729)
5247682 Re-entrant GroupedGridReduction (#1727)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: pytorch#79147
Approved by: https://github.com/davidberard98
jjsjann123 added a commit that referenced this pull request Jun 22, 2022
…h#79406)

Landing reverted PR pytorch#79147.

Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Bug fixes and minor refactor

Squashed commits to WAR github API
Commits that's actually in this PR from the devel branch:

```
4c60e7d Add examples infrastructure for using nvFuser in a standalone program (#1725)
02a05d9 Fix issue #1751 (#1753)
8a69aa3 Refactor NvFuser transpose API to match eager mode behavior (#1746)
ffdf6b7 Remove BroadcastWithoutStride. (#1738)
02bab16 Fix flipping of a boolean flag (#1745)
465d668 cleanup (#1744)
26d354e fixing noncontig broadcast (#1742)
856b6b2 Add IterDomainBuilder (#1736)
1fd974f fixing warning for gcc7 (#1732)
de2740a disabling complex in python tests for #1730 (#1733)
fbbbe0a fixing MSVC build (#1728)
b5feee5 Fix the fused reduction runtime kernel (#1729)
5247682 Re-entrant GroupedGridReduction (#1727)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: pytorch#79406
Approved by: https://github.com/davidberard98
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants