Conversation
added 2 commits
May 25, 2022 00:05
Collaborator
Author
The performance of the added C++ test on Titan RTX: The thread mapping follows the outer-reduction scheduling scheme. When the horizontal grouping is not used, the total time of the two grid reductions was about 23 us, so the grouping has some non-negligible performance impact in this persistent case.
csarofeen
approved these changes
May 25, 2022
Owner
csarofeen
left a comment
Didn't review the test but the fixes make sense to me so stamping.
jjsjann123
added a commit
that referenced
this pull request
Jun 22, 2022
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Bug fixes and minor refactor

Squashed commits to WAR github API
Commits that's actually in this PR from the devel branch:

```
4c60e7d Add examples infrastructure for using nvFuser in a standalone program (#1725)
02a05d9 Fix issue #1751 (#1753)
8a69aa3 Refactor NvFuser transpose API to match eager mode behavior (#1746)
ffdf6b7 Remove BroadcastWithoutStride. (#1738)
02bab16 Fix flipping of a boolean flag (#1745)
465d668 cleanup (#1744)
26d354e fixing noncontig broadcast (#1742)
856b6b2 Add IterDomainBuilder (#1736)
1fd974f fixing warning for gcc7 (#1732)
de2740a disabling complex in python tests for #1730 (#1733)
fbbbe0a fixing MSVC build (#1728)
b5feee5 Fix the fused reduction runtime kernel (#1729)
5247682 Re-entrant GroupedGridReduction (#1727)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: pytorch#79147
Approved by: https://github.com/davidberard98
jjsjann123
added a commit
that referenced
this pull request
Jun 22, 2022
…h#79406)

Landing reverted PR pytorch#79147.

Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Bug fixes and minor refactor

Squashed commits to WAR github API
Commits that's actually in this PR from the devel branch:

```
4c60e7d Add examples infrastructure for using nvFuser in a standalone program (#1725)
02a05d9 Fix issue #1751 (#1753)
8a69aa3 Refactor NvFuser transpose API to match eager mode behavior (#1746)
ffdf6b7 Remove BroadcastWithoutStride. (#1738)
02bab16 Fix flipping of a boolean flag (#1745)
465d668 cleanup (#1744)
26d354e fixing noncontig broadcast (#1742)
856b6b2 Add IterDomainBuilder (#1736)
1fd974f fixing warning for gcc7 (#1732)
de2740a disabling complex in python tests for #1730 (#1733)
fbbbe0a fixing MSVC build (#1728)
b5feee5 Fix the fused reduction runtime kernel (#1729)
5247682 Re-entrant GroupedGridReduction (#1727)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: pytorch#79406
Approved by: https://github.com/davidberard98
No description provided.