[Fixbug] Reduce should perform syncthread after initializing shared memory to zero #325
Merged
yaoyaoding merged 1 commit into hidet-org:main on Jul 22, 2023
Thanks @xinli-git!
vadiklyutiy pushed a commit that referenced this pull request on Dec 19, 2024
…le compiling the model `sam` (#444)

Closes #325

The error in the linked issue was caused by [this code segment](https://github.com/CentML/hidet/blob/bfbb4db6d7792ed3de3be4e9702e597b8fbbe373/python/hidet/graph/transforms/conv_channel_last.py#L46-L75) in `graph/transforms/conv_channel_last.py`. By the logic of this code segment, if the operator `node` has two inputs, the first with rank 4 and the second with rank 3 (an example from the model: an `AddOp` whose first input has shape `[1, 256, 64, 64]` and whose second has shape `[256, 1, 1]`), then by the time the code reaches line 75 the variable `new_perm` has the value `[1, 2, 0]`. This value is recorded as the permutation used to produce the new output, which is incorrect: the appropriate value here is `[0, 2, 3, 1]`.
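The distinction between the per-input permutation and the output permutation can be sketched in plain Python. This is only an illustration and not the code in `conv_channel_last.py`: the helper names `perm_for_rank` and `permute_shape` are hypothetical, and the sketch assumes the lower-rank input is broadcast by right-aligning its dimensions and that the full permutation leaves the leading (batch) dimension in place.

```python
def perm_for_rank(perm, rank):
    """Restrict a full-rank permutation to a lower-rank, right-aligned operand.

    Hypothetical helper. Only valid when the dimensions missing from the
    lower-rank operand stay in leading position under `perm`.
    """
    offset = len(perm) - rank
    return [p - offset for p in perm if p >= offset]


def permute_shape(shape, perm):
    """Return the shape obtained by applying `perm` to `shape`."""
    return [shape[p] for p in perm]


full_perm = [0, 2, 3, 1]  # channel-last permutation for rank-4 tensors
input_shapes = [[1, 256, 64, 64], [256, 1, 1]]  # shapes from the AddOp example

# Keep the output permutation in its own variable so that the per-input
# permutation computed inside the loop can never overwrite it.
output_perm = list(full_perm)

new_shapes = []
for shape in input_shapes:
    input_perm = perm_for_rank(full_perm, len(shape))
    new_shapes.append(permute_shape(shape, input_perm))
```

For the rank-3 input the restricted permutation is `[1, 2, 0]`, the same value the commit message reports for `new_perm`, while `output_perm` stays `[0, 2, 3, 1]`.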
vadiklyutiy pushed a commit that referenced this pull request on Dec 20, 2024
vadiklyutiy pushed a commit that referenced this pull request on Dec 26, 2024
This bug slipped through CI because the test shapes were all too small. Reduce initializes shared memory to zero, but that initialization was not followed by a syncthread, so other threads could access the shared memory before it had been zeroed.
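The ordering requirement can be illustrated with a host-side analogy in Python. This is not the hidet kernel: `block_reduce` is a hypothetical function, `threading.Barrier` stands in for syncthread, and a lock stands in for an atomic add. One thread zeroes the shared accumulator, and every thread waits at the barrier before touching it.

```python
import threading


def block_reduce(values, num_threads=8):
    """Sum `values` with `num_threads` workers sharing one accumulator."""
    smem = [None]  # "shared memory", uninitialized at kernel start
    barrier = threading.Barrier(num_threads)  # stands in for syncthread
    lock = threading.Lock()  # stands in for an atomic add

    def worker(tid):
        # One thread initializes the shared accumulator to zero.
        if tid == 0:
            smem[0] = 0
        # Without this barrier, another thread could accumulate into
        # smem[0] before it has been zeroed.
        barrier.wait()

        partial = sum(values[tid::num_threads])
        with lock:
            smem[0] += partial

        # Second barrier: all contributions are in before the result is read.
        barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return smem[0]
```

Removing the first `barrier.wait()` reproduces the hazard described above: a thread other than thread 0 may reach the accumulation while `smem[0]` is still uninitialized, or have its contribution wiped out by a late zeroing.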