[Fixbug] Binary arithmetic ops raise error when one is scalar on GPU #109

Merged
yaoyaoding merged 1 commit into hidet-org:main from yaoyaoding:fix-arth on Feb 17, 2023

Conversation

@yaoyaoding
Member

Fix #95.

@yaoyaoding yaoyaoding merged commit 7820672 into hidet-org:main Feb 17, 2023
@yaoyaoding yaoyaoding deleted the fix-arth branch February 17, 2023 02:40
vadiklyutiy pushed a commit that referenced this pull request Jul 22, 2024
- Add tile-level operations such as `copy`, `mask`, `partition_src`, and
`partition_dst`.
- Add a pass to lower the tile-level operations to Hidet IR.
- Enhance the infrastructure to facilitate the lowering.

`copy`: Copy a tensor to another tensor.
`mask`: Create a mask tensor for the copy operation. Typically, this
operation is used when the tile size does not evenly divide the matrix shape.
`partition_src`, `partition_dst`: These two operations partition a
tensor held by the entire thread block into subtensors, each held by a single
thread. These operations allow us to move expressions related to
`threadIdx` and `blockIdx` outside the loop.
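
To illustrate why a `mask` is needed when the tile size does not divide the matrix shape, here is a minimal pure-Python sketch of a masked tile copy. It is not Hidet code: the helper names `make_mask`, `masked_copy`, and `tiled_copy` are hypothetical, and nested lists stand in for tensors.

```python
def make_mask(row0, col0, tile_rows, tile_cols, n_rows, n_cols):
    """Build a tile-shaped boolean mask (hypothetical helper, not a Hidet API).

    An entry is True when the corresponding tile element lies inside
    the n_rows x n_cols matrix, and False when it falls past an edge.
    """
    return [
        [(row0 + i) < n_rows and (col0 + j) < n_cols for j in range(tile_cols)]
        for i in range(tile_rows)
    ]


def masked_copy(src, dst, row0, col0, mask):
    """Copy one tile from src to dst, skipping masked-off elements."""
    for i, mask_row in enumerate(mask):
        for j, in_bounds in enumerate(mask_row):
            if in_bounds:
                dst[row0 + i][col0 + j] = src[row0 + i][col0 + j]


def tiled_copy(src, tile_rows, tile_cols):
    """Copy a matrix tile by tile; edge tiles are guarded by the mask."""
    n_rows, n_cols = len(src), len(src[0])
    dst = [[None] * n_cols for _ in range(n_rows)]
    for row0 in range(0, n_rows, tile_rows):
        for col0 in range(0, n_cols, tile_cols):
            mask = make_mask(row0, col0, tile_rows, tile_cols, n_rows, n_cols)
            masked_copy(src, dst, row0, col0, mask)
    return dst


# A 5 x 7 matrix copied with 4 x 4 tiles: the tiles on the right and
# bottom edges extend past the matrix, so the mask disables those elements.
matrix = [[r * 7 + c for c in range(7)] for r in range(5)]
copied = tiled_copy(matrix, 4, 4)
```

Without the mask, the edge tiles in this example would index past row 4 or column 6.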

---------

Co-authored-by: Xiao Zhang <xiao.zhang@centml.ai>
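
The idea behind `partition_src` and `partition_dst` in the commit message above can be sketched in plain Python. This is only an illustration under stated assumptions: the strided ownership layout and the names `partition` and `copy_partitioned` are hypothetical, and the outer loop over `thread_id` stands in for `threadIdx`, which on a GPU would be a per-thread constant rather than a loop variable.

```python
def partition(block_tile, thread_id, num_threads):
    """Return the flat indices of block_tile owned by one thread.

    Assumes a simple strided layout (thread t owns t, t + num_threads, ...).
    Hypothetical helper, not a Hidet API.
    """
    return list(range(thread_id, len(block_tile), num_threads))


def copy_partitioned(src, num_threads):
    """Copy a block-level tile by letting each simulated thread copy its part."""
    dst = [None] * len(src)
    for thread_id in range(num_threads):  # stands in for threadIdx
        # The thread-dependent indexing is computed once, outside the copy loop.
        owned = partition(src, thread_id, num_threads)
        for idx in owned:
            # The loop body contains no thread-index arithmetic.
            dst[idx] = src[idx]
    return dst


tile = list(range(10))
result = copy_partitioned(tile, num_threads=4)
```

Because each thread's subtensor is resolved before the loop, the inner loop only walks precomputed indices, which mirrors the stated goal of moving `threadIdx`-related expressions outside the loop.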
vadiklyutiy pushed a commit that referenced this pull request Jul 23, 2024
vadiklyutiy pushed a commit that referenced this pull request Dec 26, 2024
Development

Successfully merging this pull request may close these issues.

[Bug] binary arithmetic with CUDA scalar
