Closed
Description
Problem Description
Running the kernel on inputs that live on a GPU with a non-zero ID (e.g. 1, 2, 3, 4, 5, 6, or 7) throws the following error:
Memory access fault by GPU node-2 (Agent handle: 0x9b15d70) on address 0x7ee42d200000. Reason: Unknown.
tensor(False, device='cuda:1')
GPU core dump created: gpucore.10171
Aborted
root@tw024:/app# python ex.py
Memory access fault by GPU node-2 (Agent handle: 0xa5f71a0) on address 0x7f532b800000. Reason: Unknown.
GPU core dump created: gpucore.10255
Aborted
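One way to narrow this down is to check whether the fault follows the physical GPU or the device index. The sketch below is only a diagnostic idea, not something from the original report: it runs a script in a child process with `HIP_VISIBLE_DEVICES` set so that a single physical GPU is exposed to it. The helper names `env_with_single_gpu` and `run_script_on_gpu` are hypothetical.

```python
import os
import subprocess
import sys


def env_with_single_gpu(physical_index, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU to HIP."""
    env = dict(os.environ if base_env is None else base_env)
    # HIP_VISIBLE_DEVICES restricts which GPUs the HIP runtime enumerates.
    env["HIP_VISIBLE_DEVICES"] = str(physical_index)
    return env


def run_script_on_gpu(script, physical_index):
    """Run `script` in a child process that sees one GPU and return its exit code."""
    result = subprocess.run(
        [sys.executable, script],
        env=env_with_single_gpu(physical_index),
    )
    # On POSIX, a negative code means the child was killed by a signal (e.g. an abort).
    return result.returncode
```

For example, `run_script_on_gpu("ex.py", 1)` would expose physical GPU 1 as the only device. In that run the single visible GPU is addressed as `cuda:0`, so the script would need `device="cuda:0"` instead of `device="cuda:1"`. If the same GPU works when addressed as `cuda:0`, that would point toward device-index handling rather than the hardware.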
Operating System
Ubuntu 22.04.4 LTS (Jammy Jellyfish)
CPU
AMD EPYC 9654 96-Core Processor
GPU
AMD Instinct MI300X
ROCm Version
ROCm 6.3.1
ROCm Component
composable_kernel
Steps to Reproduce
- Install aiter from the main branch.
- Run the following script:
from aiter.ops.gemm_op_a8w8 import gemm_a8w8_CK
import torch

SIZE_LIST = [
    (3840, 16384, 16384),
    (56, 8192, 7392)
]

def main():
    for size in SIZE_LIST:
        M, N, K = size
        A = torch.rand(size=(M, K), device="cuda:1").to(torch.int8)
        B = torch.rand(size=(K, N), device="cuda:1").to(torch.int8)
        scale_a = torch.ones((M, 1), device="cuda:1").to(torch.int32)
        scale_b = torch.ones((N, 1), device="cuda:1").to(torch.int32)
        result = gemm_a8w8_CK(A, B.t(), scale_a, scale_b, dtype=torch.bfloat16)

if __name__ == "__main__":
    main()
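A possible workaround to try, under the unconfirmed assumption that the kernel is launched on the current device (GPU 0 by default) rather than on the device holding the inputs, is to make the input's device current around the call. The helper `call_on_device_of` below is hypothetical and not part of aiter; it relies only on `torch.cuda.device`, PyTorch's context manager for switching the current device. The `device_guard` parameter exists solely so the helper can be exercised without a GPU.

```python
def call_on_device_of(tensor, fn, *args, device_guard=None, **kwargs):
    """Run fn(*args, **kwargs) with tensor.device selected as the current GPU."""
    if device_guard is None:
        # Imported lazily so the helper can be loaded on machines without PyTorch.
        import torch
        device_guard = torch.cuda.device
    # The guard switches the current device on entry and restores it on exit.
    with device_guard(tensor.device):
        return fn(*args, **kwargs)
```

In the script above, the call would become `result = call_on_device_of(A, gemm_a8w8_CK, A, B.t(), scale_a, scale_b, dtype=torch.bfloat16)`. If the fault disappears with the guard in place, that would suggest the kernel wrapper does not select the device of its inputs before launching.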
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response