Making torchao ABI stable and moving closer to python only #3516

@jerryzh168

As a follow-up to #1747, here is the plan for making torchao ABI compatible and closer to Python-only. Once this is done, torchao will be compatible with all PyTorch versions and we won't need to worry about #2919.

Please feel free to pick up a task by adding your name to the Status column.

| Status | Assignee | File | Description | Plan |
| --- | --- | --- | --- | --- |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/fp6_linear.cu | FP6 linear layer | delete |
| Deleted #3612 | @howardzhang-cv | torchao/csrc/cuda/marlin_qqq/marlin_qqq_kernel.cu | Marlin QQQ quantization | delete |
| Deleted #3744 | @howardzhang-cv | torchao/csrc/cuda/activation24/sparse_gemm.cu | 2:4 sparse GEMM | delete |
| Deleted #3744 | @howardzhang-cv | torchao/csrc/cuda/activation24/sparsify24.cu | 2:4 sparsification | delete |
| Deleted #3613 | @howardzhang-cv | torchao/csrc/cuda/sparse_marlin/marlin_kernel_nm.cu | N:M sparse Marlin | delete |
| Deleted #3722 | @howardzhang-cv | torchao/csrc/cuda/tensor_core_tiled_layout/tensor_core_tiled_layout.cu | Tensor core tiled layout | delete |
| ABI stable #3610 | @andrewor14 @danielvegamyhre | torchao/csrc/cuda/mx_kernels/mxfp8_cuda.cu | MXFP8 CUDA kernels | Make ABI compatible |
| ABI stable #3610 | @andrewor14 @danielvegamyhre | torchao/csrc/cuda/mx_kernels/mxfp8_extension.cpp | MXFP8 CUDA kernels | Make ABI compatible |
| No need to change | @andrewor14 @danielvegamyhre | torchao/csrc/cuda/mx_kernels/mx_block_rearrange_2d_M_groups.cu | MXFP8 CUDA kernels | Make ABI compatible |
| Deleted #3723 | @jerryzh168 | torchao/csrc/cuda/rowwise_scaled_linear_cutlass/rowwise_scaled_linear_cutlass_s4s4.cu | S4S4 row-wise scaled linear | delete |
| Deleted #3723 | @jerryzh168 | torchao/csrc/cuda/rowwise_scaled_linear_cutlass/rowwise_scaled_linear_cutlass_s8s4.cu | S8S4 row-wise scaled linear | delete |
| ABI stable #3725 | @andrewor14 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_sparse_cutlass_e4m3e4m3.cu | Sparse E4M3xE4M3 | Make ABI compatible, not built by default |
| ABI stable #3725 | @andrewor14 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_sparse_cutlass_e4m3e5m2.cu | Sparse E4M3xE5M2 | Make ABI compatible, not built by default |
| ABI stable #3725 | @andrewor14 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_sparse_cutlass_e5m2e4m3.cu | Sparse E5M2xE4M3 | Make ABI compatible, not built by default |
| ABI stable #3725 | @andrewor14 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_sparse_cutlass_e5m2e5m2.cu | Sparse E5M2xE5M2 | Make ABI compatible, not built by default |
| ABI stable #3725 | @andrewor14 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_sparse_cutlass_f8f8.cu | Sparse FP8xFP8 | Make ABI compatible, not built by default |
| Done #3727 | @jerryzh168 | torchao/csrc/cuda/to_sparse_semi_structured_cutlass_sm9x/to_sparse_semi_structured_cutlass_sm9x_f8.cu | Semi-structured sparse FP8 | Make ABI compatible, not built by default |
| No need to change | | torchao/_models/sam2/csrc/connected_components.cu | Connected components (SAM2) | Move sam2 somewhere else? We can probably delete this. I don't think it should block ABI, and I don't think it is built by default |

| Status | Assignee | File | Description | Plan |
| --- | --- | --- | --- | --- |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/utils_core.cuh | FP6 core utilities | delete |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/kernel_reduction.cuh | FP6 reduction kernel | delete |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/ptx_mma.cuh | FP6 PTX MMA | delete |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/kernel_matmul.cuh | FP6 matmul kernel | delete |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/utils_gmem.cuh | FP6 global memory utils | delete |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/utils_parallel_dequant.cuh | FP6 parallel dequant utils | delete |
| Deleted #3520 | @howardzhang-cv | torchao/csrc/cuda/fp6_llm/ptx_cp.async.cuh | FP6 PTX async copy | delete |
| Deleted #3723 | @jerryzh168 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_cutlass.cuh | CUTLASS header | delete |
| ABI stable #3725 | @andrewor14 | torchao/csrc/cuda/rowwise_scaled_linear_sparse_cutlass/rowwise_scaled_linear_sparse_cutlass.cuh | Sparse CUTLASS header | Make ABI compatible |
| Done #3727 | @jerryzh168 | torchao/csrc/cuda/to_sparse_semi_structured_cutlass_sm9x/to_sparse_semi_structured_cutlass_sm9x.cuh | Semi-structured sparse header | Make ABI compatible |
| No need to change | @andrewor14 | torchao/csrc/cuda/mx_kernels/mxfp8_quantize.cuh | MXFP8 quantize header | Make ABI compatible (has no torch C++ anyway) |
| No need to change | @andrewor14 | torchao/csrc/cuda/mx_kernels/ptx.cuh | MX PTX header | Make ABI compatible |

After the above is done, we can explore making torchao Python-only through:
