[QST] Is cutlass::bfloat16_t x cutlass::int2b_t GEMM possible? #1915

@areddy2022

Description

While looking at example 55 (cutlass/examples/55_hopper_mixed_dtype_gemm/55_hopper_int4_bf16_gemm.cu), I was wondering whether the following modification would be valid:

From:
using MmaType = cutlass::bfloat16_t;
using QuantType = cutlass::int4b_t;

To:
using MmaType = cutlass::bfloat16_t;
using QuantType = cutlass::int2b_t;

According to the example's README.md, "For 8-bit x 4-bit or 2-bit, both inputs must be K-major." However, the comment inside the example source states, "Only supports INT4 x { FP16, BF16 }." Furthermore, I'm having trouble finding documentation in the library on using the int2b_t data type in a GEMM. I apologize if this question needs more detail or if I missed some part of the documentation.
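For context, here is a minimal sketch in plain C++ (no CUTLASS dependency) of what a signed 2-bit quantized operand means numerically: four values per byte, each in the range [-2, 1], widened and multiplied by a scale before the wider-type multiply. The helper names `unpack_int2` and `dequantize_int2`, the low-bits-first packing order, and the single scale factor are all assumptions made for illustration; this is not a description of how CUTLASS stores or converts `cutlass::int2b_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a CUTLASS API): extract the idx-th signed 2-bit
// value (idx in 0..3) from one byte. Assumes element 0 occupies the lowest
// two bits; the real in-memory layout may differ.
inline int unpack_int2(std::uint8_t packed, int idx) {
    assert(idx >= 0 && idx < 4);
    int raw = (packed >> (2 * idx)) & 0x3;  // unsigned field, 0..3
    return raw >= 2 ? raw - 4 : raw;        // two's complement: 2 -> -2, 3 -> -1
}

// Hypothetical helper: widen one 2-bit value to float and apply a scale,
// standing in for the conversion to the wider MMA type.
inline float dequantize_int2(std::uint8_t packed, int idx, float scale) {
    return static_cast<float>(unpack_int2(packed, idx)) * scale;
}
```

With the byte `0xE4` (binary `11 10 01 00`), the four unpacked values under this assumed ordering are 0, 1, -2, and -1.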
