[QST] Is cutlass::bfloat16_t x cutlass::int2b_t GEMM possible?

While looking at example 55 (cutlass/examples/55_hopper_mixed_dtype_gemm/55_hopper_int4_bf16_gemm.cu), I was curious whether this modification would be legal:

From:
`using MmaType = cutlass::bfloat16_t;
using QuantType = cutlass::int4b_t;`

To:
`using MmaType = cutlass::bfloat16_t;
using QuantType = cutlass::int2b_t;`

According to the example's README.md, "For 8-bit x 4-bit or 2-bit, both inputs must be K-major." However, the internal comment states, "Only supports INT4 x { FP16, BF16 }." The two statements seem to disagree on whether a 2-bit operand is supported. I'm also having trouble finding documentation in the library on using the `int2b_t` data type in GEMM. I apologize if this question needs more detail or if I missed some part of the documentation.
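For context on what the narrower type would hold, here is a minimal standalone sketch of signed 2-bit values (range -2 to 1) packed four per byte. It does not use CUTLASS at all. The helper names `pack_int2` and `unpack_int2` are hypothetical, and the low-bits-first packing order is an assumption for illustration, not a statement about the layout the CUTLASS kernels expect.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: pack four signed 2-bit values (each in [-2, 1])
// into one byte. Element 0 goes in the lowest two bits (assumed order).
inline std::uint8_t pack_int2(const std::array<int, 4>& values) {
    std::uint8_t byte = 0;
    for (int i = 0; i < 4; ++i) {
        // Keep only the low two bits of the two's-complement representation.
        byte |= static_cast<std::uint8_t>((values[i] & 0x3) << (2 * i));
    }
    return byte;
}

// Hypothetical helper: extract element i from a packed byte and
// sign-extend it from 2 bits back to a regular int.
inline int unpack_int2(std::uint8_t byte, int i) {
    int raw = (byte >> (2 * i)) & 0x3;  // raw bit pattern, 0..3
    return raw >= 2 ? raw - 4 : raw;    // 2 -> -2, 3 -> -1
}
```

With only four representable values per element, any real int2 x bf16 path would rely even more heavily on the scale (and possibly zero-point) handling than the INT4 case does, which is part of why I'd like to know whether it is supported.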

[QST] Is cutlass::bfloat16_t x cutlass::int2b_t GEMM possible? #1915

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[QST] Is cutlass::bfloat16_t x cutlass::int2b_t GEMM possible? #1915

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions