[QST] INT8 GEMM with FP16 output #767

Hey, I have noticed that an INT8 GEMM with the following settings, using a `float` bias and a `float` output, works okay:

```
using ElementAccumulator = int32_t;     // <- data type of accumulator
using ElementComputeEpilogue = float;   // <- data type of epilogue operations
using ElementInputA = int8_t;           // <- data type of elements in input matrix A
using ElementInputB = int8_t;           // <- data type of elements in input matrix B
using ElementOutput = float;            // <- data type of elements in output matrix D
```

But setting the output type to `cutlass::half` does not:
```
using ElementAccumulator = int32_t;     // <- data type of accumulator
using ElementComputeEpilogue = float;   // <- data type of epilogue operations
using ElementInputA = int8_t;           // <- data type of elements in input matrix A
using ElementInputB = int8_t;           // <- data type of elements in input matrix B
using ElementOutput = cutlass::half;    // <- data type of elements in output matrix D
```

Is there any specific restriction preventing it? How can this be enabled?
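
To make the intended data flow concrete, here is a minimal plain C++ sketch of what these aliases describe: `int8_t` inputs, an `int32_t` accumulator, a `float` epilogue of the form `alpha * acc + beta * bias`, and a final narrowing to the output type. It is not CUTLASS code and does not show whether CUTLASS supports this combination. The function names, the row-major layout, and the `alpha`/`beta` scaling are assumptions made for illustration. Since standard C++ has no half type, the sketch only checks whether the `float` results would fit in the FP16 range (largest finite value 65504), which is one thing to watch when narrowing INT8 GEMM results.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest finite value representable in IEEE 754 binary16 (FP16).
constexpr float kHalfMax = 65504.0f;

// Hypothetical reference GEMM: D = alpha * (A * B) + beta * bias.
// A is M x K and B is K x N, both row-major; bias has N entries.
// Inputs are int8, accumulation is int32, the epilogue runs in float.
std::vector<float> int8_gemm_float_epilogue(
    const std::vector<int8_t>& A, const std::vector<int8_t>& B,
    const std::vector<float>& bias, int M, int N, int K,
    float alpha, float beta) {
  std::vector<float> D(static_cast<std::size_t>(M) * N);
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      int32_t acc = 0;  // plays the role of ElementAccumulator
      for (int k = 0; k < K; ++k) {
        acc += static_cast<int32_t>(A[m * K + k]) *
               static_cast<int32_t>(B[k * N + n]);
      }
      // Epilogue in float (the role of ElementComputeEpilogue).
      D[m * N + n] = alpha * static_cast<float>(acc) + beta * bias[n];
    }
  }
  return D;
}

// Returns true if every value is finite and no larger in magnitude
// than the FP16 maximum, i.e. it could be narrowed without overflow.
bool fits_in_half(const std::vector<float>& D) {
  for (float v : D) {
    if (!std::isfinite(v) || std::fabs(v) > kHalfMax) return false;
  }
  return true;
}
```

For example, a 1 x 8 row of 127s times an 8 x 1 column of 127s accumulates to 129032, which exceeds the FP16 range with `alpha = 1` but fits with `alpha = 0.25`. Separately, it may be worth double-checking the type name: as far as I can tell, CUTLASS spells its half-precision type `cutlass::half_t`.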
