Hey, I have noticed that an INT8 GEMM with the following settings, where we have a float bias and float output, works okay:
using ElementAccumulator = int32_t; // <- data type of accumulator
using ElementComputeEpilogue = float; // <- data type of epilogue operations
using ElementInputA = int8_t; // <- data type of elements in input matrix A
using ElementInputB = int8_t; // <- data type of elements in input matrix B
using ElementOutput = float; // <- data type of elements in output matrix D
But setting the output type to cutlass::half does not:
using ElementAccumulator = int32_t; // <- data type of accumulator
using ElementComputeEpilogue = float; // <- data type of epilogue operations
using ElementInputA = int8_t; // <- data type of elements in input matrix A
using ElementInputB = int8_t; // <- data type of elements in input matrix B
using ElementOutput = cutlass::half; // <- data type of elements in output matrix D
Is there any specific restriction preventing it? How can this be enabled?
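To make the intended data flow concrete, here is a minimal host-side sketch of what I expect the kernel to compute: int8 inputs accumulated in int32, the epilogue done in float with a float bias, then a cast to the output type. This is plain C++, not CUTLASS code; `reference_gemm_bias`, its parameters, the row-major layouts, and the `alpha` scale are all assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not a CUTLASS API).
// Computes D = ElementOutput(alpha * float(A * B) + bias), with the bias
// broadcast along rows. A is M x K and B is K x N, both assumed row-major.
template <typename ElementOutput>
std::vector<ElementOutput> reference_gemm_bias(
    const std::vector<int8_t>& A,
    const std::vector<int8_t>& B,
    const std::vector<float>& bias,  // length N
    int M, int N, int K, float alpha) {
  std::vector<ElementOutput> D(static_cast<std::size_t>(M) * N);
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      // ElementAccumulator = int32_t
      int32_t acc = 0;
      for (int k = 0; k < K; ++k) {
        acc += static_cast<int32_t>(A[m * K + k]) *
               static_cast<int32_t>(B[k * N + n]);
      }
      // ElementComputeEpilogue = float
      float v = alpha * static_cast<float>(acc) + bias[n];
      // The final cast is the only step that depends on ElementOutput.
      D[m * N + n] = static_cast<ElementOutput>(v);
    }
  }
  return D;
}
```

In this sketch, switching from float to a half-precision output would only change the final cast, assuming the half type is constructible from float. That is why I would expect the second configuration to be possible in principle.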