By default, matmul on Ampere GPUs uses TF32, which causes unacceptable accuracy loss in linalg ops.
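
To give a feel for the size of the loss, here is a rough stdlib-only sketch that mimics TF32 input precision by keeping 10 of fp32's 23 mantissa bits. The helper name `round_to_tf32` is made up for illustration. Plain truncation is an assumption; the hardware's actual rounding may differ.

```python
import struct


def round_to_tf32(x: float) -> float:
    """Approximate TF32 input precision (hypothetical helper).

    TF32 keeps fp32's 8 exponent bits but only 10 mantissa bits.
    Assumption: the 13 dropped bits are truncated, not rounded.
    """
    # Reinterpret the value as its 32-bit fp32 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Zero the 13 least significant mantissa bits.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


x = 1.0 + 2.0 ** -12          # exactly representable in fp32
print(round_to_tf32(x))       # the 2**-12 term is lost entirely
print(abs(round_to_tf32(0.1) - 0.1) / 0.1)  # relative error, below 2**-10
```

Under this model each matmul input carries a relative error of up to about 2**-10 instead of fp32's 2**-24, which is the kind of loss that hurts decompositions and solvers.
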
When a linalg op goes through MAGMA, we disable TF32, but there is no systematic approach for the other code paths.
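
As a user-side illustration only (not the fix proposed here), one way to get consistent behavior is to turn TF32 off around the linalg call with the existing `torch.backends.cuda.matmul.allow_tf32` flag. The context manager `tf32_disabled` below is a hypothetical helper, not an existing PyTorch API.

```python
import contextlib

import torch


@contextlib.contextmanager
def tf32_disabled():
    """Temporarily force full-precision fp32 matmuls (hypothetical helper)."""
    previous = torch.backends.cuda.matmul.allow_tf32
    torch.backends.cuda.matmul.allow_tf32 = False
    try:
        yield
    finally:
        # Restore whatever the caller had set, even if the body raised.
        torch.backends.cuda.matmul.allow_tf32 = previous


a = torch.randn(8, 8)
with tf32_disabled():
    # Any matmul issued here, including ones inside linalg ops, avoids TF32.
    inv = torch.linalg.inv(a @ a.T + torch.eye(8))
```

The flag is global, so this sketch is not safe to use from several threads at once; a systematic fix would presumably live inside the linalg implementations instead.
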
See #67948.
cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @lezcano @zasdfgbnm @ptrblck