```diff
@@ -288,7 +288,7 @@ class TORCH_API Context {
   int benchmark_limit_cudnn = 10;
   bool allow_tf32_cudnn = true;
   bool allow_fp16_reduction_cublas = true;
-  bool allow_bf16_reduction_cublas = false;
+  bool allow_bf16_reduction_cublas = true;
   bool enabled_mkldnn = true;
   at::LinalgBackend linalg_preferred_backend = at::LinalgBackend::Default;
 #ifdef C10_MOBILE
```
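At the Python level this Context default is exposed as `torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction`, the flag the documentation changes below describe. A minimal sketch of inspecting and overriding it, assuming a CUDA-enabled PyTorch build:

```python
# Sketch: read and override the BF16 reduced-precision reduction flag.
# Assumes a CUDA-enabled PyTorch build; the printed default depends on which
# side of the Context change above your build includes.
import torch

# Current process-wide setting (backed by allow_bf16_reduction_cublas above).
print(torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction)

# Opt out of reduced-precision reductions for BF16 GEMMs in this process.
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = False
```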
```diff
@@ -184,11 +184,12 @@ A similar flag (as above) exists for BFloat16 GEMMs. Note that this switch is
 set to `False` by default for BF16 as we have observed numerical instability in
 PyTorch CI tests (e.g., test/test_matmul_cuda.py).
 
-If reduced precision reductions are desired, users can disable reduced precision reductions in bf16 GEMMs with:
+If reduced precision reductions are not desired, users can disable reduced
+precision reductions in bf16 GEMMs with:
 
 .. code:: python
 
-    torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = True
+    torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = False
 
 To toggle the reduced precision reduction flags in C++, one can do
 
```
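The C++ snippet referenced by that last context line falls outside the hunk shown. As background for what the flag controls, a rough, illustrative Python check (not part of this diff) compares a BF16 GEMM against an fp32 reference with the reductions allowed and disallowed; it assumes a CUDA device, and the size of any difference depends on the GPU and the matrix shapes:

```python
# Illustrative only (not from this PR): measure how much a BF16 GEMM deviates
# from an fp32 reference with reduced-precision reductions allowed vs. not.
import torch

def bf16_matmul_max_error(allow_reduced: bool) -> float:
    torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = allow_reduced
    torch.manual_seed(0)
    a = torch.randn(4096, 4096, device="cuda")
    b = torch.randn(4096, 4096, device="cuda")
    ref = a @ b                                   # fp32 reference GEMM
    out = (a.bfloat16() @ b.bfloat16()).float()   # BF16 GEMM under the flag
    return (out - ref).abs().max().item()

if torch.cuda.is_available():
    print("reduced-precision reductions allowed:   ", bf16_matmul_max_error(True))
    print("reduced-precision reductions disallowed:", bf16_matmul_max_error(False))
```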
```diff
@@ -104,9 +104,9 @@ Half-precision GEMM operations are typically done with intermediate accumulation
 If reduced-precision reductions are problematic, they can be turned off with
 ``torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False``
 
-A similar flag exists for BF16 GEMM operations and is turned off by default. If
-reduced-precision reductions are desired for BF16, they can be turned on with
-``torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = True``
+A similar flag exists for BF16 GEMM operations and is turned on by default. If BF16
+reduced-precision reductions are problematic, they can be turned off with
+``torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = False``
 
 For more information see :ref:`allow_fp16_reduced_precision_reduction<fp16reducedprecision>` and :ref:`allow_bf16_reduced_precision_reduction<bf16reducedprecision>`
 
```
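As a usage note (not part of the documentation being edited), code that is sensitive to accumulation precision can disable the BF16 reductions locally and restore the previous setting afterwards. A hypothetical helper, sketched under the assumption that toggling the flag mid-process is acceptable for the workload:

```python
# Hypothetical helper (not part of PyTorch): temporarily force full-precision
# reductions for BF16 GEMMs, restoring the previous setting on exit.
from contextlib import contextmanager
import torch

@contextmanager
def full_precision_bf16_reductions():
    matmul = torch.backends.cuda.matmul
    prev = matmul.allow_bf16_reduced_precision_reduction
    matmul.allow_bf16_reduced_precision_reduction = False
    try:
        yield
    finally:
        matmul.allow_bf16_reduced_precision_reduction = prev

# Usage (illustrative): run accuracy-sensitive GEMMs with full-precision reductions.
# with full_precision_bf16_reductions():
#     out = accuracy_sensitive_model(inputs)
```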