I'm using numpy (1.14.1) linked against OpenBLAS 0.2.18, and it looks like np.dot
(which uses the dgemm routine from OpenBLAS) is not thread-safe:
import numpy as np
from multiprocessing.pool import ThreadPool

dim = 4  # for larger values of dim, the issue does not appear
a = np.arange(10**5 // dim) / 10.**5  # floor division keeps the length an integer
b = np.arange(10**5).reshape(-1, dim) / 10.**5

# run the same product concurrently in 4 threads
pp = ThreadPool(4)
threaded_result = pp.map(a.dot, [b] * 4)
pp.close()
pp.terminate()

# reference result, computed without any concurrent calls
result = a.dot(b)
print([np.max(np.abs(x - result)) for x in threaded_result])
# prints
# [1822.7068840452998, 1540.2636287421, 96.10628199050007, 0.0]
# or other seemingly random values, whereas it should print all zeros
I don't know whether this behavior is expected. Is it a numpy bug or rather an OpenBLAS bug?
Notes:
- numpy with MKL BLAS does not have this issue at all
- everything runs fine if OpenBLAS threading is turned off (export OPENBLAS_NUM_THREADS=1)
- I don't know how to test with OpenBLAS 0.2.20, which may fix this
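For reference, the OPENBLAS_NUM_THREADS workaround can also be applied from inside the script rather than from the shell. The following is only a sketch: it assumes the variable has to be set before numpy is first imported in the process (OpenBLAS reads it when the library is loaded), and the names pool and max_errors are introduced here for illustration.

```python
import os

# Assumption: this has to run before numpy is imported anywhere in the
# process, since OpenBLAS reads the variable when the library is loaded.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np
from multiprocessing.pool import ThreadPool

dim = 4
a = np.arange(10**5 // dim) / 10.**5
b = np.arange(10**5).reshape(-1, dim) / 10.**5

# Same concurrent calls as in the report, now with OpenBLAS threading disabled
pool = ThreadPool(4)
try:
    threaded_result = pool.map(a.dot, [b] * 4)
finally:
    pool.close()
    pool.join()

# Reference result, computed without any concurrent calls
result = a.dot(b)
max_errors = [float(np.max(np.abs(x - result))) for x in threaded_result]
print(max_errors)
```

If the workaround takes effect, every entry of max_errors should be zero (or negligibly small). The cost is that each individual np.dot call then runs on a single core.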
Some extra info, if needed:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Stepping: 4
CPU MHz: 2500.060
BogoMIPS: 5000.12
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm retpoline kaiser fsgsbase smep erms xsaveopt
np.show_config()
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
blis_info:
    NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE