BUG: np.dot is not thread-safe with OpenBLAS #11046

@artemru

I'm using numpy (1.14.1) linked against OpenBLAS 0.2.18, and it looks like np.dot
(which uses the dgemm routine from OpenBLAS) is not thread-safe:

import numpy as np
from multiprocessing.pool import ThreadPool

dim = 4   # for larger values of dim, there is no issue
a = np.arange(10**5 // dim) / 10.**5             # vector of length 10**5 / dim
b = np.arange(10**5).reshape(-1, dim) / 10.**5   # matrix with dim columns

# compute the same product concurrently from 4 Python threads
pp = ThreadPool(4)
threaded_result = pp.map(a.dot, [b] * 4)
pp.close()
pp.join()

# reference result computed from the main thread only
result = a.dot(b)
print([np.max(np.abs(x - result)) for x in threaded_result])

# prints
# [1822.7068840452998, 1540.2636287421, 96.10628199050007, 0.0]
# or other seemingly random values, whereas every entry should be 0.0
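
A quick way to check whether the discrepancy comes from concurrent calls into BLAS is to serialize those calls with a lock. This is only a diagnostic sketch: `blas_lock` and `locked_dot` are hypothetical names introduced here, not part of numpy, and the lock removes any parallelism between the Python threads.

```python
import threading
from multiprocessing.pool import ThreadPool

import numpy as np

dim = 4
a = np.arange(10**5 // dim) / 10.**5
b = np.arange(10**5).reshape(-1, dim) / 10.**5

blas_lock = threading.Lock()  # hypothetical helper lock, not a numpy feature


def locked_dot(m):
    # only one Python thread at a time may enter the BLAS call
    with blas_lock:
        return a.dot(m)


pool = ThreadPool(4)
locked_result = pool.map(locked_dot, [b] * 4)
pool.close()
pool.join()

reference = a.dot(b)
max_diffs = [float(np.max(np.abs(x - reference))) for x in locked_result]
print(max_diffs)
```

If the differences vanish with the lock in place, that would point at concurrent entry into the BLAS routine rather than at the arithmetic itself.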

I don't know whether this behavior is expected. Is it a numpy bug or rather an OpenBLAS bug?

Notes:

  • numpy linked against MKL BLAS does not have this issue at all
  • everything runs fine if OpenBLAS threading is turned off (export OPENBLAS_NUM_THREADS=1)
  • I don't know how to test OpenBLAS 0.2.20, which may fix this
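
The OPENBLAS_NUM_THREADS workaround from the notes above can also be applied from inside a script instead of the shell. A minimal sketch, assuming the variable is set before numpy (and therefore OpenBLAS) is first imported in the process; setting it later is assumed to have no effect on an already initialized library:

```python
import os

# assumption: this runs before the first `import numpy` in the process
os.environ["OPENBLAS_NUM_THREADS"] = "1"

from multiprocessing.pool import ThreadPool

import numpy as np

dim = 4
a = np.arange(10**5 // dim) / 10.**5
b = np.arange(10**5).reshape(-1, dim) / 10.**5

# same concurrent calls as in the report, with OpenBLAS limited to one thread
pool = ThreadPool(4)
threaded_result = pool.map(a.dot, [b] * 4)
pool.close()
pool.join()

reference = a.dot(b)
max_diffs = [float(np.max(np.abs(x - reference))) for x in threaded_result]
print(max_diffs)
```

This trades OpenBLAS's internal multithreading for consistent results, so it is a workaround rather than a fix.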

Some extra info, if needed (lscpu output):

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Stepping:              4
CPU MHz:               2500.060
BogoMIPS:              5000.12
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm retpoline kaiser fsgsbase smep erms xsaveopt

np.show_config():
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
blis_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
