
PyTorch binaries for macOS have a serial at::parallel_for because they are built with OpenMP, which is not available #43036

@pbelevich

In a silly performance test on macOS (6-core i9), run time depends linearly on tensor size (roughly 1000 times slower on a 1000x larger tensor), while the same test can be 4-10 times faster on Linux or Windows:

In [2]: t = torch.ones(1000, device='cpu')

In [3]: timeit t.pow(123)
7.45 µs ± 343 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [4]: t = torch.ones(1000000, device='cpu')

In [5]: timeit t.pow(123)
5.83 ms ± 246 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
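To make the scaling claim concrete, the two timings reported above can be compared directly. This is a minimal sketch that only does arithmetic on the numbers from this session; the variable names are illustrative and nothing here calls PyTorch.

```python
# Timings copied from the IPython session above (macOS, 6-core i9).
small_n, large_n = 1_000, 1_000_000
t_small_s = 7.45e-6   # 7.45 µs for 1,000 elements
t_large_s = 5.83e-3   # 5.83 ms for 1,000,000 elements

size_ratio = large_n / small_n      # how much larger the tensor is
time_ratio = t_large_s / t_small_s  # how much longer the call takes

# If the op were spread over several cores for the large tensor, time_ratio
# would be expected to fall well below size_ratio. A value close to
# size_ratio suggests the work stays on a single thread.
fraction_of_linear = time_ratio / size_ratio

print(f"size ratio: {size_ratio:.0f}x")
print(f"time ratio: {time_ratio:.0f}x")
print(f"fraction of purely linear scaling: {fraction_of_linear:.2f}")
```

The small-tensor timing also includes fixed per-call overhead, so the ratio is only a rough indicator, not a precise measure of parallel speedup.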

torch.__config__.parallel_info() reports on every OS that PyTorch was compiled with the OpenMP backend, but OpenMP is not actually available on macOS:

at::get_num_threads() : 1
at::get_num_interop_threads() : 6
OpenMP not found
...
ATen parallel backend: OpenMP
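The mismatch in this output (OpenMP selected as the backend, yet "OpenMP not found") can be detected with a small text check. The sketch below parses the excerpt shown above; `check_parallel_info` is a hypothetical helper, not a PyTorch API, and it assumes the report keeps the line format seen in this issue, which may differ between versions.

```python
def check_parallel_info(info: str) -> dict:
    """Extract the backend and thread count from a parallel_info()-style report."""
    backend = None
    num_threads = None
    for raw in info.splitlines():
        line = raw.strip()
        if line.startswith("ATen parallel backend:"):
            backend = line.split(":", 1)[1].strip()
        elif line.startswith("at::get_num_threads()"):
            # The value follows the last colon, e.g. "at::get_num_threads() : 1".
            num_threads = int(line.rsplit(":", 1)[1])
    openmp_missing = "OpenMP not found" in info
    return {
        "backend": backend,
        "num_threads": num_threads,
        "openmp_missing": openmp_missing,
        # Backend claims OpenMP, but the OpenMP runtime was not compiled in.
        "mismatch": backend == "OpenMP" and openmp_missing,
    }


# Excerpt copied from the issue; the "..." line is elided output.
SAMPLE = """at::get_num_threads() : 1
at::get_num_interop_threads() : 6
OpenMP not found
...
ATen parallel backend: OpenMP
"""

result = check_parallel_info(SAMPLE)
print(result)
```

On a machine with PyTorch installed, the same helper could be fed the string returned by `torch.__config__.parallel_info()` instead of `SAMPLE`.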

Version.cpp:

std::string get_openmp_version() {
#ifdef _OPENMP
...
#else
  ss << "OpenMP not found";
#endif
}

ParallelOpenMP.h:

inline void parallel_for(..., const F& f) {
...
#ifdef _OPENMP
...
#else
  f(begin, end);
#endif
}

so when _OPENMP is not defined, parallel_for reduces to a single f(begin, end) call, which looks like serial invocation.
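The effect of that compile-time branch can be illustrated with a small Python model. This is only a sketch under stated assumptions: `parallel_for_model`, the `openmp_defined` flag, and the even chunking across a thread pool are illustrative stand-ins, not the actual ATen implementation, which uses OpenMP pragmas and a grain size that are elided in the snippet above.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for_model(begin, end, f, openmp_defined, num_threads=4):
    """Toy model of the #ifdef _OPENMP branch in parallel_for (assumption-laden)."""
    if not openmp_defined:
        # Mirrors the #else branch: one call over the whole range.
        f(begin, end)
        return
    total = end - begin
    if total <= 0:
        return
    # Hypothetical chunking: split the range evenly across num_threads workers.
    chunk = -(-total // num_threads)  # ceiling division
    ranges = [(s, min(s + chunk, end)) for s in range(begin, end, chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(lambda r: f(*r), ranges))


calls = []

# Without _OPENMP: the callback sees the full range exactly once.
parallel_for_model(0, 1000, lambda b, e: calls.append((b, e)), openmp_defined=False)
serial_calls = list(calls)

# With _OPENMP (modelled): the range is split into per-thread chunks.
calls.clear()
parallel_for_model(0, 1000, lambda b, e: calls.append((b, e)), openmp_defined=True)
parallel_calls = sorted(calls)

print("serial:  ", serial_calls)
print("parallel:", parallel_calls)
```

In the serial case the whole range is handled by one call on the calling thread, which is consistent with run time growing in proportion to tensor size.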

This affects at least PyTorch 1.3.0 through 1.6.0, for both the pip and conda binaries.

cc @ezyang @gchanan @zou3519 @malfet @VitalyFedyunin

Labels

csprng (Fix or feature required for torchcsprng)
high priority
module: build (Build system issues)
module: cpu (CPU specific problem (e.g., perf, algorithm))
module: macos (Mac OS related issues)
module: openmp (Related to OpenMP (omp) support in PyTorch)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
