A simple performance test scales linearly with tensor size on macOS (a 1000x larger tensor is 1000 times slower, on a 6-core i9), while the same test can be 4-10 times faster on Linux or Windows:
```python
In [2]: t = torch.ones(1000, device='cpu')
In [3]: timeit t.pow(123)
7.45 µs ± 343 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [4]: t = torch.ones(1000000, device='cpu')
In [5]: timeit t.pow(123)
5.83 ms ± 246 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
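The symptom above, cost growing in lockstep with tensor size, is what a purely serial elementwise loop looks like. As a hedged, torch-free sketch of the same measurement methodology (pure-Python `pow` standing in for `t.pow(123)`; the helper name is hypothetical), one can confirm that a serial loop shows this linear scaling:

```python
# Hypothetical stdlib analogue of the benchmark above: if an
# elementwise op runs serially on one core, its cost grows
# linearly with input size.
import timeit

def elementwise_pow(values, exp):
    # Serial elementwise power, standing in for a one-threaded t.pow(123).
    return [v ** exp for v in values]

small = [1.0] * 1_000
large = [1.0] * 1_000_000

t_small = min(timeit.repeat(lambda: elementwise_pow(small, 123), number=1, repeat=3))
t_large = min(timeit.repeat(lambda: elementwise_pow(large, 123), number=1, repeat=3))

print(f"1k elements: {t_small:.6f}s")
print(f"1M elements: {t_large:.6f}s")
print(f"ratio: {t_large / t_small:.0f}x")  # roughly tracks the 1000x size ratio
```

With working intra-op parallelism, the large-tensor timing would instead be divided by (roughly) the number of cores, which is why the Linux/Windows numbers come out 4-10x better.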
torch.__config__.parallel_info() reports on all OSes that PyTorch was compiled with OpenMP, but on macOS it is not actually available:
```
at::get_num_threads() : 1
at::get_num_interop_threads() : 6
OpenMP not found
...
ATen parallel backend: OpenMP
```
Version.cpp:
```cpp
std::string get_openmp_version() {
#ifdef _OPENMP
  ...
#else
  ss << "OpenMP not found";
#endif
}
```
ParallelOpenMP.h:
```cpp
inline void parallel_for(..., const F& f) {
  ...
#ifdef _OPENMP
  ...
#else
  f(begin, end);
#endif
}
```
so when `_OPENMP` is undefined, `f` is called once over the whole range, i.e. a plain serial invocation.
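To make the two branches concrete, here is a minimal Python sketch of the `parallel_for` contract as implied by the snippet above (the chunking policy is an assumption for illustration, not ATen's actual scheduler): with a parallel backend the range `[begin, end)` is split into chunks and `f` runs on each; without one, `f` is invoked once over the entire range.

```python
# Sketch of the assumed parallel_for semantics: split [begin, end)
# into per-thread chunks, or fall back to one serial call.
from concurrent.futures import ThreadPoolExecutor

def parallel_for(begin, end, grain_size, f, num_threads=1):
    if num_threads <= 1 or end - begin <= grain_size:
        f(begin, end)  # the #else branch: one serial call over the whole range
        return
    chunk = max(grain_size, (end - begin + num_threads - 1) // num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(f, b, min(b + chunk, end))
                   for b in range(begin, end, chunk)]
        for fut in futures:
            fut.result()  # wait for all chunks; propagate any exceptions

# Each worker writes only its own slice, so no locking is needed.
out = [0] * 10
parallel_for(0, 10, 2,
             lambda b, e: out.__setitem__(slice(b, e), [v * v for v in range(b, e)]),
             num_threads=4)
print(out)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

On the macOS wheels described here, every call effectively takes the serial branch, which matches the linear scaling in the benchmark.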
This affects at least PyTorch 1.3.0-1.6.0, in both the pip and conda builds.
cc @ezyang @gchanan @zou3519 @malfet @VitalyFedyunin