libtorch does not initialize OpenMP/MKL by default #20156

@EsdeathYZH

I find that matrix multiplication is slower in the C++ API, so I wrote the same code in C++ and Python and recorded the execution times. The code is as follows:

C++:

#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
	torch::Tensor tensor = torch::randn({2708, 1433});
	torch::Tensor weight = torch::randn({1433, 16});
	// Time a single matrix multiplication; the result is discarded.
	auto start = std::chrono::high_resolution_clock::now();
	tensor.mm(weight);
	auto end = std::chrono::high_resolution_clock::now();
	std::cout << "C++ Operation Time(s) " << std::chrono::duration<double>(end - start).count() << "s" << std::endl;
	return 0;
}

Result:

C++ Operation Time(s) 0.082496s

Python:

import time  # needed for time.time() below

import torch
import torch.nn as nn
import torch.nn.functional as F

tensor = torch.randn(2708, 1433)
weight = torch.randn(1433, 16)
# Time a single matrix multiplication; the result is discarded.
t0 = time.time()
tensor.mm(weight)
t1 = time.time()
print("Python Operation Time(s) {:.4f}".format(t1 - t0))

Result:

Python Operation Time(s) 0.0114

Testing Environment:

Ubuntu 16.04
gcc version 5.4.0
Python version 3.7.3
PyTorch version 1.0.1

This is not a small difference. Why does it happen?
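Both scripts above time a single call, so any one-time setup cost on the first call would be counted in the result. The sketch below is one possible way to separate the first call from later calls; it is not taken from PyTorch, and the helper name `time_op`, its `warmup` and `repeats` parameters, and the stand-in workload are all hypothetical. Whether first-call overhead actually explains the gap reported here is an assumption that would still need to be checked.

```python
import statistics
import time


def time_op(fn, warmup=3, repeats=20):
    """Return (first_call_seconds, median_later_seconds) for fn().

    Hypothetical helper for illustration; not part of any library.
    """
    # Time the very first call on its own: it may include one-time setup.
    t0 = time.perf_counter()
    fn()
    first = time.perf_counter() - t0

    # A few untimed calls so the timed samples reflect repeated use.
    for _ in range(warmup):
        fn()

    # Collect several samples and report the median to damp outliers.
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return first, statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with the operation under test.
    def workload():
        return sum(i * i for i in range(10_000))

    first, median = time_op(workload)
    print("first call: {:.6f}s, median of later calls: {:.6f}s".format(first, median))
```

To apply it to the Python script above, `lambda: tensor.mm(weight)` could be passed as `fn`. The same first-call versus later-calls split would need to be reproduced with `std::chrono` on the C++ side before the two numbers are compared.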

Labels

high priority
module: cpp (Related to C++ API)
module: docs (Related to our documentation, both in docs/ and docblocks)
module: multithreading (Related to issues that occur when running on multiple CPU threads)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
