Torch's affinity setting leads to OpenVINO using only one core. #91989
Open
Labels: intel priority (matters to Intel architecture performance-wise), module: intel (specific to x86 architecture), triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Description
🐛 Describe the bug
Torch's affinity setting leads to OpenVINO using only one core.
I exported a PyTorch ResNet-50 model to an OpenVINO IR model with the following steps.
- Export resnet50 to an ONNX model:

```python
from pathlib import Path

import torch
from torchvision.models import resnet50

model = resnet50()
model.eval()
dummy_input = torch.randn((1, 3, 224, 224),
                          generator=None,
                          device="cpu",
                          dtype=torch.float32)
onnx_path = Path("resnet50.onnx")
if not onnx_path.exists():
    torch.onnx.export(
        model,
        dummy_input,
        onnx_path,
        opset_version=11,
        do_constant_folding=False,
    )
```
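Before handing the exported file to the converter, it can be sanity-checked with the `onnx` package's structural validator (a hedged sketch: it assumes the `onnx` package is installed and that the export step above has already produced `resnet50.onnx`; it skips gracefully otherwise):

```python
from pathlib import Path

onnx_path = Path("resnet50.onnx")
if onnx_path.exists():
    import onnx

    # Validate the graph structure of the exported model before conversion.
    onnx_model = onnx.load(str(onnx_path))
    onnx.checker.check_model(onnx_model)
    print("ONNX model structure is valid")
else:
    print("resnet50.onnx not found; run the export step above first")
```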
- Convert the ONNX model to an OpenVINO IR model:

```shell
mo --input_model "resnet50.onnx" --input_shape "[1,3,224,224]" --mean_values="[123.675,116.28,103.53]" --scale_values="[58.395,57.12,57.375]" --data_type FP32 --output_dir model
```
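The `--mean_values` and `--scale_values` passed to `mo` appear to be the standard torchvision ImageNet normalization constants (mean `[0.485, 0.456, 0.406]`, std `[0.229, 0.224, 0.225]`) rescaled from the 0-1 range to the 0-255 pixel range, so normalization is folded into the IR instead of being done in Python. A quick check:

```python
# torchvision's ImageNet normalization constants (0-1 range).
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Rescale to the 0-255 pixel range that mo's --mean_values/--scale_values use.
mean_255 = [round(m * 255, 3) for m in mean]
std_255 = [round(s * 255, 3) for s in std]

print(mean_255)  # [123.675, 116.28, 103.53]
print(std_255)   # [58.395, 57.12, 57.375]
```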
- Export the Intel KMP affinity environment variables:

```shell
export LD_PRELOAD=conda_env_home/lib/libiomp5.so  # e.g. /usr/local/envs/test/lib/libiomp5.so
export KMP_AFFINITY=granularity=fine,compact,1,0
```
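To see how many cores the process is actually allowed to run on after these settings take effect, the affinity mask can be inspected from the standard library (a Linux-only sketch; `os.sched_getaffinity` is not available on all platforms):

```python
import os

# Set of CPU indices the current process is allowed to run on (Linux only).
allowed = os.sched_getaffinity(0)
print(f"process may run on {len(allowed)} core(s): {sorted(allowed)}")
```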
- At last, run OpenVINO inference:

```python
import time

import numpy as np
from openvino.runtime import Core

dummy_input = np.random.randn(1, 3, 224, 224)
config = {
    "CPU_THREADS_NUM": "48"
}
ie = Core()
classification_model_xml = "model/resnet50.xml"
model = ie.read_model(model=classification_model_xml)
import torch  # if torch is imported here, OpenVINO will use only one core
compiled_model = ie.compile_model(model, "CPU", config)
# import torch  # if torch is imported here instead, OpenVINO will use 48 cores
input_layer = compiled_model.input(0)
output_layer = compiled_model.output(0)
ir = compiled_model.create_infer_request()
s = time.time()
for i in range(200):
    ir.infer(inputs={input_layer.any_name: dummy_input})
print("time cost: " + str(time.time() - s))
```
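The import-order dependence above suggests that `import torch` (with libiomp5 preloaded and `KMP_AFFINITY` set) narrows the affinity mask of the calling thread, so a thread pool created afterwards, as in `compile_model`, inherits the single-core mask. A minimal, hedged way to observe the mask change without OpenVINO at all (Linux only; skips gracefully if torch is absent):

```python
import importlib.util
import os

def allowed_cores():
    # Number of CPUs the current process may be scheduled on (Linux only).
    return len(os.sched_getaffinity(0))

before = allowed_cores()
if importlib.util.find_spec("torch") is not None:
    import torch  # noqa: F401  # may shrink the mask when KMP_AFFINITY is set
    print(f"cores before import: {before}, after: {allowed_cores()}")
else:
    print(f"cores before import: {before} (torch not installed, nothing to compare)")
```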
BTW: the GOMP affinity setting (`export OMP_PROC_BIND=CLOSE`) shows the same behaviour.
Versions
torch: 1.13.1
python: 3.7.10
cpu: Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
Set up the env with:

```shell
conda create -n test -y python==3.7.10 setuptools==58.0.4
conda activate test
conda install intel-openmp
pip3 install torch torchvision
pip3 install openvino openvino-dev
```
cc @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
Status: In Progress