Torch's affinity setting lead to openvino using only one core. #91989

@qiuxin2012

Description

🐛 Describe the bug

Torch's affinity setting leads to OpenVINO using only one core.
I exported a PyTorch resnet50 model to an OpenVINO IR model with the following steps.

  1. Export resnet50 to an ONNX model:
from torchvision.models import resnet50
from openvino.runtime import Core
import torch
from pathlib import Path

model = resnet50()
model.eval()

dummy_input = torch.randn((1, 3, 224, 224),
                          generator=None,
                          device="cpu",
                          dtype=torch.float32)
onnx_path = Path("resnet50.onnx")

if not onnx_path.exists():
    torch.onnx.export(
        model,
        dummy_input,
        onnx_path,
        opset_version=11,
        do_constant_folding=False,
    )
  2. Convert the ONNX model to an OpenVINO IR model:
mo --input_model "resnet50.onnx" --input_shape "[1,3, 224, 224]" --mean_values="[123.675, 116.28 , 103.53]" --scale_values="[58.395, 57.12 , 57.375]" --data_type FP32 --output_dir model
  3. Export the Intel KMP affinity environment variables:
export LD_PRELOAD=conda_env_home/lib/libiomp5.so # like /usr/local/envs/test/lib/libiomp5.so
export KMP_AFFINITY=granularity=fine,compact,1,0
  4. Finally, run OpenVINO inference:
from openvino.runtime import Core
from pathlib import Path
import numpy as np

dummy_input = np.random.randn(1, 3, 224, 224)

config = {
    "CPU_THREADS_NUM": "48"
}

ie = Core()
classification_model_xml = "model/resnet50.xml"
model = ie.read_model(model=classification_model_xml)

import torch  # Importing torch here, before compile_model, makes OpenVINO use only one core.

compiled_model = ie.compile_model(model, "CPU", config)

# import torch  # Importing torch here instead, after compile_model, lets OpenVINO use all 48 cores.

input_layer = compiled_model.input(0)
output_layer = compiled_model.output(0)
ir = compiled_model.create_infer_request()
import time
s = time.time()
for i in range(200):
    ir.infer(inputs={input_layer.any_name: dummy_input})

print("time cost: " + str(time.time() - s))
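
One way to check whether the import is what narrows the set of usable CPUs is to compare the affinity mask before and after it. The following is a minimal sketch, assuming Linux (os.sched_getaffinity is not available on other platforms) and assuming that the slowdown comes from a shrunken affinity mask that later-created worker threads inherit, which this report has not confirmed. The helper names allowed_cpus and describe_change are hypothetical.

```python
import os


def allowed_cpus():
    """Return the sorted CPU ids the calling thread may run on, or None if unsupported."""
    if not hasattr(os, "sched_getaffinity"):
        return None  # e.g. macOS or Windows
    return sorted(os.sched_getaffinity(0))


def describe_change(before, after):
    """Summarise how the allowed CPU set changed between two snapshots."""
    if before is None or after is None:
        return "affinity not available on this platform"
    if set(after) < set(before):
        return "affinity shrank from {} to {} CPU(s)".format(len(before), len(after))
    if set(after) == set(before):
        return "affinity unchanged ({} CPU(s))".format(len(before))
    return "affinity changed from {} to {} CPU(s)".format(len(before), len(after))


if __name__ == "__main__":
    before = allowed_cpus()
    try:
        import torch  # noqa: F401  (the import under suspicion)
    except ImportError:
        print("torch is not installed; nothing to compare")
    else:
        print(describe_change(before, allowed_cpus()))
```

Run it with the same affinity environment variables exported as in the steps above. If the mask shrinks to a single CPU after the import, that would be consistent with OpenVINO's threads being confined to one core when they are created afterwards.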

BTW: setting the GOMP affinity with export OMP_PROC_BIND=CLOSE has the same behaviour.
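
To compare the three situations side by side (no affinity setting, the KMP setting, and the GOMP setting), the inference script can be launched in fresh interpreters with different environments. This is a rough sketch, not part of the original report: run_inference.py is a hypothetical file name for the inference script above, the helper names are made up for illustration, and LD_PRELOAD is inherited unchanged from the calling shell.

```python
import os
import subprocess
import sys

# Affinity settings quoted in this issue; "none" is the baseline without either of them.
AFFINITY_SETTINGS = {
    "none": {},
    "kmp": {"KMP_AFFINITY": "granularity=fine,compact,1,0"},
    "gomp": {"OMP_PROC_BIND": "CLOSE"},
}


def build_env(setting, base_env=None):
    """Return a copy of base_env in which only the chosen affinity variables are set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in ("KMP_AFFINITY", "OMP_PROC_BIND"):
        env.pop(name, None)  # start from a clean slate for both variables
    env.update(AFFINITY_SETTINGS[setting])
    return env


def run_case(script, setting):
    """Run the script in a fresh interpreter under the given setting and return its output."""
    result = subprocess.run(
        [sys.executable, script],
        env=build_env(setting),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    script = "run_inference.py"  # hypothetical: the inference script from this issue
    if os.path.exists(script):
        for setting in AFFINITY_SETTINGS:
            print("[{}] {}".format(setting, run_case(script, setting).strip()))
    else:
        print("save the inference script as {} first".format(script))
```

Each case runs in its own process, so an OpenMP runtime initialised in one run cannot influence the next.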

Versions

torch: 1.13.1
python: 3.7.10
cpu: Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
The environment was set up with:

conda create -n test -y python==3.7.10 setuptools==58.0.4 
conda activate test
conda install intel-openmp
pip3 install torch torchvision
pip3 install openvino openvino-dev
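
Because the pip commands above install unpinned packages, the exact versions in the environment may be worth recording alongside the report. The snippet below is a small sketch of one way to do that; the helper names are hypothetical, and the package list simply mirrors the pip commands above. It falls back to pkg_resources because importlib.metadata is only available from Python 3.8.

```python
import platform

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7, as used in this report
    version = None


def installed_version(package):
    """Return the installed version of a pip package, or None if it is not installed."""
    if version is not None:
        try:
            return version(package)
        except PackageNotFoundError:
            return None
    import pkg_resources  # provided by setuptools

    try:
        return pkg_resources.get_distribution(package).version
    except pkg_resources.DistributionNotFound:
        return None


def collect_versions(packages):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    print("python:", platform.python_version())
    for name, ver in collect_versions(
        ["torch", "torchvision", "openvino", "openvino-dev"]
    ).items():
        print("{}: {}".format(name, ver))
```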

cc @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

Labels: intel priority, module: intel, triaged