
KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper' returned during unit testing #165

@del-zhenwu

Description


Thanks for your bug report. We appreciate it a lot.

Checklist

Describe the bug

[2022-02-17T04:12:38.137Z] INFO     mmdeploy:init_plugins.py:36 Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so

[2022-02-17T04:12:38.137Z] ___________________ test_top_down_forward[Backend.TENSORRT] ____________________

[2022-02-17T04:12:38.137Z] 

[2022-02-17T04:12:38.137Z] backend_type = <Backend.TENSORRT: 'tensorrt'>

[2022-02-17T04:12:38.137Z] 

[2022-02-17T04:12:38.137Z]     @pytest.mark.parametrize('backend_type',

[2022-02-17T04:12:38.137Z]                              [Backend.ONNXRUNTIME, Backend.TENSORRT])

[2022-02-17T04:12:38.137Z]     def test_top_down_forward(backend_type: Backend):

[2022-02-17T04:12:38.137Z]         check_backend(backend_type, True)

[2022-02-17T04:12:38.137Z]         model = get_top_down_model()

[2022-02-17T04:12:38.137Z]         model.cpu().eval()

[2022-02-17T04:12:38.137Z]         if backend_type == Backend.TENSORRT:

[2022-02-17T04:12:38.137Z]             deploy_cfg = mmcv.Config(

[2022-02-17T04:12:38.137Z]                 dict(

[2022-02-17T04:12:38.137Z]                     backend_config=dict(

[2022-02-17T04:12:38.137Z]                         type=backend_type.value,

[2022-02-17T04:12:38.137Z]                         common_config=dict(max_workspace_size=1 << 30),

[2022-02-17T04:12:38.137Z]                         model_inputs=[

[2022-02-17T04:12:38.137Z]                             dict(

[2022-02-17T04:12:38.137Z]                                 input_shapes=dict(

[2022-02-17T04:12:38.138Z]                                     input=dict(

[2022-02-17T04:12:38.138Z]                                         min_shape=[1, 3, 32, 32],

[2022-02-17T04:12:38.138Z]                                         opt_shape=[1, 3, 32, 32],

[2022-02-17T04:12:38.138Z]                                         max_shape=[1, 3, 32, 32])))

[2022-02-17T04:12:38.138Z]                         ]),

[2022-02-17T04:12:38.138Z]                     onnx_config=dict(

[2022-02-17T04:12:38.138Z]                         input_shape=[32, 32], output_names=['output']),

[2022-02-17T04:12:38.138Z]                     codebase_config=dict(

[2022-02-17T04:12:38.138Z]                         type=Codebase.MMPOSE.value,

[2022-02-17T04:12:38.138Z]                         task=Task.POSE_DETECTION.value)))

[2022-02-17T04:12:38.138Z]         else:

[2022-02-17T04:12:38.138Z]             deploy_cfg = mmcv.Config(

[2022-02-17T04:12:38.138Z]                 dict(

[2022-02-17T04:12:38.138Z]                     backend_config=dict(type=backend_type.value),

[2022-02-17T04:12:38.138Z]                     onnx_config=dict(input_shape=None, output_names=['output']),

[2022-02-17T04:12:38.138Z]                     codebase_config=dict(

[2022-02-17T04:12:38.138Z]                         type=Codebase.MMPOSE.value,

[2022-02-17T04:12:38.138Z]                         task=Task.POSE_DETECTION.value)))

[2022-02-17T04:12:38.138Z]         img = torch.rand((1, 3, 32, 32))

[2022-02-17T04:12:38.138Z]         img_metas = {

[2022-02-17T04:12:38.138Z]             'image_file':

[2022-02-17T04:12:38.138Z]             'tests/test_codebase/test_mmpose' + '/data/imgs/dataset/blank.jpg',

[2022-02-17T04:12:38.138Z]             'center': torch.tensor([0.5, 0.5]),

[2022-02-17T04:12:38.138Z]             'scale': 1.,

[2022-02-17T04:12:38.138Z]             'location': torch.tensor([0.5, 0.5]),

[2022-02-17T04:12:38.138Z]             'bbox_score': 0.5

[2022-02-17T04:12:38.138Z]         }

[2022-02-17T04:12:38.138Z]         model_outputs = model.forward(

[2022-02-17T04:12:38.138Z]             img, img_metas=[img_metas], return_loss=False, return_heatmap=True)

[2022-02-17T04:12:38.138Z]         model_outputs = model_outputs['output_heatmap']

[2022-02-17T04:12:38.138Z]         wrapped_model = WrapModel(model, 'forward', return_loss=False)

[2022-02-17T04:12:38.138Z]         rewrite_inputs = {'img': img}

[2022-02-17T04:12:38.138Z]         rewrite_outputs, is_backend_output = get_rewrite_outputs(

[2022-02-17T04:12:38.138Z]             wrapped_model=wrapped_model,

[2022-02-17T04:12:38.138Z]             model_inputs=rewrite_inputs,

[2022-02-17T04:12:38.138Z] >           deploy_cfg=deploy_cfg)

[2022-02-17T04:12:38.138Z] 

[2022-02-17T04:12:38.138Z] tests/test_codebase/test_mmpose/test_mmpose_models.py:278: 

[2022-02-17T04:12:38.138Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.138Z] mmdeploy/utils/test.py:519: in get_rewrite_outputs

[2022-02-17T04:12:38.138Z]     deploy_cfg)

[2022-02-17T04:12:38.138Z] mmdeploy/utils/test.py:411: in get_backend_outputs

[2022-02-17T04:12:38.138Z]     onnx_model=onnx_file_path)

[2022-02-17T04:12:38.138Z] mmdeploy/backend/tensorrt/onnx2tensorrt.py:72: in onnx2tensorrt

[2022-02-17T04:12:38.138Z]     device_id=device_id)

[2022-02-17T04:12:38.138Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.138Z] 

[2022-02-17T04:12:38.138Z] onnx_model = ir_version: 6

[2022-02-17T04:12:38.138Z] producer_name: "pytorch"

[2022-02-17T04:12:38.138Z] producer_version: "1.9"

[2022-02-17T04:12:38.138Z] graph {

[2022-02-17T04:12:38.138Z]   node {

[2022-02-17T04:12:38.138Z]     input: "img"

[2022-02-17T04:12:38.138Z]     input: "218"

[2022-02-17T04:12:38.138Z]     ...   }

[2022-02-17T04:12:38.138Z]           dim {

[2022-02-17T04:12:38.138Z]             dim_value: 8

[2022-02-17T04:12:38.138Z]           }

[2022-02-17T04:12:38.138Z]         }

[2022-02-17T04:12:38.138Z]       }

[2022-02-17T04:12:38.138Z]     }

[2022-02-17T04:12:38.138Z]   }

[2022-02-17T04:12:38.138Z] }

[2022-02-17T04:12:38.138Z] opset_import {

[2022-02-17T04:12:38.138Z]   version: 11

[2022-02-17T04:12:38.138Z] }

[2022-02-17T04:12:38.138Z] 

[2022-02-17T04:12:38.139Z] input_shapes = {'input': {'min_shape': [1, 3, 32, 32], 'opt_shape': [1, 3, 32, 32], 'max_shape': [1, 3, 32, 32]}}

[2022-02-17T04:12:38.139Z] log_level = <Severity.INFO: 3>, fp16_mode = False, int8_mode = False

[2022-02-17T04:12:38.139Z] int8_param = {}, max_workspace_size = 1073741824, device_id = 0, kwargs = {}

[2022-02-17T04:12:38.139Z] device = device(type='cuda', index=0)

[2022-02-17T04:12:38.139Z] logger = <tensorrt.tensorrt.Logger object at 0x7f47e4afe7b0>

[2022-02-17T04:12:38.139Z] builder = <tensorrt.tensorrt.Builder object at 0x7f47e4afe730>

[2022-02-17T04:12:38.139Z] EXPLICIT_BATCH = 1

[2022-02-17T04:12:38.139Z] network = <tensorrt.tensorrt.INetworkDefinition object at 0x7f47e4afe0b0>

[2022-02-17T04:12:38.139Z] parser = <tensorrt.tensorrt.OnnxParser object at 0x7f47e4afecb0>

[2022-02-17T04:12:38.139Z] 

[2022-02-17T04:12:38.139Z]     def create_trt_engine(onnx_model: Union[str, onnx.ModelProto],

[2022-02-17T04:12:38.139Z]                           input_shapes: Dict[str, Sequence[int]],

[2022-02-17T04:12:38.139Z]                           log_level: trt.Logger.Severity = trt.Logger.ERROR,

[2022-02-17T04:12:38.139Z]                           fp16_mode: bool = False,

[2022-02-17T04:12:38.139Z]                           int8_mode: bool = False,

[2022-02-17T04:12:38.139Z]                           int8_param: dict = None,

[2022-02-17T04:12:38.139Z]                           max_workspace_size: int = 0,

[2022-02-17T04:12:38.139Z]                           device_id: int = 0,

[2022-02-17T04:12:38.139Z]                           **kwargs) -> trt.ICudaEngine:

[2022-02-17T04:12:38.139Z]         """Create a tensorrt engine from ONNX.

[2022-02-17T04:12:38.139Z]     

[2022-02-17T04:12:38.139Z]         Args:

[2022-02-17T04:12:38.139Z]             onnx_model (str or onnx.ModelProto): Input onnx model to convert from.

[2022-02-17T04:12:38.139Z]             input_shapes (Dict[str, Sequence[int]]): The min/opt/max shape of

[2022-02-17T04:12:38.139Z]                 each input.

[2022-02-17T04:12:38.139Z]             log_level (trt.Logger.Severity): The log level of TensorRT. Defaults to

[2022-02-17T04:12:38.139Z]                 `trt.Logger.ERROR`.

[2022-02-17T04:12:38.139Z]             fp16_mode (bool): Specifying whether to enable fp16 mode.

[2022-02-17T04:12:38.139Z]                 Defaults to `False`.

[2022-02-17T04:12:38.139Z]             int8_mode (bool): Specifying whether to enable int8 mode.

[2022-02-17T04:12:38.139Z]                 Defaults to `False`.

[2022-02-17T04:12:38.139Z]             int8_param (dict): A dict of parameter  int8 mode. Defaults to `None`.

[2022-02-17T04:12:38.139Z]             max_workspace_size (int): To set max workspace size of TensorRT engine.

[2022-02-17T04:12:38.139Z]                 some tactics and layers need large workspace. Defaults to `0`.

[2022-02-17T04:12:38.139Z]             device_id (int): Choice the device to create engine. Defaults to `0`.

[2022-02-17T04:12:38.139Z]     

[2022-02-17T04:12:38.139Z]         Returns:

[2022-02-17T04:12:38.139Z]             tensorrt.ICudaEngine: The TensorRT engine created from onnx_model.

[2022-02-17T04:12:38.139Z]     

[2022-02-17T04:12:38.139Z]         Example:

[2022-02-17T04:12:38.139Z]             >>> from mmdeploy.apis.tensorrt import create_trt_engine

[2022-02-17T04:12:38.139Z]             >>> engine = create_trt_engine(

[2022-02-17T04:12:38.139Z]             >>>             "onnx_model.onnx",

[2022-02-17T04:12:38.139Z]             >>>             {'input': {"min_shape" : [1, 3, 160, 160],

[2022-02-17T04:12:38.139Z]             >>>                        "opt_shape" : [1, 3, 320, 320],

[2022-02-17T04:12:38.139Z]             >>>                        "max_shape" : [1, 3, 640, 640]}},

[2022-02-17T04:12:38.139Z]             >>>             log_level=trt.Logger.WARNING,

[2022-02-17T04:12:38.139Z]             >>>             fp16_mode=True,

[2022-02-17T04:12:38.139Z]             >>>             max_workspace_size=1 << 30,

[2022-02-17T04:12:38.139Z]             >>>             device_id=0)

[2022-02-17T04:12:38.139Z]             >>>             })

[2022-02-17T04:12:38.139Z]         """

[2022-02-17T04:12:38.139Z]         load_tensorrt_plugin()

[2022-02-17T04:12:38.139Z]         device = torch.device('cuda:{}'.format(device_id))

[2022-02-17T04:12:38.139Z]         # create builder and network

[2022-02-17T04:12:38.139Z]         logger = trt.Logger(log_level)

[2022-02-17T04:12:38.139Z]         builder = trt.Builder(logger)

[2022-02-17T04:12:38.140Z]         EXPLICIT_BATCH = 1 << (int)(

[2022-02-17T04:12:38.140Z]             trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

[2022-02-17T04:12:38.140Z]         network = builder.create_network(EXPLICIT_BATCH)

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         # parse onnx

[2022-02-17T04:12:38.140Z]         parser = trt.OnnxParser(network, logger)

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         if isinstance(onnx_model, str):

[2022-02-17T04:12:38.140Z]             onnx_model = onnx.load(onnx_model)

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         if not parser.parse(onnx_model.SerializeToString()):

[2022-02-17T04:12:38.140Z]             error_msgs = ''

[2022-02-17T04:12:38.140Z]             for error in range(parser.num_errors):

[2022-02-17T04:12:38.140Z]                 error_msgs += f'{parser.get_error(error)}\n'

[2022-02-17T04:12:38.140Z]             raise RuntimeError(f'Failed to parse onnx, {error_msgs}')

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         # config builder

[2022-02-17T04:12:38.140Z]         if version.parse(trt.__version__) < version.parse('8'):

[2022-02-17T04:12:38.140Z]             builder.max_workspace_size = max_workspace_size

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         config = builder.create_builder_config()

[2022-02-17T04:12:38.140Z]         config.max_workspace_size = max_workspace_size

[2022-02-17T04:12:38.140Z]         profile = builder.create_optimization_profile()

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         for input_name, param in input_shapes.items():

[2022-02-17T04:12:38.140Z]             min_shape = param['min_shape']

[2022-02-17T04:12:38.140Z]             opt_shape = param['opt_shape']

[2022-02-17T04:12:38.140Z]             max_shape = param['max_shape']

[2022-02-17T04:12:38.140Z]             profile.set_shape(input_name, min_shape, opt_shape, max_shape)

[2022-02-17T04:12:38.140Z]         config.add_optimization_profile(profile)

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         if fp16_mode:

[2022-02-17T04:12:38.140Z]             if version.parse(trt.__version__) < version.parse('8'):

[2022-02-17T04:12:38.140Z]                 builder.fp16_mode = fp16_mode

[2022-02-17T04:12:38.140Z]             config.set_flag(trt.BuilderFlag.FP16)

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         if int8_mode:

[2022-02-17T04:12:38.140Z]             config.set_flag(trt.BuilderFlag.INT8)

[2022-02-17T04:12:38.140Z]             assert int8_param is not None

[2022-02-17T04:12:38.140Z]             config.int8_calibrator = HDF5Calibrator(

[2022-02-17T04:12:38.140Z]                 int8_param['calib_file'],

[2022-02-17T04:12:38.140Z]                 input_shapes,

[2022-02-17T04:12:38.140Z]                 model_type=int8_param['model_type'],

[2022-02-17T04:12:38.140Z]                 device_id=device_id,

[2022-02-17T04:12:38.140Z]                 algorithm=int8_param.get(

[2022-02-17T04:12:38.140Z]                     'algorithm', trt.CalibrationAlgoType.ENTROPY_CALIBRATION_2))

[2022-02-17T04:12:38.140Z]             if version.parse(trt.__version__) < version.parse('8'):

[2022-02-17T04:12:38.140Z]                 builder.int8_mode = int8_mode

[2022-02-17T04:12:38.140Z]                 builder.int8_calibrator = config.int8_calibrator

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z]         # create engine

[2022-02-17T04:12:38.140Z]         with torch.cuda.device(device):

[2022-02-17T04:12:38.140Z]             engine = builder.build_engine(network, config)

[2022-02-17T04:12:38.140Z]     

[2022-02-17T04:12:38.140Z] >       assert engine is not None, 'Failed to create TensorRT engine'

[2022-02-17T04:12:38.140Z] E       AssertionError: Failed to create TensorRT engine

[2022-02-17T04:12:38.140Z] 

[2022-02-17T04:12:38.140Z] mmdeploy/backend/tensorrt/utils.py:116: AssertionError

[2022-02-17T04:12:38.140Z] ----------------------------- Captured stderr call -----------------------------

[2022-02-17T04:12:38.140Z] 2022-02-17 12:11:23,890 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so

[2022-02-17T04:12:38.140Z] [TensorRT] WARNING: The logger passed into createInferBuilder differs from one already provided for an existing builder, runtime, or refitter. TensorRT maintains only a single logger pointer at any given time, so the existing value, which can be retrieved with getLogger(), will be used instead. In order to use a new logger, first destroy all existing builder, runner or refitter objects.

[2022-02-17T04:12:38.140Z] 

[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3030, GPU 4653 (MiB)

[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 3147 MiB, GPU 4653 MiB

[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +10, now: CPU 3164, GPU 4663 (MiB)

[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3164, GPU 4671 (MiB)

[2022-02-17T04:12:38.141Z] [TensorRT] WARNING: TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.2.0

[2022-02-17T04:12:38.141Z] [TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead

[2022-02-17T04:12:38.141Z] [TensorRT] INFO: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.

[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3164, GPU 4653 (MiB)

[2022-02-17T04:12:38.141Z] [TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)

[2022-02-17T04:12:38.141Z] ------------------------------ Captured log call -------------------------------

[2022-02-17T04:12:38.141Z] INFO     mmdeploy:init_plugins.py:36 Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so

[2022-02-17T04:12:38.141Z] ______________________________ test_create_input _______________________________

[2022-02-17T04:12:38.141Z] 

[2022-02-17T04:12:38.141Z]     def test_create_input():

[2022-02-17T04:12:38.141Z]         model_cfg = load_config(model_cfg_path)[0]

[2022-02-17T04:12:38.141Z]         deploy_cfg = mmcv.Config(

[2022-02-17T04:12:38.141Z]             dict(

[2022-02-17T04:12:38.141Z]                 backend_config=dict(type=Backend.ONNXRUNTIME.value),

[2022-02-17T04:12:38.141Z]                 codebase_config=dict(

[2022-02-17T04:12:38.141Z]                     type=Codebase.MMPOSE.value, task=Task.POSE_DETECTION.value),

[2022-02-17T04:12:38.141Z]                 onnx_config=dict(

[2022-02-17T04:12:38.141Z]                     type='onnx',

[2022-02-17T04:12:38.141Z]                     export_params=True,

[2022-02-17T04:12:38.141Z]                     keep_initializers_as_inputs=False,

[2022-02-17T04:12:38.141Z]                     opset_version=11,

[2022-02-17T04:12:38.141Z]                     save_file='end2end.onnx',

[2022-02-17T04:12:38.141Z]                     input_names=['input'],

[2022-02-17T04:12:38.141Z]                     output_names=['output'],

[2022-02-17T04:12:38.141Z]                     input_shape=None)))

[2022-02-17T04:12:38.141Z]         task_processor = build_task_processor(model_cfg, deploy_cfg, 'cpu')

[2022-02-17T04:12:38.141Z] >       inputs = task_processor.create_input(img, input_shape=img_shape)

[2022-02-17T04:12:38.141Z] 

[2022-02-17T04:12:38.141Z] tests/test_codebase/test_mmpose/test_pose_detection.py:65: 

[2022-02-17T04:12:38.141Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.141Z] mmdeploy/codebase/mmpose/deploy/pose_detection.py:84: in create_input

[2022-02-17T04:12:38.141Z]     from mmpose.apis.inference import LoadImage, _box2cs

[2022-02-17T04:12:38.141Z] ../conda/lib/python3.7/site-packages/mmpose/apis/__init__.py:10: in <module>

[2022-02-17T04:12:38.141Z]     from .train import init_random_seed, train_model

[2022-02-17T04:12:38.141Z] ../conda/lib/python3.7/site-packages/mmpose/apis/train.py:12: in <module>

[2022-02-17T04:12:38.141Z]     from mmpose.core.distributed_wrapper import DistributedDataParallelWrapper

[2022-02-17T04:12:38.141Z] ../conda/lib/python3.7/site-packages/mmpose/core/distributed_wrapper.py:10: in <module>

[2022-02-17T04:12:38.141Z]     class DistributedDataParallelWrapper(nn.Module):

[2022-02-17T04:12:38.141Z] ../mmcv/mmcv/utils/registry.py:312: in _register

[2022-02-17T04:12:38.141Z]     module_class=cls, module_name=name, force=force)

[2022-02-17T04:12:38.141Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.141Z] 

[2022-02-17T04:12:38.141Z] self = Registry(name=module wrapper, items={'DataParallel': <class 'torch.nn.parallel.data_parallel.DataParallel'>, 'Distribu...arallel'>, 'DistributedDataParallelWrapper': <class 'mmedit.core.distributed_wrapper.DistributedDataParallelWrapper'>})

[2022-02-17T04:12:38.141Z] module_class = <class 'mmpose.core.distributed_wrapper.DistributedDataParallelWrapper'>

[2022-02-17T04:12:38.141Z] module_name = ['DistributedDataParallelWrapper'], force = False

[2022-02-17T04:12:38.141Z] 

[2022-02-17T04:12:38.141Z]     def _register_module(self, module_class, module_name=None, force=False):

[2022-02-17T04:12:38.141Z]         if not inspect.isclass(module_class):

[2022-02-17T04:12:38.141Z]             raise TypeError('module must be a class, '

[2022-02-17T04:12:38.141Z]                             f'but got {type(module_class)}')

[2022-02-17T04:12:38.141Z]     

[2022-02-17T04:12:38.141Z]         if module_name is None:

[2022-02-17T04:12:38.141Z]             module_name = module_class.__name__

[2022-02-17T04:12:38.141Z]         if isinstance(module_name, str):

[2022-02-17T04:12:38.141Z]             module_name = [module_name]

[2022-02-17T04:12:38.141Z]         for name in module_name:

[2022-02-17T04:12:38.141Z]             if not force and name in self._module_dict:

[2022-02-17T04:12:38.142Z] >               raise KeyError(f'{name} is already registered '

[2022-02-17T04:12:38.142Z]                                f'in {self.name}')

[2022-02-17T04:12:38.142Z] E               KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper'

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z] ../mmcv/mmcv/utils/registry.py:246: KeyError

[2022-02-17T04:12:38.142Z] ___________________________ test_init_pytorch_model ____________________________

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z]     def test_init_pytorch_model():

[2022-02-17T04:12:38.142Z]         from mmpose.models.detectors.base import BasePose

[2022-02-17T04:12:38.142Z] >       model = task_processor.init_pytorch_model(None)

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z] tests/test_codebase/test_mmpose/test_pose_detection.py:71: 

[2022-02-17T04:12:38.142Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.142Z] mmdeploy/codebase/mmpose/deploy/pose_detection.py:63: in init_pytorch_model

[2022-02-17T04:12:38.142Z]     from mmpose.apis import init_pose_model

[2022-02-17T04:12:38.142Z] ../conda/lib/python3.7/site-packages/mmpose/apis/__init__.py:10: in <module>

[2022-02-17T04:12:38.142Z]     from .train import init_random_seed, train_model

[2022-02-17T04:12:38.142Z] ../conda/lib/python3.7/site-packages/mmpose/apis/train.py:12: in <module>

[2022-02-17T04:12:38.142Z]     from mmpose.core.distributed_wrapper import DistributedDataParallelWrapper

[2022-02-17T04:12:38.142Z] ../conda/lib/python3.7/site-packages/mmpose/core/distributed_wrapper.py:10: in <module>

[2022-02-17T04:12:38.142Z]     class DistributedDataParallelWrapper(nn.Module):

[2022-02-17T04:12:38.142Z] ../mmcv/mmcv/utils/registry.py:312: in _register

[2022-02-17T04:12:38.142Z]     module_class=cls, module_name=name, force=force)

[2022-02-17T04:12:38.142Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z] self = Registry(name=module wrapper, items={'DataParallel': <class 'torch.nn.parallel.data_parallel.DataParallel'>, 'Distribu...arallel'>, 'DistributedDataParallelWrapper': <class 'mmedit.core.distributed_wrapper.DistributedDataParallelWrapper'>})

[2022-02-17T04:12:38.142Z] module_class = <class 'mmpose.core.distributed_wrapper.DistributedDataParallelWrapper'>

[2022-02-17T04:12:38.142Z] module_name = ['DistributedDataParallelWrapper'], force = False

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z]     def _register_module(self, module_class, module_name=None, force=False):

[2022-02-17T04:12:38.142Z]         if not inspect.isclass(module_class):

[2022-02-17T04:12:38.142Z]             raise TypeError('module must be a class, '

[2022-02-17T04:12:38.142Z]                             f'but got {type(module_class)}')

[2022-02-17T04:12:38.142Z]     

[2022-02-17T04:12:38.142Z]         if module_name is None:

[2022-02-17T04:12:38.142Z]             module_name = module_class.__name__

[2022-02-17T04:12:38.142Z]         if isinstance(module_name, str):

[2022-02-17T04:12:38.142Z]             module_name = [module_name]

[2022-02-17T04:12:38.142Z]         for name in module_name:

[2022-02-17T04:12:38.142Z]             if not force and name in self._module_dict:

[2022-02-17T04:12:38.142Z] >               raise KeyError(f'{name} is already registered '

[2022-02-17T04:12:38.142Z]                                f'in {self.name}')

[2022-02-17T04:12:38.142Z] E               KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper'

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z] ../mmcv/mmcv/utils/registry.py:246: KeyError

[2022-02-17T04:12:38.142Z] ______________________ test_single_gpu_test_and_evaluate _______________________

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z]     def test_single_gpu_test_and_evaluate():

[2022-02-17T04:12:38.142Z]         from mmcv.parallel import MMDataParallel

[2022-02-17T04:12:38.142Z]         dataset = task_processor.build_dataset(

[2022-02-17T04:12:38.142Z]             dataset_cfg=model_cfg, dataset_type='test')

[2022-02-17T04:12:38.142Z]         dataloader = task_processor.build_dataloader(dataset, 1, 1)

[2022-02-17T04:12:38.142Z]     

[2022-02-17T04:12:38.142Z]         # Prepare dummy model

[2022-02-17T04:12:38.142Z]         model = DummyModel(outputs=[torch.rand([1, 1000])])

[2022-02-17T04:12:38.142Z]         model = MMDataParallel(model, device_ids=[0])

[2022-02-17T04:12:38.142Z]         assert model is not None

[2022-02-17T04:12:38.142Z]         # Run test

[2022-02-17T04:12:38.142Z] >       outputs = task_processor.single_gpu_test(model, dataloader)

[2022-02-17T04:12:38.142Z] 

[2022-02-17T04:12:38.142Z] tests/test_codebase/test_mmpose/test_pose_detection.py:146: 

[2022-02-17T04:12:38.142Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.143Z] mmdeploy/codebase/base/task.py:138: in single_gpu_test

[2022-02-17T04:12:38.143Z]     out_dir, **kwargs)

[2022-02-17T04:12:38.143Z] mmdeploy/codebase/mmpose/deploy/mmpose.py:131: in single_gpu_test

[2022-02-17T04:12:38.143Z]     from mmpose.apis import single_gpu_test

[2022-02-17T04:12:38.143Z] ../conda/lib/python3.7/site-packages/mmpose/apis/__init__.py:10: in <module>

[2022-02-17T04:12:38.143Z]     from .train import init_random_seed, train_model

[2022-02-17T04:12:38.143Z] ../conda/lib/python3.7/site-packages/mmpose/apis/train.py:12: in <module>

[2022-02-17T04:12:38.143Z]     from mmpose.core.distributed_wrapper import DistributedDataParallelWrapper

[2022-02-17T04:12:38.143Z] ../conda/lib/python3.7/site-packages/mmpose/core/distributed_wrapper.py:10: in <module>

[2022-02-17T04:12:38.143Z]     class DistributedDataParallelWrapper(nn.Module):

[2022-02-17T04:12:38.143Z] ../mmcv/mmcv/utils/registry.py:312: in _register

[2022-02-17T04:12:38.143Z]     module_class=cls, module_name=name, force=force)

[2022-02-17T04:12:38.143Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2022-02-17T04:12:38.143Z] 

[2022-02-17T04:12:38.143Z] self = Registry(name=module wrapper, items={'DataParallel': <class 'torch.nn.parallel.data_parallel.DataParallel'>, 'Distribu...arallel'>, 'DistributedDataParallelWrapper': <class 'mmedit.core.distributed_wrapper.DistributedDataParallelWrapper'>})

[2022-02-17T04:12:38.143Z] module_class = <class 'mmpose.core.distributed_wrapper.DistributedDataParallelWrapper'>

[2022-02-17T04:12:38.143Z] module_name = ['DistributedDataParallelWrapper'], force = False

[2022-02-17T04:12:38.143Z] 

[2022-02-17T04:12:38.143Z]     def _register_module(self, module_class, module_name=None, force=False):

[2022-02-17T04:12:38.143Z]         if not inspect.isclass(module_class):

[2022-02-17T04:12:38.143Z]             raise TypeError('module must be a class, '

[2022-02-17T04:12:38.143Z]                             f'but got {type(module_class)}')

[2022-02-17T04:12:38.143Z]     

[2022-02-17T04:12:38.143Z]         if module_name is None:

[2022-02-17T04:12:38.143Z]             module_name = module_class.__name__

[2022-02-17T04:12:38.143Z]         if isinstance(module_name, str):

[2022-02-17T04:12:38.143Z]             module_name = [module_name]

[2022-02-17T04:12:38.143Z]         for name in module_name:

[2022-02-17T04:12:38.143Z]             if not force and name in self._module_dict:

[2022-02-17T04:12:38.143Z] >               raise KeyError(f'{name} is already registered '

[2022-02-17T04:12:38.143Z]                                f'in {self.name}')

[2022-02-17T04:12:38.143Z] E               KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper'

[2022-02-17T04:12:38.143Z] 

[2022-02-17T04:12:38.143Z] ../mmcv/mmcv/utils/registry.py:246: KeyError

[2022-02-17T04:12:38.143Z] ----------------------------- Captured stdout call -----------------------------

[2022-02-17T04:12:38.143Z] loading annotations into memory...

[2022-02-17T04:12:38.143Z] Done (t=0.00s)

[2022-02-17T04:12:38.143Z] creating index...

[2022-02-17T04:12:38.143Z] index created!

[2022-02-17T04:12:38.143Z] => num_images: 1

[2022-02-17T04:12:38.143Z] => load 0 samples

[2022-02-17T04:12:38.143Z] ----------------------------- Captured stderr call -----------------------------

[2022-02-17T04:12:38.143Z] 2022-02-17 12:11:34,478 - mmdeploy - INFO - Sorting the dataset by 'height' and 'width' is not possible.
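For context, the `KeyError` at the bottom of the tracebacks comes from two codebases registering a class named `DistributedDataParallelWrapper` into the same mmcv registry within one process: the `Registry(... items={...})` dump above shows mmedit's wrapper already registered when mmpose's import tries to register its own. A minimal, self-contained sketch of that mechanism (the `Registry` here is a simplified stand-in mimicking mmcv's semantics, not mmcv's actual class):

```python
import inspect


class Registry:
    """Simplified stand-in for mmcv.utils.Registry."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, cls, name=None, force=False):
        if not inspect.isclass(cls):
            raise TypeError(f'module must be a class, but got {type(cls)}')
        name = name or cls.__name__
        if not force and name in self._module_dict:
            # This is the exact failure mode shown in the log above.
            raise KeyError(f'{name} is already registered in {self.name}')
        self._module_dict[name] = cls


MODULE_WRAPPERS = Registry('module wrapper')


# One codebase (mmedit, via an earlier import) registers its wrapper first...
class DistributedDataParallelWrapper:  # stand-in for mmedit's class
    pass


MODULE_WRAPPERS.register_module(DistributedDataParallelWrapper)


# ...then importing the other codebase (mmpose) tries to register a
# different class under the same name, with force left at False.
class MMPoseDDPWrapper:  # stand-in for mmpose's class
    pass


try:
    MODULE_WRAPPERS.register_module(
        MMPoseDDPWrapper, name='DistributedDataParallelWrapper')
except KeyError as e:
    print(e)
```

Running the sketch prints the same message the tests die with, which is why every test that imports `mmpose.apis` after mmedit has been imported fails the same way.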

Reproduction

  1. What command or script did you run?

     A placeholder for the command.

  2. Did you make any modifications to the code or config? Do you understand what you modified?

Environment

  1. Please run python tools/check_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may help locate the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback

If applicable, paste the error traceback here.

A placeholder for the traceback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
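One possible mitigation, pending a proper fix, is to keep mmedit's and mmpose's imports out of the same interpreter by running each codebase's test directory in its own process, so the shared registry is never populated twice. A sketch using only the standard library; the test paths are taken from this repository's layout and may differ in other setups:

```python
import subprocess
import sys

# Each codebase's tests get a fresh interpreter, so mmedit's module-wrapper
# registration never collides with mmpose's in the same process.
TEST_DIRS = [
    'tests/test_codebase/test_mmedit',
    'tests/test_codebase/test_mmpose',
]


def run_isolated(test_dir: str) -> int:
    """Run one test directory in a fresh Python process and return its exit code."""
    result = subprocess.run(
        [sys.executable, '-m', 'pytest', test_dir],
        capture_output=True, text=True)
    return result.returncode


if __name__ == '__main__':
    for d in TEST_DIRS:
        print(d, '->', run_isolated(d))
```

Alternatively, if the conflict is confirmed to be intentional shadowing, registering mmpose's wrapper with `force=True` (which mmcv's `register_module` accepts) would silence the error, but that changes which class wins and should be weighed by the maintainers.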
