KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper' returned during unit testing #165
Closed
Description
Thanks for your bug report. We appreciate it a lot.
Checklist
Describe the bug
[2022-02-17T04:12:38.137Z] INFO mmdeploy:init_plugins.py:36 Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
[2022-02-17T04:12:38.137Z] ___________________ test_top_down_forward[Backend.TENSORRT] ____________________
[2022-02-17T04:12:38.137Z]
[2022-02-17T04:12:38.137Z] backend_type = <Backend.TENSORRT: 'tensorrt'>
[2022-02-17T04:12:38.137Z]
[2022-02-17T04:12:38.137Z] @pytest.mark.parametrize('backend_type',
[2022-02-17T04:12:38.137Z] [Backend.ONNXRUNTIME, Backend.TENSORRT])
[2022-02-17T04:12:38.137Z] def test_top_down_forward(backend_type: Backend):
[2022-02-17T04:12:38.137Z] check_backend(backend_type, True)
[2022-02-17T04:12:38.137Z] model = get_top_down_model()
[2022-02-17T04:12:38.137Z] model.cpu().eval()
[2022-02-17T04:12:38.137Z] if backend_type == Backend.TENSORRT:
[2022-02-17T04:12:38.137Z] deploy_cfg = mmcv.Config(
[2022-02-17T04:12:38.137Z] dict(
[2022-02-17T04:12:38.137Z] backend_config=dict(
[2022-02-17T04:12:38.137Z] type=backend_type.value,
[2022-02-17T04:12:38.137Z] common_config=dict(max_workspace_size=1 << 30),
[2022-02-17T04:12:38.137Z] model_inputs=[
[2022-02-17T04:12:38.137Z] dict(
[2022-02-17T04:12:38.137Z] input_shapes=dict(
[2022-02-17T04:12:38.138Z] input=dict(
[2022-02-17T04:12:38.138Z] min_shape=[1, 3, 32, 32],
[2022-02-17T04:12:38.138Z] opt_shape=[1, 3, 32, 32],
[2022-02-17T04:12:38.138Z] max_shape=[1, 3, 32, 32])))
[2022-02-17T04:12:38.138Z] ]),
[2022-02-17T04:12:38.138Z] onnx_config=dict(
[2022-02-17T04:12:38.138Z] input_shape=[32, 32], output_names=['output']),
[2022-02-17T04:12:38.138Z] codebase_config=dict(
[2022-02-17T04:12:38.138Z] type=Codebase.MMPOSE.value,
[2022-02-17T04:12:38.138Z] task=Task.POSE_DETECTION.value)))
[2022-02-17T04:12:38.138Z] else:
[2022-02-17T04:12:38.138Z] deploy_cfg = mmcv.Config(
[2022-02-17T04:12:38.138Z] dict(
[2022-02-17T04:12:38.138Z] backend_config=dict(type=backend_type.value),
[2022-02-17T04:12:38.138Z] onnx_config=dict(input_shape=None, output_names=['output']),
[2022-02-17T04:12:38.138Z] codebase_config=dict(
[2022-02-17T04:12:38.138Z] type=Codebase.MMPOSE.value,
[2022-02-17T04:12:38.138Z] task=Task.POSE_DETECTION.value)))
[2022-02-17T04:12:38.138Z] img = torch.rand((1, 3, 32, 32))
[2022-02-17T04:12:38.138Z] img_metas = {
[2022-02-17T04:12:38.138Z] 'image_file':
[2022-02-17T04:12:38.138Z] 'tests/test_codebase/test_mmpose' + '/data/imgs/dataset/blank.jpg',
[2022-02-17T04:12:38.138Z] 'center': torch.tensor([0.5, 0.5]),
[2022-02-17T04:12:38.138Z] 'scale': 1.,
[2022-02-17T04:12:38.138Z] 'location': torch.tensor([0.5, 0.5]),
[2022-02-17T04:12:38.138Z] 'bbox_score': 0.5
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] model_outputs = model.forward(
[2022-02-17T04:12:38.138Z] img, img_metas=[img_metas], return_loss=False, return_heatmap=True)
[2022-02-17T04:12:38.138Z] model_outputs = model_outputs['output_heatmap']
[2022-02-17T04:12:38.138Z] wrapped_model = WrapModel(model, 'forward', return_loss=False)
[2022-02-17T04:12:38.138Z] rewrite_inputs = {'img': img}
[2022-02-17T04:12:38.138Z] rewrite_outputs, is_backend_output = get_rewrite_outputs(
[2022-02-17T04:12:38.138Z] wrapped_model=wrapped_model,
[2022-02-17T04:12:38.138Z] model_inputs=rewrite_inputs,
[2022-02-17T04:12:38.138Z] > deploy_cfg=deploy_cfg)
[2022-02-17T04:12:38.138Z]
[2022-02-17T04:12:38.138Z] tests/test_codebase/test_mmpose/test_mmpose_models.py:278:
[2022-02-17T04:12:38.138Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.138Z] mmdeploy/utils/test.py:519: in get_rewrite_outputs
[2022-02-17T04:12:38.138Z] deploy_cfg)
[2022-02-17T04:12:38.138Z] mmdeploy/utils/test.py:411: in get_backend_outputs
[2022-02-17T04:12:38.138Z] onnx_model=onnx_file_path)
[2022-02-17T04:12:38.138Z] mmdeploy/backend/tensorrt/onnx2tensorrt.py:72: in onnx2tensorrt
[2022-02-17T04:12:38.138Z] device_id=device_id)
[2022-02-17T04:12:38.138Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.138Z]
[2022-02-17T04:12:38.138Z] onnx_model = ir_version: 6
[2022-02-17T04:12:38.138Z] producer_name: "pytorch"
[2022-02-17T04:12:38.138Z] producer_version: "1.9"
[2022-02-17T04:12:38.138Z] graph {
[2022-02-17T04:12:38.138Z] node {
[2022-02-17T04:12:38.138Z] input: "img"
[2022-02-17T04:12:38.138Z] input: "218"
[2022-02-17T04:12:38.138Z] ... }
[2022-02-17T04:12:38.138Z] dim {
[2022-02-17T04:12:38.138Z] dim_value: 8
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z] opset_import {
[2022-02-17T04:12:38.138Z] version: 11
[2022-02-17T04:12:38.138Z] }
[2022-02-17T04:12:38.138Z]
[2022-02-17T04:12:38.139Z] input_shapes = {'input': {'min_shape': [1, 3, 32, 32], 'opt_shape': [1, 3, 32, 32], 'max_shape': [1, 3, 32, 32]}}
[2022-02-17T04:12:38.139Z] log_level = <Severity.INFO: 3>, fp16_mode = False, int8_mode = False
[2022-02-17T04:12:38.139Z] int8_param = {}, max_workspace_size = 1073741824, device_id = 0, kwargs = {}
[2022-02-17T04:12:38.139Z] device = device(type='cuda', index=0)
[2022-02-17T04:12:38.139Z] logger = <tensorrt.tensorrt.Logger object at 0x7f47e4afe7b0>
[2022-02-17T04:12:38.139Z] builder = <tensorrt.tensorrt.Builder object at 0x7f47e4afe730>
[2022-02-17T04:12:38.139Z] EXPLICIT_BATCH = 1
[2022-02-17T04:12:38.139Z] network = <tensorrt.tensorrt.INetworkDefinition object at 0x7f47e4afe0b0>
[2022-02-17T04:12:38.139Z] parser = <tensorrt.tensorrt.OnnxParser object at 0x7f47e4afecb0>
[2022-02-17T04:12:38.139Z]
[2022-02-17T04:12:38.139Z] def create_trt_engine(onnx_model: Union[str, onnx.ModelProto],
[2022-02-17T04:12:38.139Z] input_shapes: Dict[str, Sequence[int]],
[2022-02-17T04:12:38.139Z] log_level: trt.Logger.Severity = trt.Logger.ERROR,
[2022-02-17T04:12:38.139Z] fp16_mode: bool = False,
[2022-02-17T04:12:38.139Z] int8_mode: bool = False,
[2022-02-17T04:12:38.139Z] int8_param: dict = None,
[2022-02-17T04:12:38.139Z] max_workspace_size: int = 0,
[2022-02-17T04:12:38.139Z] device_id: int = 0,
[2022-02-17T04:12:38.139Z] **kwargs) -> trt.ICudaEngine:
[2022-02-17T04:12:38.139Z] """Create a tensorrt engine from ONNX.
[2022-02-17T04:12:38.139Z]
[2022-02-17T04:12:38.139Z] Args:
[2022-02-17T04:12:38.139Z] onnx_model (str or onnx.ModelProto): Input onnx model to convert from.
[2022-02-17T04:12:38.139Z] input_shapes (Dict[str, Sequence[int]]): The min/opt/max shape of
[2022-02-17T04:12:38.139Z] each input.
[2022-02-17T04:12:38.139Z] log_level (trt.Logger.Severity): The log level of TensorRT. Defaults to
[2022-02-17T04:12:38.139Z] `trt.Logger.ERROR`.
[2022-02-17T04:12:38.139Z] fp16_mode (bool): Specifying whether to enable fp16 mode.
[2022-02-17T04:12:38.139Z] Defaults to `False`.
[2022-02-17T04:12:38.139Z] int8_mode (bool): Specifying whether to enable int8 mode.
[2022-02-17T04:12:38.139Z] Defaults to `False`.
[2022-02-17T04:12:38.139Z] int8_param (dict): A dict of parameter int8 mode. Defaults to `None`.
[2022-02-17T04:12:38.139Z] max_workspace_size (int): To set max workspace size of TensorRT engine.
[2022-02-17T04:12:38.139Z] some tactics and layers need large workspace. Defaults to `0`.
[2022-02-17T04:12:38.139Z] device_id (int): Choice the device to create engine. Defaults to `0`.
[2022-02-17T04:12:38.139Z]
[2022-02-17T04:12:38.139Z] Returns:
[2022-02-17T04:12:38.139Z] tensorrt.ICudaEngine: The TensorRT engine created from onnx_model.
[2022-02-17T04:12:38.139Z]
[2022-02-17T04:12:38.139Z] Example:
[2022-02-17T04:12:38.139Z] >>> from mmdeploy.apis.tensorrt import create_trt_engine
[2022-02-17T04:12:38.139Z] >>> engine = create_trt_engine(
[2022-02-17T04:12:38.139Z] >>> "onnx_model.onnx",
[2022-02-17T04:12:38.139Z] >>> {'input': {"min_shape" : [1, 3, 160, 160],
[2022-02-17T04:12:38.139Z] >>> "opt_shape" : [1, 3, 320, 320],
[2022-02-17T04:12:38.139Z] >>> "max_shape" : [1, 3, 640, 640]}},
[2022-02-17T04:12:38.139Z] >>> log_level=trt.Logger.WARNING,
[2022-02-17T04:12:38.139Z] >>> fp16_mode=True,
[2022-02-17T04:12:38.139Z] >>> max_workspace_size=1 << 30,
[2022-02-17T04:12:38.139Z] >>> device_id=0)
[2022-02-17T04:12:38.139Z] >>> })
[2022-02-17T04:12:38.139Z] """
[2022-02-17T04:12:38.139Z] load_tensorrt_plugin()
[2022-02-17T04:12:38.139Z] device = torch.device('cuda:{}'.format(device_id))
[2022-02-17T04:12:38.139Z] # create builder and network
[2022-02-17T04:12:38.139Z] logger = trt.Logger(log_level)
[2022-02-17T04:12:38.139Z] builder = trt.Builder(logger)
[2022-02-17T04:12:38.140Z] EXPLICIT_BATCH = 1 << (int)(
[2022-02-17T04:12:38.140Z] trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
[2022-02-17T04:12:38.140Z] network = builder.create_network(EXPLICIT_BATCH)
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] # parse onnx
[2022-02-17T04:12:38.140Z] parser = trt.OnnxParser(network, logger)
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] if isinstance(onnx_model, str):
[2022-02-17T04:12:38.140Z] onnx_model = onnx.load(onnx_model)
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] if not parser.parse(onnx_model.SerializeToString()):
[2022-02-17T04:12:38.140Z] error_msgs = ''
[2022-02-17T04:12:38.140Z] for error in range(parser.num_errors):
[2022-02-17T04:12:38.140Z] error_msgs += f'{parser.get_error(error)}\n'
[2022-02-17T04:12:38.140Z] raise RuntimeError(f'Failed to parse onnx, {error_msgs}')
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] # config builder
[2022-02-17T04:12:38.140Z] if version.parse(trt.__version__) < version.parse('8'):
[2022-02-17T04:12:38.140Z] builder.max_workspace_size = max_workspace_size
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] config = builder.create_builder_config()
[2022-02-17T04:12:38.140Z] config.max_workspace_size = max_workspace_size
[2022-02-17T04:12:38.140Z] profile = builder.create_optimization_profile()
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] for input_name, param in input_shapes.items():
[2022-02-17T04:12:38.140Z] min_shape = param['min_shape']
[2022-02-17T04:12:38.140Z] opt_shape = param['opt_shape']
[2022-02-17T04:12:38.140Z] max_shape = param['max_shape']
[2022-02-17T04:12:38.140Z] profile.set_shape(input_name, min_shape, opt_shape, max_shape)
[2022-02-17T04:12:38.140Z] config.add_optimization_profile(profile)
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] if fp16_mode:
[2022-02-17T04:12:38.140Z] if version.parse(trt.__version__) < version.parse('8'):
[2022-02-17T04:12:38.140Z] builder.fp16_mode = fp16_mode
[2022-02-17T04:12:38.140Z] config.set_flag(trt.BuilderFlag.FP16)
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] if int8_mode:
[2022-02-17T04:12:38.140Z] config.set_flag(trt.BuilderFlag.INT8)
[2022-02-17T04:12:38.140Z] assert int8_param is not None
[2022-02-17T04:12:38.140Z] config.int8_calibrator = HDF5Calibrator(
[2022-02-17T04:12:38.140Z] int8_param['calib_file'],
[2022-02-17T04:12:38.140Z] input_shapes,
[2022-02-17T04:12:38.140Z] model_type=int8_param['model_type'],
[2022-02-17T04:12:38.140Z] device_id=device_id,
[2022-02-17T04:12:38.140Z] algorithm=int8_param.get(
[2022-02-17T04:12:38.140Z] 'algorithm', trt.CalibrationAlgoType.ENTROPY_CALIBRATION_2))
[2022-02-17T04:12:38.140Z] if version.parse(trt.__version__) < version.parse('8'):
[2022-02-17T04:12:38.140Z] builder.int8_mode = int8_mode
[2022-02-17T04:12:38.140Z] builder.int8_calibrator = config.int8_calibrator
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] # create engine
[2022-02-17T04:12:38.140Z] with torch.cuda.device(device):
[2022-02-17T04:12:38.140Z] engine = builder.build_engine(network, config)
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] > assert engine is not None, 'Failed to create TensorRT engine'
[2022-02-17T04:12:38.140Z] E AssertionError: Failed to create TensorRT engine
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.140Z] mmdeploy/backend/tensorrt/utils.py:116: AssertionError
[2022-02-17T04:12:38.140Z] ----------------------------- Captured stderr call -----------------------------
[2022-02-17T04:12:38.140Z] 2022-02-17 12:11:23,890 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
[2022-02-17T04:12:38.140Z] [TensorRT] WARNING: The logger passed into createInferBuilder differs from one already provided for an existing builder, runtime, or refitter. TensorRT maintains only a single logger pointer at any given time, so the existing value, which can be retrieved with getLogger(), will be used instead. In order to use a new logger, first destroy all existing builder, runner or refitter objects.
[2022-02-17T04:12:38.140Z]
[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3030, GPU 4653 (MiB)
[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 3147 MiB, GPU 4653 MiB
[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +10, now: CPU 3164, GPU 4663 (MiB)
[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3164, GPU 4671 (MiB)
[2022-02-17T04:12:38.141Z] [TensorRT] WARNING: TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.2.0
[2022-02-17T04:12:38.141Z] [TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[2022-02-17T04:12:38.141Z] [TensorRT] INFO: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[2022-02-17T04:12:38.141Z] [TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3164, GPU 4653 (MiB)
[2022-02-17T04:12:38.141Z] [TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)
[2022-02-17T04:12:38.141Z] ------------------------------ Captured log call -------------------------------
[2022-02-17T04:12:38.141Z] INFO mmdeploy:init_plugins.py:36 Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
[2022-02-17T04:12:38.141Z] ______________________________ test_create_input _______________________________
[2022-02-17T04:12:38.141Z]
[2022-02-17T04:12:38.141Z] def test_create_input():
[2022-02-17T04:12:38.141Z] model_cfg = load_config(model_cfg_path)[0]
[2022-02-17T04:12:38.141Z] deploy_cfg = mmcv.Config(
[2022-02-17T04:12:38.141Z] dict(
[2022-02-17T04:12:38.141Z] backend_config=dict(type=Backend.ONNXRUNTIME.value),
[2022-02-17T04:12:38.141Z] codebase_config=dict(
[2022-02-17T04:12:38.141Z] type=Codebase.MMPOSE.value, task=Task.POSE_DETECTION.value),
[2022-02-17T04:12:38.141Z] onnx_config=dict(
[2022-02-17T04:12:38.141Z] type='onnx',
[2022-02-17T04:12:38.141Z] export_params=True,
[2022-02-17T04:12:38.141Z] keep_initializers_as_inputs=False,
[2022-02-17T04:12:38.141Z] opset_version=11,
[2022-02-17T04:12:38.141Z] save_file='end2end.onnx',
[2022-02-17T04:12:38.141Z] input_names=['input'],
[2022-02-17T04:12:38.141Z] output_names=['output'],
[2022-02-17T04:12:38.141Z] input_shape=None)))
[2022-02-17T04:12:38.141Z] task_processor = build_task_processor(model_cfg, deploy_cfg, 'cpu')
[2022-02-17T04:12:38.141Z] > inputs = task_processor.create_input(img, input_shape=img_shape)
[2022-02-17T04:12:38.141Z]
[2022-02-17T04:12:38.141Z] tests/test_codebase/test_mmpose/test_pose_detection.py:65:
[2022-02-17T04:12:38.141Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.141Z] mmdeploy/codebase/mmpose/deploy/pose_detection.py:84: in create_input
[2022-02-17T04:12:38.141Z] from mmpose.apis.inference import LoadImage, _box2cs
[2022-02-17T04:12:38.141Z] ../conda/lib/python3.7/site-packages/mmpose/apis/__init__.py:10: in <module>
[2022-02-17T04:12:38.141Z] from .train import init_random_seed, train_model
[2022-02-17T04:12:38.141Z] ../conda/lib/python3.7/site-packages/mmpose/apis/train.py:12: in <module>
[2022-02-17T04:12:38.141Z] from mmpose.core.distributed_wrapper import DistributedDataParallelWrapper
[2022-02-17T04:12:38.141Z] ../conda/lib/python3.7/site-packages/mmpose/core/distributed_wrapper.py:10: in <module>
[2022-02-17T04:12:38.141Z] class DistributedDataParallelWrapper(nn.Module):
[2022-02-17T04:12:38.141Z] ../mmcv/mmcv/utils/registry.py:312: in _register
[2022-02-17T04:12:38.141Z] module_class=cls, module_name=name, force=force)
[2022-02-17T04:12:38.141Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.141Z]
[2022-02-17T04:12:38.141Z] self = Registry(name=module wrapper, items={'DataParallel': <class 'torch.nn.parallel.data_parallel.DataParallel'>, 'Distribu...arallel'>, 'DistributedDataParallelWrapper': <class 'mmedit.core.distributed_wrapper.DistributedDataParallelWrapper'>})
[2022-02-17T04:12:38.141Z] module_class = <class 'mmpose.core.distributed_wrapper.DistributedDataParallelWrapper'>
[2022-02-17T04:12:38.141Z] module_name = ['DistributedDataParallelWrapper'], force = False
[2022-02-17T04:12:38.141Z]
[2022-02-17T04:12:38.141Z] def _register_module(self, module_class, module_name=None, force=False):
[2022-02-17T04:12:38.141Z] if not inspect.isclass(module_class):
[2022-02-17T04:12:38.141Z] raise TypeError('module must be a class, '
[2022-02-17T04:12:38.141Z] f'but got {type(module_class)}')
[2022-02-17T04:12:38.141Z]
[2022-02-17T04:12:38.141Z] if module_name is None:
[2022-02-17T04:12:38.141Z] module_name = module_class.__name__
[2022-02-17T04:12:38.141Z] if isinstance(module_name, str):
[2022-02-17T04:12:38.141Z] module_name = [module_name]
[2022-02-17T04:12:38.141Z] for name in module_name:
[2022-02-17T04:12:38.141Z] if not force and name in self._module_dict:
[2022-02-17T04:12:38.142Z] > raise KeyError(f'{name} is already registered '
[2022-02-17T04:12:38.142Z] f'in {self.name}')
[2022-02-17T04:12:38.142Z] E KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper'
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] ../mmcv/mmcv/utils/registry.py:246: KeyError
[2022-02-17T04:12:38.142Z] ___________________________ test_init_pytorch_model ____________________________
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] def test_init_pytorch_model():
[2022-02-17T04:12:38.142Z] from mmpose.models.detectors.base import BasePose
[2022-02-17T04:12:38.142Z] > model = task_processor.init_pytorch_model(None)
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] tests/test_codebase/test_mmpose/test_pose_detection.py:71:
[2022-02-17T04:12:38.142Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.142Z] mmdeploy/codebase/mmpose/deploy/pose_detection.py:63: in init_pytorch_model
[2022-02-17T04:12:38.142Z] from mmpose.apis import init_pose_model
[2022-02-17T04:12:38.142Z] ../conda/lib/python3.7/site-packages/mmpose/apis/__init__.py:10: in <module>
[2022-02-17T04:12:38.142Z] from .train import init_random_seed, train_model
[2022-02-17T04:12:38.142Z] ../conda/lib/python3.7/site-packages/mmpose/apis/train.py:12: in <module>
[2022-02-17T04:12:38.142Z] from mmpose.core.distributed_wrapper import DistributedDataParallelWrapper
[2022-02-17T04:12:38.142Z] ../conda/lib/python3.7/site-packages/mmpose/core/distributed_wrapper.py:10: in <module>
[2022-02-17T04:12:38.142Z] class DistributedDataParallelWrapper(nn.Module):
[2022-02-17T04:12:38.142Z] ../mmcv/mmcv/utils/registry.py:312: in _register
[2022-02-17T04:12:38.142Z] module_class=cls, module_name=name, force=force)
[2022-02-17T04:12:38.142Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] self = Registry(name=module wrapper, items={'DataParallel': <class 'torch.nn.parallel.data_parallel.DataParallel'>, 'Distribu...arallel'>, 'DistributedDataParallelWrapper': <class 'mmedit.core.distributed_wrapper.DistributedDataParallelWrapper'>})
[2022-02-17T04:12:38.142Z] module_class = <class 'mmpose.core.distributed_wrapper.DistributedDataParallelWrapper'>
[2022-02-17T04:12:38.142Z] module_name = ['DistributedDataParallelWrapper'], force = False
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] def _register_module(self, module_class, module_name=None, force=False):
[2022-02-17T04:12:38.142Z] if not inspect.isclass(module_class):
[2022-02-17T04:12:38.142Z] raise TypeError('module must be a class, '
[2022-02-17T04:12:38.142Z] f'but got {type(module_class)}')
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] if module_name is None:
[2022-02-17T04:12:38.142Z] module_name = module_class.__name__
[2022-02-17T04:12:38.142Z] if isinstance(module_name, str):
[2022-02-17T04:12:38.142Z] module_name = [module_name]
[2022-02-17T04:12:38.142Z] for name in module_name:
[2022-02-17T04:12:38.142Z] if not force and name in self._module_dict:
[2022-02-17T04:12:38.142Z] > raise KeyError(f'{name} is already registered '
[2022-02-17T04:12:38.142Z] f'in {self.name}')
[2022-02-17T04:12:38.142Z] E KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper'
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] ../mmcv/mmcv/utils/registry.py:246: KeyError
[2022-02-17T04:12:38.142Z] ______________________ test_single_gpu_test_and_evaluate _______________________
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] def test_single_gpu_test_and_evaluate():
[2022-02-17T04:12:38.142Z] from mmcv.parallel import MMDataParallel
[2022-02-17T04:12:38.142Z] dataset = task_processor.build_dataset(
[2022-02-17T04:12:38.142Z] dataset_cfg=model_cfg, dataset_type='test')
[2022-02-17T04:12:38.142Z] dataloader = task_processor.build_dataloader(dataset, 1, 1)
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] # Prepare dummy model
[2022-02-17T04:12:38.142Z] model = DummyModel(outputs=[torch.rand([1, 1000])])
[2022-02-17T04:12:38.142Z] model = MMDataParallel(model, device_ids=[0])
[2022-02-17T04:12:38.142Z] assert model is not None
[2022-02-17T04:12:38.142Z] # Run test
[2022-02-17T04:12:38.142Z] > outputs = task_processor.single_gpu_test(model, dataloader)
[2022-02-17T04:12:38.142Z]
[2022-02-17T04:12:38.142Z] tests/test_codebase/test_mmpose/test_pose_detection.py:146:
[2022-02-17T04:12:38.142Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.143Z] mmdeploy/codebase/base/task.py:138: in single_gpu_test
[2022-02-17T04:12:38.143Z] out_dir, **kwargs)
[2022-02-17T04:12:38.143Z] mmdeploy/codebase/mmpose/deploy/mmpose.py:131: in single_gpu_test
[2022-02-17T04:12:38.143Z] from mmpose.apis import single_gpu_test
[2022-02-17T04:12:38.143Z] ../conda/lib/python3.7/site-packages/mmpose/apis/__init__.py:10: in <module>
[2022-02-17T04:12:38.143Z] from .train import init_random_seed, train_model
[2022-02-17T04:12:38.143Z] ../conda/lib/python3.7/site-packages/mmpose/apis/train.py:12: in <module>
[2022-02-17T04:12:38.143Z] from mmpose.core.distributed_wrapper import DistributedDataParallelWrapper
[2022-02-17T04:12:38.143Z] ../conda/lib/python3.7/site-packages/mmpose/core/distributed_wrapper.py:10: in <module>
[2022-02-17T04:12:38.143Z] class DistributedDataParallelWrapper(nn.Module):
[2022-02-17T04:12:38.143Z] ../mmcv/mmcv/utils/registry.py:312: in _register
[2022-02-17T04:12:38.143Z] module_class=cls, module_name=name, force=force)
[2022-02-17T04:12:38.143Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2022-02-17T04:12:38.143Z]
[2022-02-17T04:12:38.143Z] self = Registry(name=module wrapper, items={'DataParallel': <class 'torch.nn.parallel.data_parallel.DataParallel'>, 'Distribu...arallel'>, 'DistributedDataParallelWrapper': <class 'mmedit.core.distributed_wrapper.DistributedDataParallelWrapper'>})
[2022-02-17T04:12:38.143Z] module_class = <class 'mmpose.core.distributed_wrapper.DistributedDataParallelWrapper'>
[2022-02-17T04:12:38.143Z] module_name = ['DistributedDataParallelWrapper'], force = False
[2022-02-17T04:12:38.143Z]
[2022-02-17T04:12:38.143Z] def _register_module(self, module_class, module_name=None, force=False):
[2022-02-17T04:12:38.143Z] if not inspect.isclass(module_class):
[2022-02-17T04:12:38.143Z] raise TypeError('module must be a class, '
[2022-02-17T04:12:38.143Z] f'but got {type(module_class)}')
[2022-02-17T04:12:38.143Z]
[2022-02-17T04:12:38.143Z] if module_name is None:
[2022-02-17T04:12:38.143Z] module_name = module_class.__name__
[2022-02-17T04:12:38.143Z] if isinstance(module_name, str):
[2022-02-17T04:12:38.143Z] module_name = [module_name]
[2022-02-17T04:12:38.143Z] for name in module_name:
[2022-02-17T04:12:38.143Z] if not force and name in self._module_dict:
[2022-02-17T04:12:38.143Z] > raise KeyError(f'{name} is already registered '
[2022-02-17T04:12:38.143Z] f'in {self.name}')
[2022-02-17T04:12:38.143Z] E KeyError: 'DistributedDataParallelWrapper is already registered in module wrapper'
[2022-02-17T04:12:38.143Z]
[2022-02-17T04:12:38.143Z] ../mmcv/mmcv/utils/registry.py:246: KeyError
[2022-02-17T04:12:38.143Z] ----------------------------- Captured stdout call -----------------------------
[2022-02-17T04:12:38.143Z] loading annotations into memory...
[2022-02-17T04:12:38.143Z] Done (t=0.00s)
[2022-02-17T04:12:38.143Z] creating index...
[2022-02-17T04:12:38.143Z] index created!
[2022-02-17T04:12:38.143Z] => num_images: 1
[2022-02-17T04:12:38.143Z] => load 0 samples
[2022-02-17T04:12:38.143Z] ----------------------------- Captured stderr call -----------------------------
[2022-02-17T04:12:38.143Z] 2022-02-17 12:11:34,478 - mmdeploy - INFO - Sorting the dataset by 'height' and 'width' is not possible.
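The three `KeyError` tracebacks above share one root cause: mmedit has already registered `DistributedDataParallelWrapper` in mmcv's shared `module wrapper` registry (visible in the `self = Registry(...)` dump), so importing `mmpose.apis` in the same interpreter hits the duplicate-name check. A minimal sketch of that mechanism, using a simplified stand-in for `mmcv.utils.Registry` (reduced to the branch quoted in `_register_module`, not mmcv's actual implementation):

```python
# Minimal sketch of the failure in the tracebacks above. This Registry is a
# simplified stand-in for mmcv.utils.Registry, reduced to the duplicate-name
# check quoted in _register_module; it is not mmcv's actual implementation.

class Registry:
    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, module_class, force=False):
        name = module_class.__name__
        if not force and name in self._module_dict:
            # same branch that raises in mmcv/utils/registry.py
            raise KeyError(f'{name} is already registered in {self.name}')
        self._module_dict[name] = module_class
        return module_class


MODULE_WRAPPERS = Registry('module wrapper')

# First import (mmedit in the log) registers the wrapper successfully.
MODULE_WRAPPERS.register_module(
    type('DistributedDataParallelWrapper', (), {}))

# Second import (mmpose in the log) hits the duplicate-name branch.
try:
    MODULE_WRAPPERS.register_module(
        type('DistributedDataParallelWrapper', (), {}))
except KeyError as e:
    print(e)  # same message as in the captured log
```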
Reproduction
- What command or script did you run?
A placeholder for the command.
- Did you make any modifications on the code or config? Did you understand what you have modified?
Environment
- Please run `python tools/check_env.py` to collect necessary environment information and paste it here.
- You may add additional information that may be helpful for locating the problem, such as:
  - How you installed PyTorch [e.g., pip, conda, source]
  - Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
Error traceback
If applicable, paste the error traceback here.
A placeholder for traceback.
Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
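If it helps triage: the `_register_module` signature in the traceback already carries a `force` flag, and registering with `force=True` overwrites the existing entry instead of raising. A toy sketch of that behavior (again a simplified stand-in, not mmcv's actual class); whether forcing the override is safe here is for the maintainers to judge:

```python
# Toy registry mirroring only the force flag visible in the traceback's
# _register_module signature (force=False by default). Not mmcv itself; a
# sketch of one possible direction, assuming an override is acceptable.

class Registry:
    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, module_class, force=False):
        name = module_class.__name__
        if not force and name in self._module_dict:
            raise KeyError(f'{name} is already registered in {self.name}')
        self._module_dict[name] = module_class
        return module_class


wrappers = Registry('module wrapper')
first = wrappers.register_module(type('Wrapper', (), {}))

# With force=True the second registration replaces the first instead of
# raising, which is how a later import could coexist with an earlier one.
second = wrappers.register_module(type('Wrapper', (), {}), force=True)
assert wrappers._module_dict['Wrapper'] is second
```

An alternative that avoids overriding entirely is to run the mmedit and mmpose test suites in separate interpreter processes, so the shared registry is never populated twice.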