This repository was archived by the owner on Nov 3, 2023. It is now read-only.

Using LightningCLI to parse plugin options from the config file fails when using the RayPlugin. #151

@subhashbylaiah

Using LightningCLI to parse plugin options from the config file fails when you try to use the RayPlugin.
Here's how I am specifying the plugin option in the config file:

trainer:
  plugins:
    - class_path: ray_lightning.RayPlugin
      init_args:
        num_workers: 2
        use_gpu: false

The error stack I get is below:

Value "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]" does not validate against any of the types in typing.Union[pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin, pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment, pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO, str, typing.List[typing.Union[pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin, pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment, pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO, str]], NoneType]:
  - Type <class 'pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
  - Type <class 'pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
  - Type <class 'pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
  - Type <class 'pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
  - Expected a <class 'str'> but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
  - Value "Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))" does not validate against any of the types in typing.Union[pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin, pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment, pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO, str]:
    - __init__() got multiple values for keyword argument 'parallel_devices'
    - "ray_lightning.RayPlugin" is not a subclass of PrecisionPlugin
    - "ray_lightning.RayPlugin" is not a subclass of ClusterEnvironment
    - "ray_lightning.RayPlugin" is not a subclass of CheckpointIO
    - Expected a <class 'str'> but got "Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))"
  - Expected a <class 'NoneType'> but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/typehints.py", line 462, in adapt_typehints
    raise ValueError(f'Value "{val}" does not validate against any of the types in {typehint}:{e}')
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/typehints.py", line 344, in instantiate_classes
    value[num] = adapt_typehints(val, self._typehint, instantiate_classes=True, sub_add_kwargs=sub_add_kwargs)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/core.py", line 1054, in instantiate_classes
    parent[key] = component.instantiate_classes(value)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/deprecated.py", line 127, in patched_instantiate_classes
    cfg = self._unpatched_instantiate_classes(cfg, **kwargs)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 820, in instantiate_classes
    self.config_init = self.parser.instantiate_classes(self.config)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 625, in __init__
    self.instantiate_classes()
  File "/Users/sbylaiah/Development/cibo/rs-inference/python/deepcdl/deepcdl/scripts/ray/train_deepcdl.py", line 70, in <module>
    cli = LightningCLI(UNetModule, CDLDataModule, run=False)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 263, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 193, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

The relevant error from the stack trace is: __init__() got multiple values for keyword argument 'parallel_devices'
This seems to happen because jsonargparse includes parallel_devices=None and cluster_environment=None in the init_args Namespace. When RayPlugin then calls super().__init__, it passes parallel_devices and cluster_environment as keyword arguments again, which results in the multiple-values error above. It looks like we don't really need to pass those into the super().__init__ call, since these arguments default to None anyway.
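To illustrate the collision described above, here is a minimal sketch that reproduces the same kind of TypeError without installing anything. The classes ParentPlugin, CollidingPlugin and ForwardingPlugin are hypothetical stand-ins, not the real pytorch-lightning or ray-lightning classes, and the values passed explicitly to super().__init__ are an assumption about how the plugin forwards its arguments.

```python
# Hypothetical stand-ins; NOT the real pytorch-lightning / ray-lightning classes.
class ParentPlugin:
    def __init__(self, parallel_devices=None, cluster_environment=None, **kwargs):
        self.parallel_devices = parallel_devices
        self.cluster_environment = cluster_environment


class CollidingPlugin(ParentPlugin):
    def __init__(self, num_workers=1, use_gpu=False, **ddp_kwargs):
        self.num_workers = num_workers
        self.use_gpu = use_gpu
        # Assumed pattern: explicit keywords plus **ddp_kwargs. If ddp_kwargs
        # already holds the same keys, Python raises a TypeError.
        super().__init__(parallel_devices=None, cluster_environment=None, **ddp_kwargs)


class ForwardingPlugin(ParentPlugin):
    def __init__(self, num_workers=1, use_gpu=False, **ddp_kwargs):
        self.num_workers = num_workers
        self.use_gpu = use_gpu
        # Suggested change: do not repeat the keywords, just forward ddp_kwargs
        # and let the parent defaults apply when a key is absent.
        super().__init__(**ddp_kwargs)


# Mimics the init_args that the config parser fills in, including defaults of None.
init_args = {
    "num_workers": 2,
    "use_gpu": False,
    "parallel_devices": None,
    "cluster_environment": None,
}

try:
    CollidingPlugin(**init_args)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

fixed = ForwardingPlugin(**init_args)
print(fixed.num_workers, fixed.parallel_devices)
```

With the forwarding variant, each keyword reaches the parent only once, so the parsed init_args can be passed through unchanged. Whether the real plugin relies on a specific non-default value for parallel_devices would need to be checked before applying the same change there.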

Version info:
ray-lightning: 0.2.0
pytorch-lightning: 1.5.10
jsonargparse: 4.2.0
