Using LightningCLI to parse plugin options from the config file fails when you try to use the RayPlugin.
Here's how I am specifying the plugin option in the config file:
Value "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]" does not validate against any of the types in typing.Union[pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin, pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment, pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO, str, typing.List[typing.Union[pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin, pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment, pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO, str]], NoneType]:
- Type <class 'pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
- Type <class 'pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
- Type <class 'pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
- Type <class 'pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO'> expects an str or a Dict/Namespace with a class_path entry but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
- Expected a <class 'str'> but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
- Value "Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))" does not validate against any of the types in typing.Union[pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, pytorch_lightning.plugins.precision.precision_plugin.PrecisionPlugin, pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment, pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO, str]:
- __init__() got multiple values for keyword argument 'parallel_devices'
- "ray_lightning.RayPlugin" is not a subclass of PrecisionPlugin
- "ray_lightning.RayPlugin" is not a subclass of ClusterEnvironment
- "ray_lightning.RayPlugin" is not a subclass of CheckpointIO
- Expected a <class 'str'> but got "Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))"
- Expected a <class 'NoneType'> but got "[Namespace(class_path='ray_lightning.RayPlugin', init_args=Namespace(checkpoint_io=None, cluster_environment=None, ddp_comm_state=None, init_hook=None, num_cpus_per_worker=1, num_nodes=None, num_workers=2, parallel_devices=None, sync_batchnorm=None, use_gpu=False))]"
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/typehints.py", line 462, in adapt_typehints
raise ValueError(f'Value "{val}" does not validate against any of the types in {typehint}:{e}')
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/typehints.py", line 344, in instantiate_classes
value[num] = adapt_typehints(val, self._typehint, instantiate_classes=True, sub_add_kwargs=sub_add_kwargs)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/core.py", line 1054, in instantiate_classes
parent[key] = component.instantiate_classes(value)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/jsonargparse/deprecated.py", line 127, in patched_instantiate_classes
cfg = self._unpatched_instantiate_classes(cfg, **kwargs)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 820, in instantiate_classes
self.config_init = self.parser.instantiate_classes(self.config)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 625, in __init__
self.instantiate_classes()
File "/Users/sbylaiah/Development/cibo/rs-inference/python/deepcdl/deepcdl/scripts/ray/train_deepcdl.py", line 70, in <module>
cli = LightningCLI(UNetModule, CDLDataModule, run=False)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 263, in run_path
return _run_module_code(code, init_globals, run_name,
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/sbylaiah/miniconda3/envs/cviz/lib/python3.8/runpy.py", line 193, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
The error stack I get is shown above.
The relevant error from the stack trace is this:
__init__() got multiple values for keyword argument 'parallel_devices'

This seems to happen because jsonargparse includes `parallel_devices=None` and `cluster_environment=None` in the `init_args` Namespace. But when `super().__init__` is called from the `RayPlugin`, keyword arguments for `parallel_devices` and `cluster_environment` are passed in again, resulting in the multiple values error above. It looks like we don't really need to pass those into the `super().__init__` call, as these arguments default to None anyway.

Version info:
ray-lightning: 0.2.0
pytorch-lightning: 1.5.10
jsonargparse: 4.2.0
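
For reference, here is a minimal sketch of the keyword collision described above, using only plain Python. The classes `BasePlugin`, `BrokenPlugin`, and `FixedPlugin` are hypothetical stand-ins, not the actual ray_lightning or pytorch-lightning classes, and the constructor signatures are assumptions chosen to mirror the behaviour in the report. `init_args` imitates the fully populated arguments that jsonargparse appears to pass.

```python
class BasePlugin:
    """Hypothetical stand-in for the parent plugin class."""

    def __init__(self, parallel_devices=None, cluster_environment=None,
                 sync_batchnorm=None):
        self.parallel_devices = parallel_devices
        self.cluster_environment = cluster_environment
        self.sync_batchnorm = sync_batchnorm


class BrokenPlugin(BasePlugin):
    """Assumed shape of the failing constructor: explicit keywords plus **kwargs."""

    def __init__(self, num_workers=1, **ddp_kwargs):
        self.num_workers = num_workers
        # If ddp_kwargs also contains these keys, the call raises TypeError.
        super().__init__(parallel_devices=None, cluster_environment=None,
                         **ddp_kwargs)


class FixedPlugin(BasePlugin):
    """Possible fix: forward only what the caller supplied."""

    def __init__(self, num_workers=1, **ddp_kwargs):
        self.num_workers = num_workers
        # The parent already defaults both arguments to None.
        super().__init__(**ddp_kwargs)


# Imitates init_args with every parameter filled in, including None defaults.
init_args = dict(num_workers=2, parallel_devices=None,
                 cluster_environment=None, sync_batchnorm=None)

try:
    BrokenPlugin(**init_args)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

fixed = FixedPlugin(**init_args)
print(error_message)
print(fixed.num_workers, fixed.parallel_devices)
```

With the explicit keywords removed from the `super().__init__` call, the same `init_args` are accepted. Whether this is the right change for the real `RayPlugin` depends on whether it relies on passing non-default values there, which I have not verified.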