In the current Ray strategy, the strategy object shows up in three places: two are obvious and one is hidden.
The first is in the Ray launcher:
class RayLauncher(_SpawnLauncher):
    def __init__(self, strategy: "RayPlugin") -> None:
        self._strategy = strategy
        self._start_method = "ray"
        self._workers = []
        self._futures = []
        self._master_addr = None
https://github.com/JiahaoYao/ray_lightning/blob/2727fd441a62e0e6763fd1f25ed97575dc5a6733/ray_lightning/ray_ddp.py#L38-L48
Later, the launcher uses this strategy reference in _wrapping_function:
https://github.com/JiahaoYao/ray_lightning/blob/main/ray_lightning/ray_ddp.py#L241-L242
self._strategy.set_remote(True)
self._strategy.set_global_to_local(global_to_local)
The second is an attribute of the trainer, trainer.strategy.
The last, hidden one is in the remote call that passes trainer to the workers:
https://github.com/JiahaoYao/ray_lightning/blob/2727fd441a62e0e6763fd1f25ed97575dc5a6733/ray_lightning/ray_ddp.py#L222-L226
self._futures = [
    w.execute.remote(self._wrapping_function, i, self._global_to_local,
                     trainer, function, args, kwargs, self.tune_queue)
    for i, w in enumerate(self._workers)
]
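Ray serializes the arguments of a remote call, so each worker receives its own copy of trainer, and with it its own copy of trainer.strategy.
Thus, the strategy.teardown that actually runs is the one on a copied trainer's strategy, not on the original.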
Supporting evidence for this assumption: printing out the pid of the strategy in each place shows that they are different.
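The copy behavior can be illustrated without Ray at all. The following is only a minimal sketch under stated assumptions: FakeStrategy, FakeTrainer and ship_to_worker are hypothetical stand-ins (not ray_lightning or Ray APIs), and copy.deepcopy stands in for the serialize/deserialize round trip that an argument goes through on its way to a worker. It shows that state set on the worker-side strategy never reaches the original one.

```python
import copy


class FakeStrategy:
    """Hypothetical stand-in for the strategy; not the ray_lightning class."""

    def __init__(self):
        self.is_remote = False

    def set_remote(self, remote):
        self.is_remote = remote


class FakeTrainer:
    """Hypothetical stand-in for the trainer, which holds a strategy attribute."""

    def __init__(self, strategy):
        self.strategy = strategy


def ship_to_worker(obj):
    # Assumption: deepcopy approximates what happens when an argument is
    # serialized on the driver and rebuilt on a worker.
    return copy.deepcopy(obj)


strategy = FakeStrategy()
trainer = FakeTrainer(strategy)

# The "worker" gets a copy of the trainer, and therefore a copy of the strategy.
worker_trainer = ship_to_worker(trainer)
worker_trainer.strategy.set_remote(True)

same_object = worker_trainer.strategy is strategy
print(same_object)                        # False: the worker holds a different object
print(strategy.is_remote)                 # False: the original was never touched
print(worker_trainer.strategy.is_remote)  # True: only the copy was updated
```

If the real code behaves like this sketch, anything done through trainer.strategy on a worker is invisible to the strategy held by the launcher on the driver, and vice versa.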
Proposal: consider removing the redundant uses of the strategy.