Concretely, this means tune.run(PPOTrainer) works, printing logs and so on, just as it would on a normal Ray cluster.
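
As a rough illustration, the check described above could be exercised with something like the sketch below. It assumes a Ray release in which PPOTrainer is importable from ray.rllib.agents.ppo. The helper names (build_ppo_config, run_ppo_smoke_test), the CartPole-v0 environment, the single worker, and the two-iteration stop condition are all hypothetical choices made for illustration, not taken from the original setup.

```python
def build_ppo_config(env="CartPole-v0", num_workers=1):
    """Return a small RLlib config dict (values are illustrative assumptions)."""
    return {"env": env, "num_workers": num_workers}


def run_ppo_smoke_test(iterations=2):
    """Run PPO through Tune for a few iterations and return Tune's result object."""
    # Imported lazily so this file can be loaded without Ray installed.
    import ray
    from ray import tune
    from ray.rllib.agents.ppo import PPOTrainer  # assumed import path (older Ray releases)

    # Start Ray locally; pass address=... to attach to an existing cluster instead.
    ray.init()
    try:
        # Tune prints trial status and training logs to stdout while this runs.
        return tune.run(
            PPOTrainer,
            config=build_ppo_config(),
            stop={"training_iteration": iterations},
        )
    finally:
        ray.shutdown()
```

Calling run_ppo_smoke_test() should print Tune's usual trial status tables if the setup behaves like a normal cluster; nothing runs until the function is called.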