[rllib] Best workflow to train, save, and test agent #9123

@stefanbschneider

What is your question?

This is a great framework, but after reading the documentation and experimenting for weeks, I'm still struggling to get a simple workflow working: train a PPO agent, save a checkpoint at the end, save the stats, and then use the trained agent for evaluation or visualization.

My confusion starts with the two ways of training an RL agent.
Either I use the trainer class directly:

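import ray
from ray.rllib.agents.ppo import PPOTrainer
ray.init()
trainer = PPOTrainer(env="CartPole-v0", config={"train_batch_size": 4000})
while True:
    print(trainer.train())  # one training iteration; returns a result dict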

This makes saving my agent simple with trainer.save(path), and I can use the trained agent afterwards for testing with trainer.compute_action(observation). But afaik, I cannot change the log directory, which always defaults to ~/ray_results.

Or I use ray.tune.run():

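from ray import tune
from ray.rllib.agents.ppo import PPOTrainer
# my_path: directory where logs and checkpoints should go
tune.run(PPOTrainer, config={"env": "CartPole-v0", "train_batch_size": 4000}, local_dir=my_path, checkpoint_at_end=True)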

This lets me configure a custom local_dir for my logs and creates a checkpoint at the end. But afaik, I don't have access to my trained agent: ray.tune.run() returns only an ExperimentAnalysis object, not the trained agent, and not the exact path of the checkpoint (which includes some random hash), so I cannot load the agent from it. The experiment_id in the results does not correspond to the hash used in the directory name, so I cannot reconstruct the directory name either.

My only resort at the moment is to split the process into two separate steps, first training with ray.tune.run and then loading and testing the agent, and to find and copy & paste the path of the last checkpoint manually in between. Very inconvenient.
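As a stopgap, the manual lookup between the two steps could probably be scripted. The following is only a rough sketch using the standard library. The helper name find_latest_checkpoint is made up for illustration, and the directory layout it searches for (a checkpoint_<N> directory containing a checkpoint-<N> file somewhere below local_dir) is an assumption based on what this Ray version appears to write, not a documented guarantee.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed layout (not guaranteed by Ray):
#   <local_dir>/<experiment>/<trial_dir>/checkpoint_<N>/checkpoint-<N>
_CHECKPOINT_DIR = re.compile(r"^checkpoint_(\d+)$")


def find_latest_checkpoint(local_dir: str) -> Optional[str]:
    """Hypothetical helper: return the checkpoint file with the highest
    iteration number found anywhere below local_dir, or None if none exists."""
    best_iter = -1
    best_path = None
    for candidate in Path(local_dir).rglob("checkpoint_*"):
        match = _CHECKPOINT_DIR.match(candidate.name)
        if match is None or not candidate.is_dir():
            continue  # skip files and unrelated names
        iteration = int(match.group(1))
        checkpoint_file = candidate / f"checkpoint-{iteration}"
        if checkpoint_file.exists() and iteration > best_iter:
            best_iter, best_path = iteration, checkpoint_file
    return str(best_path) if best_path is not None else None
```

The returned path could then be passed to trainer.restore(...) on a PPOTrainer created with the same config. Note that with several trials under the same local_dir, this picks the highest iteration across all of them, which may not be the trial you want.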

There must be a more convenient way to do what I want, right?

Ray version and other system information (Python version, TensorFlow version, OS):

  • Ray 0.8.5
  • TensorFlow 2.2.0
  • Python 3.8.3
  • OS: Ubuntu 20.04 on WSL (Win 10)
