[rllib] pi.update_kl(fetches[pi_id]["kl"]) KeyError: 'kl' #4839

### Describe the problem


I have been trying to run multi-agent training with the `PPOTrainer` class and got the following error:

```console
2019-05-22 19:22:33,151	ERROR trial_runner.py:497 -- Error processing event.
Traceback (most recent call last):
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 446, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 316, in fetch_result
    result = ray.get(trial_future[0])
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/worker.py", line 2197, in get
    raise value
ray.exceptions.RayTaskError: ray_PPOTrainer:train() (pid=98952, host=Nathans-MBP)
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 354, in train
    raise e
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 340, in train
    result = Trainable.train(self)
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/tune/trainable.py", line 151, in train
    result = self._train()
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/rllib/agents/ppo/ppo.py", line 133, in _train
    self.local_evaluator.foreach_trainable_policy(update)
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 638, in foreach_trainable_policy
    func(policy, pid) for pid, policy in self.policy_map.items()
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 639, in <listcomp>
    if pid in self.policies_to_train
  File "/anaconda3/envs/flow/lib/python3.7/site-packages/ray/rllib/agents/ppo/ppo.py", line 127, in update
    pi.update_kl(fetches[pi_id]["kl"])
KeyError: 'kl'
```

The failing code in `ray/rllib/agents/ppo/ppo.py` looks like this:

```python
        if "kl" in fetches:
            # single-agent
            self.local_evaluator.for_policy(
                lambda pi: pi.update_kl(fetches["kl"]))
        else:

            def update(pi, pi_id):
                if pi_id in fetches:
                    pi.update_kl(fetches[pi_id]["kl"])  # key error here
                else:
                    logger.debug(
                        "No data for {}, not updating kl".format(pi_id))
```

I printed `fetches` and got the following:

```console
(pid=98952) {'adversary': {'learner_stats': {'cur_kl_coeff': 0.2,
(pid=98952)                                  'cur_lr': 4.999999873689376e-05,
(pid=98952)                                  'entropy': 1.4204963,
(pid=98952)                                  'kl': 2.2946597e-05,
(pid=98952)                                  'model': {},
(pid=98952)                                  'policy_loss': 2.1760285,
(pid=98952)                                  'total_loss': 7.3781013,
(pid=98952)                                  'vf_explained_var': -2.2411346e-05,
(pid=98952)                                  'vf_loss': 5.2020683}},
(pid=98952)  'av': {'learner_stats': {'cur_kl_coeff': 0.2,
(pid=98952)                           'cur_lr': 4.999999873689376e-05,
(pid=98952)                           'entropy': 1.4173344,
(pid=98952)                           'kl': 2.3267268e-05,
(pid=98952)                           'model': {},
(pid=98952)                           'policy_loss': -2.187273,
(pid=98952)                           'total_loss': 3.0322573,
(pid=98952)                           'vf_explained_var': -0.00042819977,
(pid=98952)                           'vf_loss': 5.219526}}}
```

Each policy's stats are nested under a `'learner_stats'` key, so `fetches[pi_id]["kl"]` does not exist. Shouldn't `get_learner_stats` from `metrics.py` be called on `fetches` to strip the intermediate `'learner_stats'` layer? Doing so made the code work for me, but I might have missed something.
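A minimal sketch of the kind of unwrapping meant here, run against a trimmed copy of the `fetches` dict printed above. `unwrap_learner_stats` is a hypothetical helper written for illustration only; it is not RLlib's actual `get_learner_stats` and makes no claim about how that function is implemented.

```python
def unwrap_learner_stats(fetches):
    """Hypothetical helper: drop the 'learner_stats' layer for each policy.

    If a policy's entry has no 'learner_stats' key, it is returned unchanged.
    """
    return {
        pi_id: info.get("learner_stats", info)
        for pi_id, info in fetches.items()
    }


# Trimmed copy of the structure printed in this issue.
fetches = {
    "adversary": {"learner_stats": {"cur_kl_coeff": 0.2, "kl": 2.2946597e-05}},
    "av": {"learner_stats": {"cur_kl_coeff": 0.2, "kl": 2.3267268e-05}},
}

# The lookup used in ppo.py fails on the nested structure...
try:
    fetches["adversary"]["kl"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# ...but succeeds once the middle layer is removed.
stats = unwrap_learner_stats(fetches)
adversary_kl = stats["adversary"]["kl"]
```

With the layer removed, `pi.update_kl(stats[pi_id]["kl"])` would receive the per-policy KL value instead of raising `KeyError`.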

Searching the existing issues, I found a resolved one mentioning `KeyError: 'kl'`, but I still get the error (and I think that issue was about the single-agent case anyway). I installed Ray using `pip install -U ray`.

Any insights? Thanks!
