Diverse PSRO is a variant of the Policy-Space Response Oracles (PSRO) algorithm that promotes training a behaviourally diverse set of policies using the theory of determinantal point processes (DPPs). This approach trains strategies that are less exploitable and more diverse, and it provides a new, geometrically interpretable way of measuring population diversity.
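
To make the DPP-based notion of diversity concrete, here is a minimal sketch, not code from this repository. It assumes each policy is described by its row of payoffs against a fixed set of opponents, builds the L-ensemble kernel `L = M Mᵀ` from those rows, and computes the expected cardinality of the resulting DPP, `Tr(I - (L + I)⁻¹)`, as the diversity score. The function name `expected_cardinality` and the toy payoff rows are hypothetical.

```python
import numpy as np


def expected_cardinality(payoffs):
    """Expected size of a sample from a DPP with L-ensemble kernel L = M M^T.

    `payoffs` is an (n_policies, n_opponents) array; each row is assumed to
    describe one policy's behaviour. This representation is an assumption
    made for illustration.
    """
    M = np.asarray(payoffs, dtype=float)
    L = M @ M.T                      # Gram matrix of the payoff rows
    identity = np.eye(L.shape[0])
    # E|Y| = Tr(I - (L + I)^{-1}) = sum_i lambda_i / (1 + lambda_i)
    return float(np.trace(identity - np.linalg.inv(L + identity)))


# Hypothetical populations: two identical policies vs. two orthogonal ones.
similar = [[1.0, 0.0], [1.0, 0.0]]
distinct = [[1.0, 0.0], [0.0, 1.0]]

print(expected_cardinality(similar))
print(expected_cardinality(distinct))
```

The population whose payoff rows are orthogonal scores higher than the one with duplicated rows, which is the geometric intuition: rows that span more volume count as more diverse.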

To run the code in this repository, first clone it:

    git clone https://github.com/diversepsro/diverse_psro

Then create and activate a new Anaconda environment:

    conda env create -f environment.yml
    conda activate diverse_psro

You can now run Random Games of Skill by executing:

    python3 random_games_skill.py

You can run Real World Meta-Games by executing:

    python3 spinning_tops_dpp.py

You can run the Non-transitive Mixture Model by executing:

    python3 non_mixture_model.py

Diverse PSRO is evaluated in three different settings, each using a version of the diverse oracle:

| Game | Oracle |
|---|---|
| Random Games of Skill | Diverse BR |
| Real World Meta-Games | Diverse BR |
| Non-transitive Mixture Model | Diverse gradient ascent |


