without.training.mp4
with.training.mp4
This is a passion project aimed at honing my skills, with the long-term goal of reproducing something conceptually similar to AlphaStar. To work toward this, I built a custom environment for a miniature RTS game and applied reinforcement learning techniques to train agents within it.
The game is currently very simple. A Town Center produces villagers, which can move around the map, gather gold, and return it to buildings. Villagers can also convert into additional Town Centers or Barracks. Barracks can produce troops, which are able to deal damage to enemy units.
The reinforcement learning algorithm currently implemented is Advantage Actor-Critic (A2C). Each tile of the map is one-hot encoded into roughly a dozen feature channels and passed into the neural network. The policy is trained through self-play: the opponent is updated to the new policy only once that policy achieves a significantly higher win rate against the previous version.
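To make the pieces in the paragraph above concrete, here is a minimal, dependency-free sketch of the three ideas: one-hot encoding of tiles into feature channels, the advantage estimate that A2C uses, and the gated opponent update. This is not the project's actual code. The tile names in `TILE_TYPES`, the function names, the discount factor, and the `0.55` win-rate threshold are all assumptions chosen for illustration; the real environment uses roughly a dozen channels and its own promotion criterion.

```python
from typing import List, Sequence

# Hypothetical tile types (assumption); the real project has about a dozen channels.
TILE_TYPES = ["empty", "gold", "town_center", "villager", "barracks", "troop"]


def one_hot_encode(grid: Sequence[Sequence[str]]) -> List[List[List[float]]]:
    """Encode a 2D grid of tile names as [channel][row][col] planes of 0.0/1.0."""
    rows, cols = len(grid), len(grid[0])
    planes = [[[0.0] * cols for _ in range(rows)] for _ in TILE_TYPES]
    for r, row in enumerate(grid):
        for c, tile in enumerate(row):
            # Exactly one channel is set to 1.0 for each tile.
            planes[TILE_TYPES.index(tile)][r][c] = 1.0
    return planes


def discounted_returns(
    rewards: Sequence[float], gamma: float = 0.99, bootstrap_value: float = 0.0
) -> List[float]:
    """Compute discounted returns by walking the rollout backwards."""
    returns: List[float] = []
    running = bootstrap_value  # critic's value estimate after the last step
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,
    bootstrap_value: float = 0.0,
) -> List[float]:
    """A2C advantage: discounted return minus the critic's value estimate."""
    returns = discounted_returns(rewards, gamma, bootstrap_value)
    return [ret - value for ret, value in zip(returns, values)]


def should_promote(wins: int, games: int, threshold: float = 0.55) -> bool:
    """Gate the opponent update on the candidate's win rate (threshold is an assumption)."""
    return games > 0 and wins / games >= threshold


if __name__ == "__main__":
    grid = [["empty", "gold"], ["villager", "town_center"]]
    planes = one_hot_encode(grid)
    print(len(planes), "channels of size", len(planes[0]), "x", len(planes[0][0]))
    print(advantages([1.0, 0.0, 1.0], [1.0, 0.5, 0.5], gamma=0.5))
    print(should_promote(wins=60, games=100))
```

In a real training loop the advantages would weight the policy-gradient term while the critic is regressed toward the returns, and `should_promote` would be evaluated over a batch of evaluation games between the current policy and the frozen opponent.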
As shown above, without training, the villagers wander around aimlessly. After training, they visibly collect resources and produce additional villagers.