https://www.nature.com/articles/nature16961
My inspiration to learn about and build AlphaGo came from the documentary AlphaGo - The Movie | Full award-winning documentary. This is the first neural network I have ever written and trained, and I learned the material from scratch as I implemented it.
To understand the intuition behind tree search and older methods, I followed the textbook Deep Learning and the Game of Go and reused some of its game environment.
The following methods helped me develop an intuition around tree search:
- MCTS
- AlphaBeta pruning
- Random Policy
You can find the following games to compare performances:
AlphaGo uses 3 training pipelines to improve its policy and value estimation.
I've added code to create training and test data from the KGS server games.
Steps to prepare data:
- Download the tar.gz game files from the KGS server and put them in the data directory.
- Run run_dataprocessor.py to generate the dataset.
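
Before generating the dataset, it can help to check that the downloaded archives are readable. The sketch below is a hypothetical helper and not part of run_dataprocessor.py: `list_sgf_members` is a name made up for this example, and it assumes the archives contain game records as .sgf files.

```python
import tarfile
from pathlib import Path


def list_sgf_members(data_dir):
    """Map each .tar.gz archive in data_dir to the .sgf file names it contains."""
    index = {}
    for archive in sorted(Path(data_dir).glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            index[archive.name] = [
                member.name
                for member in tar.getmembers()
                if member.isfile() and member.name.endswith(".sgf")
            ]
    return index
```

Calling `list_sgf_members("data")` returns a dictionary keyed by archive name, so an empty list flags an archive that holds no game records. The directory name `data` is an assumption here.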
The training pipeline for supervised learning is written in the notebook alphago.ipynb.
The policy network is updated towards winning moves with the REINFORCE algorithm. The training pipeline for reinforcement learning of the policy network is written in the notebook alphago.ipynb.
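
As an illustration of the REINFORCE update, here is a minimal sketch for a softmax policy over a handful of moves. It is not the notebook's code: `softmax`, `reinforce_update`, and the learning rate are assumptions for this example, the policy is a plain list of logits rather than a neural network, and the reward is assumed to be +1 for a win and -1 for a loss.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, lr=0.1):
    """Take one gradient-ascent step on reward * log pi(action).

    For a softmax policy, the gradient of log pi(action) with respect to
    logit i is (1 if i == action else 0) - pi(i).
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


logits = [0.0, 0.0, 0.0]                                # three moves, equally likely
after_win = reinforce_update(logits, action=1, reward=1.0)
after_loss = reinforce_update(logits, action=1, reward=-1.0)
print(softmax(after_win)[1] > softmax(logits)[1])       # prints True
print(softmax(after_loss)[1] < softmax(logits)[1])      # prints True
```

A move played in a won game becomes more likely and a move played in a lost game becomes less likely, which is the sense in which the policy shifts towards winning moves.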