
thedevwonder/realphago


An attempt to replicate the methods used in the AlphaGo paper

Mastering the game of Go with deep neural networks and tree search

https://www.nature.com/articles/nature16961

My inspiration to learn about and build AlphaGo came from the documentary AlphaGo - The Movie | Full award-winning documentary. This is the first neural network I have written and trained, and I learned the material from scratch while implementing it.

To understand the intuition behind tree search and older methods, I followed the textbook Deep Learning and the Game of Go and reused parts of its game environment.

The following methods helped me develop an intuition for tree search:

  1. MCTS (Monte Carlo tree search)
  2. Alpha-beta pruning
  3. Random policy
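As a rough illustration of two of these ideas, here is a minimal sketch that is not taken from this repository. It runs alpha-beta pruning on a toy game tree written as nested lists, and it includes the UCT score that MCTS commonly uses to pick which child to explore. The names `alphabeta` and `uct_score`, the nested-list tree encoding, and the exploration constant `c=1.4` are all assumptions made for the example.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Minimax value of a toy game tree, skipping branches that cannot matter.

    A node is either a number (leaf score from the maximizing player's
    point of view) or a list of child nodes. This encoding is hypothetical.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizing parent will never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizing parent already has a better option
            break
    return best


def uct_score(parent_visits, child_visits, child_wins, c=1.4):
    """UCT score: observed win rate plus an exploration bonus."""
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    win_rate = child_wins / child_visits
    return win_rate + c * math.sqrt(math.log(parent_visits) / child_visits)


tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree))  # prints 3
```

In the toy tree the second and third branches are cut off after their first leaf, because the first branch already guarantees a score of 3. A random policy needs no search at all: it picks uniformly among the legal moves.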

The following matchups are included to compare the bots' performance:

  1. random bot vs random bot
  2. abprune bot vs random bot
  3. mcts bot vs random bot
  4. abprune bot vs mcts bot

AlphaGo uses three training pipelines to improve its policy and value estimates.

Supervised Learning Policy Network

I've added code to create training and test data from the KGS server games.

Steps to prepare data:

  1. Download the tar.gz game files from the KGS server and put them in the data directory.
  2. Run run_dataprocessor.py to generate the dataset.
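Before running the data processor, it can help to check that the downloaded archives are where they should be. The sketch below is a hypothetical sanity check, not part of `run_dataprocessor.py`: the helper name `list_sgf_members` is made up, and it assumes the game records inside the archives are `.sgf` files and that the data directory is named `data`.

```python
import tarfile
from pathlib import Path


def list_sgf_members(data_dir="data"):
    """Yield (archive name, member name) for each .sgf file in the archives.

    Assumes the archives end in .tar.gz and sit directly in data_dir.
    """
    for archive in sorted(Path(data_dir).glob("*.tar.gz")):
        with tarfile.open(str(archive), "r:gz") as tar:
            for member in tar.getmembers():
                if member.isfile() and member.name.endswith(".sgf"):
                    yield archive.name, member.name


if __name__ == "__main__":
    count = sum(1 for _ in list_sgf_members("data"))
    print(f"found {count} game records")
```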

The training pipeline for supervised learning is in the notebook alphago.ipynb.
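The core idea of the supervised stage is to treat the expert's move as a class label and minimise cross-entropy between the policy's move distribution and that label. The sketch below shows this on a four-move toy position in plain Python. It is not the notebook's code: there is no network here, only a vector of logits, and the function names and learning rate are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over moves."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, expert_move):
    """Negative log-probability the policy assigns to the expert's move."""
    return -math.log(softmax(logits)[expert_move])


def cross_entropy_grad(logits, expert_move):
    """Gradient of the loss w.r.t. the logits: probs minus one-hot label."""
    probs = softmax(logits)
    return [p - (1.0 if i == expert_move else 0.0) for i, p in enumerate(probs)]


# Toy position with four candidate moves; the expert played move 2.
logits = [0.0, 0.0, 0.0, 0.0]
loss_before = cross_entropy(logits, expert_move=2)
for _ in range(50):
    grad = cross_entropy_grad(logits, expert_move=2)
    logits = [x - 0.5 * g for x, g in zip(logits, grad)]  # gradient descent
loss_after = cross_entropy(logits, expert_move=2)
print(loss_before, loss_after)
```

In a real pipeline the logits come from a convolutional network over the board features, and the gradient flows back into the network weights rather than into the logits directly.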

Reinforcement Learning Policy Network

This stage uses the REINFORCE algorithm to update the policy towards winning moves. The training pipeline for reinforcement learning of the policy network is also in the notebook alphago.ipynb.
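To make the REINFORCE update concrete, here is a minimal sketch on a toy problem rather than on Go. It is not the notebook's code. The policy is a softmax over three actions, one of which always "wins" (outcome +1) while the others "lose" (outcome -1), and each update moves the logits along the gradient of the log-probability of the sampled action, scaled by the outcome. The function names, the toy reward, and the learning rate are assumptions for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution over actions."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, action, outcome, lr=0.1):
    """Return the logits after one REINFORCE update.

    For a softmax policy, the gradient of log pi(action) with respect to
    the logits is one_hot(action) - probs. Scaling it by the outcome
    (+1 win, -1 loss) raises the probability of winning actions and
    lowers the probability of losing ones.
    """
    probs = softmax(logits)
    return [
        x + lr * outcome * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


def train_toy_policy(episodes=500, winning_action=1, n_actions=3, seed=0):
    """Sample actions from the policy and reinforce them by their outcome."""
    rng = random.Random(seed)
    logits = [0.0] * n_actions
    for _ in range(episodes):
        probs = softmax(logits)
        action = rng.choices(range(n_actions), weights=probs)[0]
        outcome = 1.0 if action == winning_action else -1.0
        logits = reinforce_step(logits, action, outcome)
    return softmax(logits)


print(train_toy_policy())
```

In AlphaGo the outcome is the result of a full self-play game, and the same outcome is applied to every move the policy made in that game.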

Reinforcement Learning of Value Network [TODO]

