Inspiration
I am interested in learning different AI algorithms and toolkits and wanted to explore OpenAI Gym's capabilities. OpenAI Gym is an open-source Python library for developing and comparing reinforcement learning algorithms. It provides a standard API for communication between learning algorithms and environments.
What it does
This application trains an AI to solve the Frozen Lake environment using reinforcement learning. Frozen Lake is a simple environment composed of tiles, where the AI has to move from an initial tile to a goal tile. Each tile is either a safe patch of frozen lake or a hole that traps the AI forever. The AI, or agent, has four possible actions: go left, down, right, or up. The agent must learn to avoid the holes and reach the goal in as few actions as possible.
How I built it
I used Q-Learning to train the agent. Q-Learning is a basic form of reinforcement learning that uses Q-values (action values) to iteratively improve the behavior of the learning agent. I first initialized the environment with the gym library. Next, I created a Q-table with 16 rows and 4 columns. Each row represents a state and each column represents an action. There are 16 tiles, which means the agent can be in any of 16 different states. For each state, there are 4 possible actions (go left, down, right, or up). Each cell of the Q-table holds the estimated value of taking one action in one state, and I initialized all of them to zero. As the AI learns, these values are updated using the Bellman equation. Once training was completed, I used the final version of the Q-table to test the AI.
The OpenAI Gym documentation and online tutorials were very helpful in guiding me step by step through this process.
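The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the project's actual code. To keep it runnable without installing anything, it uses a small hand-written, non-slippery 4x4 lake as a stand-in for the Gym environment, and the helper names (step, train, greedy_path) are hypothetical. The learning rate, discount factor, and episode count are assumed values. Actions during training are picked uniformly at random, which is a simplification of the epsilon-greedy exploration that many tutorials use.

```python
import random

# Standard 4x4 Frozen Lake layout: S = start, F = frozen, H = hole, G = goal.
LAKE = "SFFF" "FHFH" "FFFH" "HFFG"
N_STATES, N_ACTIONS, SIDE = 16, 4, 4
LEFT, DOWN, RIGHT, UP = range(N_ACTIONS)  # same action order as Gym's FrozenLake


def step(state, action):
    """Hypothetical stand-in for env.step() on a non-slippery lake."""
    row, col = divmod(state, SIDE)
    if action == LEFT:
        col = max(col - 1, 0)
    elif action == DOWN:
        row = min(row + 1, SIDE - 1)
    elif action == RIGHT:
        col = min(col + 1, SIDE - 1)
    else:  # UP
        row = max(row - 1, 0)
    new_state = row * SIDE + col
    tile = LAKE[new_state]
    reward = 1.0 if tile == "G" else 0.0  # reward only for reaching the goal
    done = tile in "HG"                   # holes and the goal end the episode
    return new_state, reward, done


def train(episodes=20_000, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Fill a 16x4 Q-table with the Q-Learning update (assumed hyperparameters)."""
    rng = random.Random(seed)
    q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # all zeros
    for _episode in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.randrange(N_ACTIONS)  # pure random exploration
            new_state, reward, done = step(state, action)
            # Bellman-style update:
            # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
            target = reward + gamma * max(q_table[new_state])
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = new_state
            if done:
                break
    return q_table


def greedy_path(q_table, max_steps=100):
    """Follow the highest-valued action from the start tile; return visited states."""
    state, path = 0, [0]
    for _ in range(max_steps):
        action = max(range(N_ACTIONS), key=lambda a: q_table[state][a])
        state, _reward, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print("Greedy path from start:", path)
print("Reached goal:", path[-1] == N_STATES - 1)
```

With the real library, step() would be replaced by the environment's own reset and step calls, and the Q-table would typically be a NumPy array of zeros with shape (16, 4).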
Challenges I ran into
My Windows machine did not seem to work for all OpenAI Gym projects; a Mac worked better. Jupyter Notebook also ran into kernel issues when running OpenAI Gym code. Restarting the kernel each time worked around the problem, but I am still exploring better solutions.
Accomplishments that I am proud of
I am happy that I was able to fully train the AI and that the tests passed. The calculations and plots showed that the agent had indeed learned.
What I learned
I learned how to use OpenAI Gym to train an AI agent, and how to test and analyze the trained agent using NumPy for calculations and Matplotlib for visuals. I also learned that training an AI takes a lot of patience: it can take a very long time and several rounds of modifying and optimizing the rewards and algorithms.
What's next
I really want to use OpenAI Gym to train a self-driving car next. I plan to start by learning to code the taxi problem: https://www.gymlibrary.dev/environments/toy_text/taxi/