An implementation of an algorithm that plays Tic Tac Toe. The algorithm is based on reinforcement learning, using the Monte Carlo method.
Author: Oliver Zhang
Last Modified: 3/19/18
Goal: Learn how to play Tic Tac Toe. I'm using the implementation of TicTacToe by nczempin:
I use Monte Carlo learning to train a model that predicts the value of an action given a state. The observation is the board state. For each possible move, my code evaluates the resulting board state and picks the move whose result scores best.
After each game, it learns from the win or loss and updates its estimate of which board states are actually the best.
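The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: it assumes a plain lookup table in place of the trained model, a board stored as a tuple of nine integers, and hypothetical helper names (legal_moves, place, choose_move, monte_carlo_update) and parameters (epsilon, alpha, gamma) that do not come from the original.

```python
import random
from collections import defaultdict

EMPTY = 0  # assumed encoding: 0 = empty, 1 and -1 = the two players


def legal_moves(board):
    """Return the indices of all empty cells."""
    return [i for i, cell in enumerate(board) if cell == EMPTY]


def place(board, move, player):
    """Return the board that results from `player` moving at `move`."""
    nxt = list(board)
    nxt[move] = player
    return tuple(nxt)


def choose_move(board, player, values, epsilon=0.1, rng=random):
    """Try every legal move and pick the one whose resulting board has
    the highest estimated value; explore at random with prob. epsilon."""
    moves = legal_moves(board)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda m: values[place(board, m, player)])


def monte_carlo_update(values, visited_states, final_reward,
                       alpha=0.1, gamma=1.0):
    """Constant-step-size Monte Carlo update: after the game ends, move
    each visited state's value toward the (discounted) final outcome."""
    g = final_reward
    for state in reversed(visited_states):
        values[state] += alpha * (g - values[state])
        g *= gamma


# Value table standing in for the model; unseen states default to 0.0.
values = defaultdict(float)
```

During a game, the agent would record each board it produces with its own moves, then call monte_carlo_update once with the final reward (for example +1 for a win, -1 for a loss, 0 for a draw; these reward values are also an assumption).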
To learn reinforcement learning, I suggest David Silver's YouTube lectures.
- Copy the files to your computer.
- Add the TicTacToe environment to your gym installation. Check here for more details:
- Modify the 'path' variable to point to a folder for saving weights.
- Run it with Python 3.
- Set the 'debug' variable to True to print debugging information.
- Set the 'display_img' variable to True to visualize what your program is doing.
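As a rough illustration of the settings mentioned in the steps above, they might look something like this. Only the three variable names come from this document; the folder location is a placeholder, and treating them as module-level variables is an assumption.

```python
import os

# Placeholder location (an assumption); point this at your own weights folder.
path = os.path.join(os.path.expanduser("~"), "tictactoe_weights")

# Set to True to print debugging information.
debug = False

# Set to True to visualize what the program is doing.
display_img = False
```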
Note: This version is pretty messy; I will be cleaning up the code in the future.