Find the Cheese!

The goal of this project was to design an environment where a mouse learns to find the cheese. Below are demonstrations of every 10th game (up to game 30):

How It Works

This problem is similar to the gridworld problem described in Chapter 4 of Reinforcement Learning: An Introduction (Second Edition) by Richard S. Sutton and Andrew G. Barto. It differs in the following ways:

The agent is given a reward of 300 for entering the terminal state.
There is only one terminal state in the bottom right.
The gridworld is 48x40 (rows x columns).
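
The differences listed above can be sketched as a small environment. This is only an illustration, not the project's actual code: the -1 reward per transition is assumed from the book's gridworld, and the names `ACTIONS`, `step`, and `random_start` are hypothetical.

```python
import random

ROWS, COLS = 48, 40                 # grid size stated in this README
TERMINAL = (ROWS - 1, COLS - 1)     # single terminal state, bottom right
GOAL_REWARD = 300.0                 # reward for entering the terminal state
STEP_REWARD = -1.0                  # assumption: -1 per transition, as in the book

# Hypothetical action set: up, down, left, right as (row, col) offsets.
ACTIONS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def step(state, action):
    """Return (next_state, reward, done) for one deterministic move."""
    # Moves that would leave the grid keep the agent on its current tile.
    row = min(max(state[0] + action[0], 0), ROWS - 1)
    col = min(max(state[1] + action[1], 0), COLS - 1)
    next_state = (row, col)
    if next_state == TERMINAL:
        return next_state, GOAL_REWARD, True
    return next_state, STEP_REWARD, False


def random_start(rng):
    """Sample a non-terminal starting tile uniformly at random."""
    while True:
        state = (rng.randrange(ROWS), rng.randrange(COLS))
        if state != TERMINAL:
            return state


if __name__ == "__main__":
    start = random_start(random.Random(0))
    print(start, step(start, ACTIONS[1]))
```

Clamping the coordinates reproduces the book's convention that bumping into a wall leaves the state unchanged while the transition still costs a step.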

The problem is formulated as a finite, undiscounted, episodic MDP. To make it harder, the agent can see at most 2 tiles in any direction and starts each game in a random position. Every frame, the values of all visible tiles are updated using the value iteration algorithm from Chapter 4. As the agent explores the gridworld, the value function eventually converges. By following the greedy policy with respect to the converged value function, the agent can then reach the terminal state from any tile along a shortest path.
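
A minimal sketch of this per-frame update, under stated assumptions: "at most 2 tiles in any direction" is read as a 5x5 window centred on the agent, the step reward is -1 as in the book, and the sweep updates values in place. The function names are hypothetical and are not taken from `main.py`.

```python
import numpy as np

ACTIONS = ((-1, 0), (1, 0), (0, -1), (0, 1))   # up, down, left, right
STEP_REWARD, GOAL_REWARD = -1.0, 300.0          # -1 is an assumption
VIEW = 2                                        # tiles visible in each direction


def move(state, action, shape):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    row = min(max(state[0] + action[0], 0), shape[0] - 1)
    col = min(max(state[1] + action[1], 0), shape[1] - 1)
    return row, col


def backup(values, state, terminal):
    """One-step lookahead: return (best value, best action index) for a tile."""
    returns = []
    for action in ACTIONS:
        nxt = move(state, action, values.shape)
        reward = GOAL_REWARD if nxt == terminal else STEP_REWARD
        returns.append(reward + values[nxt])    # undiscounted, gamma = 1
    best = int(np.argmax(returns))
    return returns[best], best


def update_visible(values, agent, terminal):
    """Apply one value iteration backup to every tile the agent can see."""
    rows, cols = values.shape
    row_range = range(max(agent[0] - VIEW, 0), min(agent[0] + VIEW, rows - 1) + 1)
    col_range = range(max(agent[1] - VIEW, 0), min(agent[1] + VIEW, cols - 1) + 1)
    for r in row_range:
        for c in col_range:
            if (r, c) != terminal:              # the terminal value stays at zero
                values[r, c], _ = backup(values, (r, c), terminal)


def greedy_action(values, state, terminal):
    """Pick the action that maximizes the one-step lookahead."""
    return ACTIONS[backup(values, state, terminal)[1]]


if __name__ == "__main__":
    values = np.zeros((4, 4))                   # small grid, fully visible from (2, 2)
    for _ in range(10):
        update_visible(values, agent=(2, 2), terminal=(3, 3))
    print(values)
    print(greedy_action(values, (0, 0), (3, 3)))
```

On this small grid every tile is visible, so repeated sweeps settle on 300 for the tiles next to the terminal state and one less for each additional step of distance. On the full grid, the same backup only touches the window around the agent, which is why convergence depends on exploration.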

An interesting consequence of the negative reward on every transition is that, early on, the agent is motivated to go where it has not been before, i.e. to explore the gridworld. The longer it spends in an area, the lower the estimated value of those tiles becomes, so the agent moves towards unexplored tiles, whose values are initialized to zero.
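
The effect can be reproduced in a few lines. The sketch below assumes a -1 step reward, a 5x5 visible window, a 9x9 grid with the cheese out of sight, and an agent that stands still for three frames. All names are hypothetical and chosen for illustration only.

```python
import numpy as np

STEP_REWARD = -1.0   # assumption: -1 per transition, as in the book's gridworld
VIEW = 2             # assumption: the agent sees a 5x5 window around itself
OFFSETS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def neighbors(tile, shape):
    """Tiles reachable in one move; walls keep the agent in place."""
    return [(min(max(tile[0] + dr, 0), shape[0] - 1),
             min(max(tile[1] + dc, 0), shape[1] - 1)) for dr, dc in OFFSETS]


def linger(values, agent, sweeps):
    """Back up the visible window repeatedly while the agent stands still."""
    rows, cols = values.shape
    row_range = range(max(agent[0] - VIEW, 0), min(agent[0] + VIEW, rows - 1) + 1)
    col_range = range(max(agent[1] - VIEW, 0), min(agent[1] + VIEW, cols - 1) + 1)
    for _ in range(sweeps):
        for r in row_range:
            for c in col_range:
                best = max(values[n] for n in neighbors((r, c), values.shape))
                values[r, c] = STEP_REWARD + best
    return values


values = linger(np.zeros((9, 9)), agent=(4, 4), sweeps=3)
print(values[4])     # middle row of the grid, through the agent's tile
```

In the middle row the values fall from 0 on the unseen tiles to -1, -2, and -3 at the agent's own tile, so every greedy move leads outward towards tiles the agent has not yet seen.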

This Project Was Built Using

Pyglet (graphics)
NumPy (gridworld represented as a matrix)

alexgran875/find_the_cheese
