Policy Iteration: Dynamic Programming

Based on Example 4.2 (Jack's Car Rental) from Chapter 4 of the textbook Reinforcement Learning: An Introduction (2nd edition) by Richard S. Sutton and Andrew G. Barto. This repo contains code for the example, the policy iteration algorithm, and Exercise 4.7.
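
For readers who have not seen policy iteration, here is a minimal sketch of it on a toy two-state problem. It is not the code in this repo and not the full Jack's Car Rental model: the function name `policy_iteration`, the `transitions` layout, and the toy MDP are all made up for illustration.

```python
def policy_iteration(transitions, n_states, n_actions, gamma=0.9, theta=1e-10):
    """Hypothetical helper: transitions[s][a] is a list of
    (probability, next_state, reward) tuples."""
    policy = [0] * n_states      # start with action 0 everywhere
    values = [0.0] * n_states    # state-value estimates, updated in place

    def q_value(s, a):
        # Expected return of taking action a in state s, then following `values`.
        return sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[s][a])

    while True:
        # Policy evaluation: sweep until the largest change is below theta.
        while True:
            delta = 0.0
            for s in range(n_states):
                v_new = q_value(s, policy[s])
                delta = max(delta, abs(v_new - values[s]))
                values[s] = v_new
            if delta < theta:
                break

        # Policy improvement: act greedily with respect to the evaluated values.
        stable = True
        for s in range(n_states):
            best = max(range(n_actions), key=lambda a: q_value(s, a))
            # Small tolerance so ties do not cause endless switching.
            if q_value(s, best) > q_value(s, policy[s]) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, values


# Toy MDP (an assumption for this sketch, not from the textbook):
# state 0: action 0 stays put (reward 0), action 1 moves to state 1 (reward 1)
# state 1: action 0 stays put (reward 2), action 1 moves to state 0 (reward 0)
toy_transitions = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],
    [[(1.0, 1, 2.0)], [(1.0, 0, 0.0)]],
]
policy, values = policy_iteration(toy_transitions, n_states=2, n_actions=2)
```

With a discount of 0.9, staying in state 1 is worth 2 / (1 - 0.9) = 20, so the greedy choice in state 0 is to move there, which is worth 1 + 0.9 * 20 = 19.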

Note that I have deliberately used some itertools recipes and higher-order functions because I wanted to practice using them.
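
To give a flavour of that style, here is a small sketch of how itertools and higher-order functions might be applied to this problem. It is not taken from the repo: `take` is the recipe from the itertools documentation, `feasible_actions` is a hypothetical helper, and the limits of 20 cars per location and 5 cars moved per night come from Example 4.2.

```python
from itertools import islice, product

MAX_CARS = 20  # at most 20 cars at each location (Example 4.2)
MAX_MOVE = 5   # at most 5 cars moved overnight (Example 4.2)


def take(n, iterable):
    """itertools recipe: return the first n items of the iterable as a list."""
    return list(islice(iterable, n))


# Every state is a pair (cars at location 1, cars at location 2).
states = list(product(range(MAX_CARS + 1), repeat=2))


def feasible_actions(state):
    """Hypothetical helper: net cars moved from location 1 to location 2
    (negative means the other direction), limited by the cars available."""
    cars1, cars2 = state
    return list(filter(lambda a: -cars2 <= a <= cars1,
                       range(-MAX_MOVE, MAX_MOVE + 1)))


first_states = take(3, states)
actions_for_2_1 = feasible_actions((2, 1))
```

Here `product` enumerates the 21 x 21 state grid and `filter` keeps only the moves that each location can actually supply.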

The code has been tested with Python 3; it may also work with Python 2 after some minor tweaks.

data
.gitignore
LICENSE
README.md
environment.py
plotting.py
policy.py
policy_iteration.py
utils.py

aayn/policy-iteration-dp