This repo is based on Example 4.2 (Jack's Car Rental) from Chapter 4 of the textbook Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (2nd edition). It contains code for the example, the policy iteration algorithm, and Exercise 4.7.
Note that I deliberately forced the use of some itertools recipes and higher-order functions because I wanted to practice them.
The code has been tested with Python 3; it may also work with Python 2 after some minor tweaks.
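As a rough illustration of the itertools and higher-order-function style mentioned above, here is a minimal sketch of how the state and action spaces of Jack's Car Rental could be enumerated. It is not taken from this repo: the names `MAX_CARS`, `MAX_MOVE`, `states`, and `feasible_actions` are hypothetical. The limits (20 cars per location, at most 5 cars moved overnight) come from the textbook example, and the sketch assumes an action is feasible whenever the source location has enough cars to move.

```python
from itertools import product

MAX_CARS = 20  # per-location capacity in Example 4.2
MAX_MOVE = 5   # maximum number of cars moved overnight in Example 4.2

# Every state is a pair (cars at first location, cars at second location).
states = list(product(range(MAX_CARS + 1), repeat=2))


def feasible_actions(state):
    """Return the net numbers of cars that can be moved from the first
    location to the second (negative values move cars the other way).

    Assumption: a move is feasible if the source location holds enough cars.
    """
    first, second = state
    return list(filter(
        lambda a: -min(second, MAX_MOVE) <= a <= min(first, MAX_MOVE),
        range(-MAX_MOVE, MAX_MOVE + 1),
    ))


if __name__ == "__main__":
    print(len(states))
    print(feasible_actions((2, 1)))
```

Using `filter` with a lambda here is a stylistic choice in keeping with the practice goal above; a list comprehension would work equally well.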