This exercise introduces the fundamentals of dynamic programming, building on our knowledge of Markov decision processes (MDPs). It covers:
- policy evaluation for a stochastic policy
- exhaustive policy search and its computational effort
- value iteration within a deterministic environment
- value iteration within a stochastic environment
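
As a preview of the last topic, value iteration in a stochastic environment can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the three-state chain MDP, the transition table `P`, the action names, and the helper functions `value_iteration` and `greedy_policy` are all assumptions made up for this example.

```python
# Hypothetical 3-state chain MDP (an assumption for illustration only).
# States 0 and 1 are non-terminal, state 2 is terminal.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {},  # terminal state: no actions, value stays 0
}


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-10):
    """Apply the Bellman optimality backup in place until values settle."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s, actions in P.items():
            if not actions:  # skip terminal states
                continue
            best = max(q_value(P, V, s, a, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def greedy_policy(P, V, gamma=0.9):
    """Pick, for each non-terminal state, the action with the highest Q-value."""
    return {
        s: max(actions, key=lambda a: q_value(P, V, s, a, gamma))
        for s, actions in P.items()
        if actions
    }


V = value_iteration(P)
policy = greedy_policy(P, V)
print(V)
print(policy)
```

Because each backup takes an expectation over the transition probabilities, the same loop also handles the deterministic case: every action then has a single outcome with probability 1.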