Tags: learnables/cherry

v0.2.0
======

Added
-----

* Introduce cherry.nn.Policy, cherry.nn.ActionValue, and cherry.nn.StateValue.
* Algorithm class utilities for: A2C, PPO, TRPO, DDPG, TD3, SAC, and DrQ/DrQv2.
* DMC examples for SAC, DrQ, and DrQv2.
* N-step returns sampling in ExperienceReplay.

Changed
-------

* Discontinue most of cherry.wrappers.

Fixed
-----

* Fixes return value of StateNormalizer and RewardNormalizer wrappers.
* Requirements to generate docs.

v0.1.4
======

Fixed
-----

* Support for torch 1.5 and new `_parse_to` behavior in ExperienceReplay. (thanks @ManifoldFR)

v0.1.3
======

Added
-----

* A CHANGELOG.md file.

Changed
-------

* Travis testing with different versions of Python (3.6, 3.7), torch (1.1, 1.2, 1.3, 1.4), and torchvision (0.3, 0.4, 0.5).

Fixed
-----

* Bugfix when using `td.discount` with replays coming from vectorized environments. (@galatolofederico)
* env.action_size and env.state_size when the number of vectorized environments is 1. (thanks @galatolofederico)
* Actor-critic integration test being too finicky.
* `cherry.onehot` support for numpy's float and integer types. (thanks @ngoby)