Release TensorFlow Agents 1.1.0 · google-research/batch-ppo

Features:

Policy networks are now defined as functions that map sequences of observations to sequences of actions. As a result, feed-forward policies are faster and memory-based agents are easier to implement. Previously, networks had to be defined as RNNCells.
All functions of the agent interface now receive a tensor of agent indices. This adds the flexibility to process observations in smaller batches. Previously, perform() and experience() were defined on data from all environments.
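
To make the two changes concrete, here is a minimal sketch in NumPy rather than TensorFlow. It is not the library's API: `feed_forward_policy`, `ToyAgent`, and the argument names are hypothetical, and the single `tanh` layer only stands in for a real network. It shows a policy written as a plain function over whole observation sequences, and a `perform()` method that receives agent indices and so can act on a subset of the environments.

```python
import numpy as np


def feed_forward_policy(observ_seq, weights):
    """Map a batch of observation sequences to action sequences.

    observ_seq: array of shape (batch, time, observ_dim).
    weights: array of shape (observ_dim, action_dim).
    Hypothetical stand-in for a policy network.
    """
    # No recurrent state, so all time steps go through one matrix product.
    return np.tanh(observ_seq @ weights)


class ToyAgent:
    """Hypothetical agent that keeps one step counter per environment."""

    def __init__(self, num_envs, weights):
        self.weights = weights
        self.steps = np.zeros(num_envs, dtype=np.int64)

    def perform(self, agent_indices, observ):
        """Compute actions only for the environments in agent_indices."""
        # Add a time axis of length one, call the policy, then drop the axis.
        action = feed_forward_policy(observ[:, None, :], self.weights)[:, 0, :]
        # Update per-environment state only for the selected environments.
        self.steps[agent_indices] += 1
        return action


rng = np.random.default_rng(0)
agent = ToyAgent(num_envs=4, weights=rng.normal(size=(3, 2)))
indices = np.array([0, 2])  # act in environments 0 and 2 only
actions = agent.perform(indices, rng.normal(size=(2, 3)))
```

In this sketch the counters of environments 1 and 3 stay untouched, which is the kind of partial-batch processing the agent indices are meant to allow.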
