My reproductions of various reinforcement learning algorithms (DQN variants, A3C, DPPO, RND with PPO) in TensorFlow.

ChuaCheowHuan/reinforcement_learning

What's in this repository?

This repository contains code that I wrote while learning RL to reproduce various reinforcement learning algorithms. The code was tested on Colab.

If GitHub fails to render the Jupyter notebooks (a known GitHub issue), view them on Jupyter's nbviewer instead.


Implemented Algorithms

| Algorithm | Discrete | Continuous | Multithreaded | Multiprocessing | Tested on |
|---|---|---|---|---|---|
| DQN | ✔️ | | | | CartPole-v0 |
| Double DQN (DDQN) | ✔️ | | | | CartPole-v0 |
| Dueling DDQN | ✔️ | | | | CartPole-v0 |
| Dueling DDQN + PER | ✔️ | | | | CartPole-v0 |
| A3C (1) | ✔️ | ✔️ | ✔️ | ✔️ (3) | CartPole-v0, Pendulum-v0 |
| DPPO (2) | | ✔️ | | ✔️ (3) | Pendulum-v0 |
| RND + PPO | | ✔️ | | | MountainCarContinuous-v0 (4), Pendulum-v0 (5) |

(1): N-step returns are used for the critic's target.
(2): GAE is used to compute the TD(λ) return (the critic's target) and the policy's advantage.
(3): Distributed TensorFlow and Python's multiprocessing package are used.
(4): State featurization (which approximates the feature map of an RBF kernel) is used.
(5): A fast-slow LSTM with an overly simplified, VAE-like "variational unit" (VU) is used.
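
As a rough illustration of notes (1) and (2), here is a minimal NumPy sketch of bootstrapped n-step returns and of GAE with its TD(λ) return. It is not taken from the notebooks: the function names `n_step_returns` and `gae`, the default `gamma` and `lam` values, and the choice to ignore episode termination inside a rollout are all assumptions made for this example.

```python
import numpy as np


def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns over one rollout, bootstrapped from V(s_T).

    Assumes the rollout contains no terminal state (hypothetical helper).
    """
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = bootstrap_value
    # Work backwards: R_t = r_t + gamma * R_{t+1}, with R_T = V(s_T).
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def gae(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimates and the matching TD(lambda) returns.

    Assumes the rollout contains no terminal state (hypothetical helper).
    """
    values = np.asarray(values, dtype=np.float64)
    values_ext = np.append(values, bootstrap_value)  # V(s_0) ... V(s_T)
    advantages = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t).
        delta = rewards[t] + gamma * values_ext[t + 1] - values_ext[t]
        # A_t = delta_t + gamma * lambda * A_{t+1}.
        running = delta + gamma * lam * running
        advantages[t] = running
    # The TD(lambda) return used as the critic's target is A_t + V(s_t).
    lambda_returns = advantages + values
    return advantages, lambda_returns
```

With `lam=1.0` the TD(λ) return from `gae` coincides with the bootstrapped return from `n_step_returns`, and with `lam=0.0` the advantage reduces to the one-step TD error.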


misc folder

The misc folder contains related example code that I put together while learning RL. See the README.md in the misc folder for more details.


Blog

Check out my blog for more information on my repositories.