Kamyab-Majid/ppo_RL


RL-Adventure-2: Policy Gradients

A PyTorch tutorial covering: actor-critic / proximal policy optimization / ACER / DDPG / twin delayed DDPG (TD3) / soft actor-critic / generative adversarial imitation learning / hindsight experience replay

The deep reinforcement learning community has made several improvements to policy gradient algorithms. This tutorial presents the latest extensions in the following order:

  1. Advantage Actor-Critic (A2C)
  2. High-Dimensional Continuous Control Using Generalized Advantage Estimation (GAE)
  3. Proximal Policy Optimization Algorithms (PPO)
  4. Sample Efficient Actor-Critic with Experience Replay (ACER)
  5. Continuous Control with Deep Reinforcement Learning (DDPG)
  6. Addressing Function Approximation Error in Actor-Critic Methods (TD3)
  7. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (SAC)
  8. Generative Adversarial Imitation Learning (GAIL)
  9. Hindsight Experience Replay (HER)
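As a taste of the namesake algorithm of this repo, here is a minimal sketch of the PPO clipped surrogate loss in PyTorch. The function and argument names are illustrative only, not taken from the tutorial notebooks:

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate objective from the PPO paper (illustrative sketch)."""
    # Probability ratio r_t(theta) = pi_theta(a|s) / pi_theta_old(a|s),
    # computed in log space for numerical stability.
    ratio = (new_log_probs - old_log_probs).exp()
    surr1 = ratio * advantages
    # Clipping the ratio to [1 - eps, 1 + eps] removes the incentive to
    # move the policy far from the one that collected the data.
    surr2 = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (element-wise minimum) bound; negate because
    # optimizers minimize, while PPO maximizes the objective.
    return -torch.min(surr1, surr2).mean()
```

With identical old and new log-probabilities the ratio is 1 everywhere, so the loss reduces to the negative mean advantage; when the ratio moves outside the clip range, the gradient through the clipped term vanishes.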

If you get stuck…

  • Remember, you are not stuck unless you have spent more than a week on a single algorithm. It is perfectly normal not to have all the required mathematics and CS background.
  • Carefully go through the paper. Try to see what problem the authors are solving. Get a high-level idea of the approach, then read the code (skipping the proofs), and afterwards go over the mathematical details and proofs.

RL Algorithms

Deep Q Learning tutorial: DQN Adventure: from Zero to State of the Art

Awesome RL libs: rlkit @vitchyr, pytorch-a2c-ppo-acktr @ikostrikov, ACER @Kaixhin

Best RL courses

  • Berkeley deep RL link
  • Deep RL Bootcamp link
  • David Silver's course link
  • Practical RL link
