PPO Agent playing LunarLander-v2

This is a trained model of a PPO agent playing LunarLander-v2 using the stable-baselines3 library.

Using Google Colab and Stable-Baselines3, I trained my first deep reinforcement learning agent: a Lunar Lander agent that learns to land correctly on the moon.

I trained the agent for 1,000,000 timesteps, resulting in a mean reward of 206.92 +/- 53.53.
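For reference, a run like this can be sketched in a few lines. This is a minimal sketch rather than the exact notebook code: it assumes PPO's default hyperparameters, the `MlpPolicy` network, 10 evaluation episodes, and a Gymnasium version that still registers `LunarLander-v2`. The helper names `format_result` and `train_and_evaluate` are hypothetical.

```python
def format_result(mean_reward: float, std_reward: float) -> str:
    """Format an evaluation result the way this README reports it."""
    return f"{mean_reward:.2f} +/- {std_reward:.2f}"


def train_and_evaluate(env_id: str = "LunarLander-v2",
                       total_timesteps: int = 1_000_000,
                       n_eval_episodes: int = 10) -> str:
    """Train PPO, save it, and return 'mean +/- std' over evaluation episodes."""
    # Imported lazily so format_result works without the RL stack installed.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    env = gym.make(env_id)
    # Library-default hyperparameters (assumption, not the author's settings).
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    model.save("ppo-LunarLander-v2")  # written as ppo-LunarLander-v2.zip

    # Monitor records episode returns so evaluate_policy reports true rewards.
    eval_env = Monitor(gym.make(env_id))
    mean_reward, std_reward = evaluate_policy(
        model, eval_env, n_eval_episodes=n_eval_episodes, deterministic=True
    )
    return format_result(mean_reward, std_reward)
```

Calling `train_and_evaluate()` runs the full 1,000,000 timesteps, which takes a while on Colab; pass a smaller `total_timesteps` for a quick smoke test. Results vary from run to run, so the reported mean and standard deviation will not be reproduced exactly.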

To improve the model:

Train for more timesteps.
Try different hyperparameters for PPO. See https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html#parameters
Try another algorithm, such as DQN.
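The last two suggestions could look roughly like the sketch below. The values in `PPO_KWARGS` are illustrative assumptions only, not the settings used for this model, and `minibatches_per_epoch` and `build_model` are hypothetical helper names. The keyword names themselves (`n_steps`, `batch_size`, `n_epochs`, `gamma`, `gae_lambda`, `ent_coef`) are PPO parameters listed on the page linked above.

```python
# Illustrative values only (assumption): not the settings used for this model.
PPO_KWARGS = {
    "n_steps": 1024,     # steps collected per environment before each update
    "batch_size": 64,    # minibatch size for each gradient step
    "n_epochs": 4,       # passes over the rollout buffer per update
    "gamma": 0.999,      # discount factor
    "gae_lambda": 0.98,  # bias/variance trade-off for advantage estimation
    "ent_coef": 0.01,    # entropy bonus to encourage exploration
}


def minibatches_per_epoch(n_steps: int, n_envs: int, batch_size: int) -> int:
    """Number of full minibatches in one pass over the rollout buffer."""
    rollout_size = n_steps * n_envs
    # Stricter than the library: reject sizes that leave a truncated minibatch.
    if rollout_size % batch_size != 0:
        raise ValueError("n_steps * n_envs should be a multiple of batch_size")
    return rollout_size // batch_size


def build_model(algo: str = "ppo", env_id: str = "LunarLander-v2"):
    """Create an untrained PPO or DQN model for the given environment."""
    # Imported lazily so the helpers above work without the RL stack installed.
    import gymnasium as gym
    from stable_baselines3 import DQN, PPO

    env = gym.make(env_id)
    if algo == "ppo":
        return PPO("MlpPolicy", env, **PPO_KWARGS)
    if algo == "dqn":
        # DQN applies here because LunarLander-v2 has a discrete action space.
        return DQN("MlpPolicy", env)
    raise ValueError(f"unknown algorithm: {algo!r}")
```

Either model can then be trained with `model.learn(total_timesteps=...)` and compared against the PPO baseline using the same evaluation procedure.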

Visual of agent

replay.mp4

Information about the model

Environment: LunarLander-v2
Library: stable-baselines3
Model: Proximal Policy Optimization (PPO)
Mean Reward +/- Std. Dev.: 206.92 +/- 53.53
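To try the saved agent, something like the following should work. It is a sketch under a few assumptions: that `ppo-LunarLander-v2.zip` from this repository is in the working directory, that the installed Gymnasium version registers `LunarLander-v2`, and that the environment follows the Gymnasium `reset`/`step` API. The function names `run_episode` and `load_and_run` are hypothetical.

```python
def run_episode(model, env, max_steps: int = 1000) -> float:
    """Play one episode with a trained model and return the total reward."""
    obs, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # deterministic=True picks the most likely action instead of sampling.
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward


def load_and_run(model_path: str = "ppo-LunarLander-v2",
                 env_id: str = "LunarLander-v2") -> float:
    """Load the saved PPO model and run it for a single episode."""
    # Imported lazily so run_episode can be used without the RL stack installed.
    import gymnasium as gym
    from stable_baselines3 import PPO

    model = PPO.load(model_path)  # the .zip extension is added automatically
    env = gym.make(env_id)
    try:
        return run_episode(model, env)
    finally:
        env.close()
```

A single episode can land well above or below the reported mean of 206.92, since the standard deviation across episodes is 53.53.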

