PPO Agent playing LunarLander-v2

This is a trained model of a PPO agent playing LunarLander-v2 using the stable-baselines3 library.

Using Google Colab and Stable-Baselines3, I trained my first Deep Reinforcement Learning agent: a Lunar Lander agent that learns to land correctly on the moon.

I trained the agent for 1,000,000 timesteps, resulting in a mean reward of 206.92 +/- 53.53.
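A minimal sketch of how a run like this could be reproduced with Stable-Baselines3. The README does not record the hyperparameters, the number of parallel environments, or the number of evaluation episodes, so the library defaults, `n_envs=16`, `n_eval_episodes=10`, and the saved file name below are all assumptions rather than the settings actually used.

```python
ENV_ID = "LunarLander-v2"
TOTAL_TIMESTEPS = 1_000_000  # training budget stated in this README


def train_and_evaluate(total_timesteps=TOTAL_TIMESTEPS, n_envs=16, n_eval_episodes=10):
    """Train PPO on LunarLander-v2 and return (mean_reward, std_reward)."""
    # Imported lazily so this module can be read and imported
    # even when the RL stack is not installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.evaluation import evaluate_policy

    # Several parallel copies of the environment speed up experience collection.
    # n_envs=16 is an assumption, not a value taken from this README.
    train_env = make_vec_env(ENV_ID, n_envs=n_envs)

    # Library-default hyperparameters (assumption: the README lists none).
    model = PPO("MlpPolicy", train_env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    model.save("ppo-LunarLander-v2")  # hypothetical file name

    # Evaluate on a separate single environment; make_vec_env wraps it in Monitor.
    eval_env = make_vec_env(ENV_ID, n_envs=1)
    mean_reward, std_reward = evaluate_policy(
        model, eval_env, n_eval_episodes=n_eval_episodes, deterministic=True
    )
    return mean_reward, std_reward
```

Calling `train_and_evaluate()` requires `stable-baselines3` and the Box2D dependencies of the Lunar Lander environment. Results vary between runs, so the returned numbers will not match 206.92 +/- 53.53 exactly. Newer Gymnasium releases may register the environment under a later version id than `LunarLander-v2`.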

To improve the model:

Train for more timesteps
Try different hyperparameters for PPO. Check out: https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html#parameters
Try another algorithm, such as DQN
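The last two ideas above could be sketched as follows. The hyperparameter values are illustrative assumptions only (none are reported in this README), and `build_model`, `PPO_CANDIDATE_KWARGS`, and `SUPPORTED_ALGOS` are hypothetical names introduced for this example. The keyword arguments themselves are documented PPO parameters in Stable-Baselines3.

```python
# Illustrative values only; none of these are reported in this README.
PPO_CANDIDATE_KWARGS = {
    "n_steps": 1024,     # rollout length per environment before each update
    "batch_size": 64,    # minibatch size for each gradient step
    "n_epochs": 4,       # passes over the rollout buffer per update
    "gamma": 0.999,      # discount factor
    "gae_lambda": 0.98,  # bias/variance trade-off in advantage estimation
    "ent_coef": 0.01,    # entropy bonus that encourages exploration
}

SUPPORTED_ALGOS = ("PPO", "DQN")


def build_model(algo="PPO", env_id="LunarLander-v2", **overrides):
    """Return an untrained PPO or DQN model for the given environment."""
    if algo not in SUPPORTED_ALGOS:
        raise ValueError(f"unsupported algorithm: {algo!r}")

    # Imported lazily so the candidate settings can be inspected without the RL stack.
    from stable_baselines3 import DQN, PPO
    from stable_baselines3.common.env_util import make_vec_env

    if algo == "PPO":
        env = make_vec_env(env_id, n_envs=16)  # n_envs is an assumption
        return PPO("MlpPolicy", env, **{**PPO_CANDIDATE_KWARGS, **overrides})

    # DQN works here because LunarLander-v2 has a discrete action space.
    env = make_vec_env(env_id, n_envs=1)
    return DQN("MlpPolicy", env, **overrides)
```

For example, `build_model("PPO", gamma=0.99).learn(total_timesteps=2_000_000)` would combine a longer run with a changed discount factor. Whether any of these settings beats the current score has to be checked by evaluating the trained model.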

Visual of agent

replay.mp4

Information about the model

Environment: LunarLander-v2
Library: stable-baselines3
Model: Proximal Policy Optimization (PPO)
Mean Reward +/- Std. Dev.: 206.92 +/- 53.53
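For reference, a small sketch of how a "mean +/- std" figure like the one above is typically formed from per-episode returns. It assumes the population standard deviation, which is what NumPy's default `std` (used by Stable-Baselines3's `evaluate_policy`) computes. The reward values are hypothetical and are not the episodes behind 206.92 +/- 53.53.

```python
from statistics import fmean, pstdev


def summarize(episode_rewards):
    """Format per-episode returns as 'mean +/- std' (population std)."""
    mean = fmean(episode_rewards)
    std = pstdev(episode_rewards)
    return f"{mean:.2f} +/- {std:.2f}"


# Hypothetical returns from five evaluation episodes.
rewards = [200.0, 250.0, 150.0, 300.0, 100.0]
print(summarize(rewards))  # 200.00 +/- 70.71
```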
