PPO Agent playing LunarLander-v2

This is a trained model of a PPO agent playing LunarLander-v2 using the stable-baselines3 library.

Using Google Colab and Stable-Baselines3, I trained my first deep reinforcement learning agent: a Lunar Lander agent that learns to land correctly on the moon.

I trained the agent for 1,000,000 timesteps, resulting in a mean reward of 206.92 +/- 53.53.
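
A training run like the one described here could look roughly like the sketch below. It is not the original Colab notebook. The `MlpPolicy` choice, the default PPO hyperparameters, the 10 evaluation episodes, the save name `ppo-LunarLander-v2`, and the helper names `train_and_evaluate` and `format_result` are all assumptions made for illustration. It also assumes `stable-baselines3` and a Gymnasium install with Box2D support that still registers `LunarLander-v2` (newer Gymnasium releases may only ship a later version of the environment).

```python
def train_and_evaluate(total_timesteps=1_000_000, n_eval_episodes=10):
    """Train PPO on LunarLander-v2 and return (mean_reward, std_reward).

    Hypothetical helper: hyperparameters and episode count are assumptions.
    """
    # Imported here so the file can be loaded without the RL stack installed.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    # Train with the library's default PPO settings (assumed, not confirmed).
    env = gym.make("LunarLander-v2")
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    model.save("ppo-LunarLander-v2")  # hypothetical file name

    # Evaluate on a fresh, monitored environment with deterministic actions.
    eval_env = Monitor(gym.make("LunarLander-v2"))
    return evaluate_policy(
        model, eval_env, n_eval_episodes=n_eval_episodes, deterministic=True
    )


def format_result(mean_reward, std_reward):
    """Format a result as 'mean +/- std' with two decimals."""
    return f"{mean_reward:.2f} +/- {std_reward:.2f}"


if __name__ == "__main__":
    mean_reward, std_reward = train_and_evaluate()
    print(format_result(mean_reward, std_reward))
```

Because PPO training is stochastic, a rerun will generally not reproduce 206.92 +/- 53.53 exactly.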

To improve the model:

Visual of agent

replay.mp4

Information about the model

Environment: LunarLander-v2

Library: stable-baselines3

Model: Proximal Policy Optimization (PPO)

Mean Reward +/- Std. Dev.: 206.92 +/- 53.53