## Reproduce SAC with PARL

Based on PARL, we reproduce the SAC deep reinforcement learning algorithm, reaching the same level of performance as reported in the paper on the Mujoco benchmarks.

Paper: SAC, from *Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor*
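As background, SAC maximizes an entropy-regularized return, where the temperature `alpha` (the `--alpha` flag used below) weights policy entropy against reward. A minimal sketch of the resulting actor loss, with illustrative names rather than PARL's actual API:

```python
import numpy as np

def sac_actor_loss(q_values, log_probs, alpha=0.2):
    # SAC actor objective: maximize Q(s, a) - alpha * log pi(a|s),
    # i.e. minimize the batch mean of alpha * log pi(a|s) - Q(s, a).
    # alpha is the entropy temperature; larger alpha favors more
    # stochastic (higher-entropy) policies.
    return np.mean(alpha * log_probs - q_values)

# Toy batch of critic values and action log-probabilities.
q = np.array([1.0, 2.0, 3.0])
logp = np.array([-1.0, -0.5, -2.0])
loss = sac_actor_loss(q, logp, alpha=0.2)
```

This illustrates why a smaller `--alpha` (as used for Humanoid-v1 below) pushes the policy to exploit the critic more and explore less.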

### Mujoco games introduction

Please see here for more information about the Mujoco games.

### Benchmark result

(figure: SAC_results, benchmark training curves on the Mujoco tasks)

- Each experiment was run three times with different seeds.
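Running each experiment three times with different seeds requires every source of randomness to be seeded so a single run is repeatable. A minimal sketch of such a helper (a hypothetical `set_seed`, not PARL's actual utility):

```python
import random
import numpy as np

def set_seed(seed):
    # Seed all sources of randomness so a run is exactly repeatable;
    # repeating with three different seeds gives the averaged curves above.
    random.seed(seed)
    np.random.seed(seed)
    # Return a sample to demonstrate determinism: same seed, same draw.
    return np.random.uniform()

a = set_seed(0)
b = set_seed(0)  # identical to a, since the seed is the same
```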

### How to use

Dependencies:

Start Training:


```shell
# To train on HalfCheetah-v1 (default), Hopper-v1, Walker2d-v1, or Ant-v1
# (--alpha defaults to 0.2)
python train.py --env [ENV_NAME]

# To reproduce the performance of Humanoid-v1
python train.py --env Humanoid-v1 --alpha 0.05
```
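The command-line interface used above can be sketched with `argparse`; this is an illustrative sketch of how the two flags might be parsed, not the actual contents of `train.py`:

```python
import argparse

def parse_args(argv=None):
    # Mirrors the flags shown above: --env selects the Mujoco task,
    # --alpha sets SAC's entropy temperature (0.2 by default).
    parser = argparse.ArgumentParser(description="SAC training (sketch)")
    parser.add_argument("--env", default="HalfCheetah-v1")
    parser.add_argument("--alpha", type=float, default=0.2)
    return parser.parse_args(argv)

# Example: the Humanoid-v1 invocation from the README.
args = parse_args(["--env", "Humanoid-v1", "--alpha", "0.05"])
```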