🔍 Codebase for the ICML '20 paper "Ready Policy One: World Building Through Active Learning" (arxiv: 2002.02693)

philipjball/ReadyPolicyOne

Ready Policy One (RP1)

Code to complement "Ready Policy One: World Building Through Active Learning" (ICML 2020).

Trains an agent inside an ensemble of dynamics models.
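The general idea can be illustrated with a toy sketch (this is an illustration of model-ensemble training in general, not the repo's implementation; `LinearDynamics`, `imagined_rollout`, and the policy interface are all hypothetical names):

```python
import random

class LinearDynamics:
    """Toy one-dimensional dynamics model: next_state = a*state + b*action."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def step(self, state, action):
        return self.a * state + self.b * action

def imagined_rollout(ensemble, policy, state, horizon, rng=random):
    """Roll a policy through the ensemble, sampling one member per step.

    Sampling a different model at each step is one common way to
    propagate model uncertainty into the imagined trajectory.
    """
    traj = [state]
    for _ in range(horizon):
        model = rng.choice(ensemble)      # pick an ensemble member
        state = model.step(state, policy(state))
        traj.append(state)
    return traj
```

With a single identity model and a zero-action policy, the rollout simply repeats the initial state, which makes the mechanics easy to check.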

General instructions

All the experiment configurations used in the paper live in the args_yml directory. On the machines we trained on, we could run 5 seeds concurrently, so the top-level script run_experiments.py launches 5 runs at once, with a flag (--seeds5to9) that toggles between seeds 0-4 and 5-9.

To run the HalfCheetah Ready Policy One experiments for seeds 5-9, type the following:

python run_experiments.py --yaml ./args_yml/main_exp/halfcheetah-rp1.yml --seeds5to9
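A minimal sketch of what such a seed launcher might look like (hypothetical; the actual run_experiments.py, the train.py entry point, and the --seed flag here are assumptions for illustration):

```python
import subprocess

def seed_range(seeds5to9: bool) -> range:
    """Seeds 0-4 by default, 5-9 when the flag is set."""
    return range(5, 10) if seeds5to9 else range(0, 5)

def launch(yaml_path: str, seeds5to9: bool = False, dry_run: bool = True):
    """Build one command per seed; launch them concurrently unless dry_run.

    Returns the list of commands so callers can inspect what would run.
    """
    cmds = [
        ["python", "train.py", "--yaml", yaml_path, "--seed", str(s)]
        for s in seed_range(seeds5to9)
    ]
    if not dry_run:
        procs = [subprocess.Popen(c) for c in cmds]  # 5 concurrent runs
        for p in procs:
            p.wait()
    return cmds
```

The dry-run default makes the command construction easy to verify before actually spawning five training processes.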

Citation

@article{rpone2020,
  title   = {Ready Policy One: World Building Through Active Learning},
  author  = {Ball, Philip and Parker-Holder, Jack and Pacchiano, Aldo and Choromanski, Krzysztof and Roberts, Stephen},
  journal = {Proceedings of the 37th International Conference on Machine Learning},
  year    = {2020}
}

FAQs

Why does model-free training run so slowly?

Two reasons: 1) it is not parallelised; 2) this code tries to use GPUs where possible, so try forcing it to run on the CPU.
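One common way to force CPU execution (an assumption about this codebase; it may also expose its own device option) is to hide the GPUs from CUDA-aware libraries such as PyTorch before they initialise:

```python
import os

def force_cpu():
    """Hide all GPUs so CUDA-aware libraries fall back to the CPU.

    Must run before the first CUDA initialisation (e.g. before importing
    torch or calling any torch.cuda function) to take effect.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

force_cpu()
```

Equivalently, set the variable on the command line: CUDA_VISIBLE_DEVICES="" python run_experiments.py ...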

Acknowledgements

The authors thank Nikhil Barhate for his PPO-PyTorch repository; the ppo.py file here is a heavily modified version of that code.
