
Offline Reinforcement Learning with Extreme Q-Learning

How to run the code

Install dependencies

These are the same setup instructions as in Implicit Q-Learning.

pip install --upgrade pip

pip install -r requirements.txt

# Installs the wheel compatible with CUDA 11 and cuDNN 8.
pip install --upgrade "jax[cuda]>=0.2.27" -f https://storage.googleapis.com/jax-releases/jax_releases.html

For other CUDA configurations, see the JAX installation instructions.
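
To verify that the GPU wheel is active, you can check which devices JAX sees (a quick sanity check of ours, not part of this repo):

import jax

print(jax.devices())  # A CUDA build prints GPU devices; CPU-only output means the wheel did not take effect.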

Example training code

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py --max_clip=5 --sample_random_times=1 --temp=1

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000 --max_clip=5 --temp=0.8

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py --max_clip=5 --sample_random_times=1 --temp=8
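
Across these commands, --temp sets the temperature beta of the Gumbel regression objective that Extreme Q-Learning uses to fit the value function, --max_clip bounds the exponent for numerical stability, and --sample_random_times adds extra randomly sampled actions to the value update. As a minimal sketch of that loss, with our own function and argument names (not the repo's exact code):

import jax.numpy as jnp

def gumbel_value_loss(q, v, temp, max_clip):
    # z = (Q(s, a) - V(s)) / beta; clipping the exponent keeps exp(z) from overflowing.
    z = jnp.minimum((q - v) / temp, max_clip)
    # Gumbel regression (LINEX) loss: exp(z) - z - 1, which pushes V toward a soft maximum of Q.
    return jnp.mean(jnp.exp(z) - z - 1.0)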

Finetuning on AntMaze tasks

python train_finetune.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_finetune_config.py --eval_episodes=100 --eval_interval=100000 --replay_buffer_size 2000000 --max_clip=5 --num_v_updates=4 --temp=0.8 --num_pretraining_steps=1000000
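
The finetuning script runs --num_pretraining_steps of offline updates on the D4RL dataset before switching to online data collection, with new transitions added to the same replay buffer (capped at --replay_buffer_size). Schematically, with placeholder agent, replay_buffer, and env objects (this mirrors the IQL finetuning setup the code builds on, not the exact script):

# Placeholder names; the real script wires these up from the config.
observation, done = env.reset(), False
for step in range(num_pretraining_steps + max_steps):
    if step >= num_pretraining_steps:
        # After pretraining, collect fresh transitions online.
        action = agent.sample_actions(observation)
        next_observation, reward, done, info = env.step(action)
        replay_buffer.insert(observation, action, reward, done, next_observation)
        observation = env.reset() if done else next_observation
    # Every step trains on a batch from the (offline, later mixed) buffer.
    agent.update(replay_buffer.sample(batch_size))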

Reproduction

To reproduce our experiments, run the scripts in the reproduce folder; they contain the settings we use for each environment.
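
For example, a small driver like the following launches them all, assuming the folder holds shell scripts (the layout here is our guess; adjust the glob to match):

import glob
import subprocess

# Run each reproduction script in turn.
for script in sorted(glob.glob("reproduce/**/*.sh", recursive=True)):
    subprocess.run(["bash", script], check=True)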

This code was built on top of the official IQL codebase.