In one experiment (muzero_atari_conf.py), max_gpu_reserved is 8491 MB while max_gpu_allocated is 5892 MB, a whopping 2599 MB (roughly 2.5 GB) difference. This can limit the batch size used by the algorithm.
We are investigating whether max_gpu_reserved can be reduced.
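For reference, the gap between the two numbers can be checked with a small sketch like the one below. It assumes that max_gpu_reserved and max_gpu_allocated correspond to PyTorch's `torch.cuda.max_memory_reserved()` and `torch.cuda.max_memory_allocated()` counters, which this note does not confirm. The helper names `reserved_gap` and `read_peak_memory_mb` are hypothetical and only for illustration.

```python
# Minimal sketch: quantify the reserved-vs-allocated gap.
# Assumption: the logged metrics mirror PyTorch's caching-allocator counters.
try:
    import torch
except ImportError:  # allow the arithmetic part to run without PyTorch
    torch = None


def reserved_gap(reserved_mb, allocated_mb):
    """Return (gap in MB, gap as a fraction of reserved memory)."""
    gap_mb = reserved_mb - allocated_mb
    return gap_mb, gap_mb / reserved_mb


def read_peak_memory_mb(device=0):
    """Read peak reserved/allocated memory in MB, or None without a GPU."""
    if torch is None or not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    reserved = torch.cuda.max_memory_reserved(device) / mib
    allocated = torch.cuda.max_memory_allocated(device) / mib
    return reserved, allocated


# Numbers reported for muzero_atari_conf.py in this note.
gap_mb, gap_fraction = reserved_gap(8491, 5892)
print(f"gap: {gap_mb} MB ({gap_fraction:.1%} of reserved)")

# On a machine with a GPU, the live counters can be inspected the same way.
peaks = read_peak_memory_mb()
if peaks is not None:
    print("live gap (MB):", reserved_gap(*peaks)[0])
```

With the numbers above, about 30% of the reserved memory is not backing any live tensor at peak, which is the portion worth targeting.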