Tricks of A3C on TensorFlow2 + Multiprocessing

This was originally written in a hurry, in a mix of Chinese and English, so please excuse any rough edges.

Introduction

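TensorFlow 2.0 makes eager execution (dynamic graphs) the default, replacing the static-graph model of 1.x, and Google removed the Session API from the main namespace along with it. In TF 1.x you could use `tf.Session` directly to manage hardware resources and assign them to tasks, but that approach no longer works in 2.0. On top of that, TensorFlow's core is implemented in C++, so when you start a new process, existing TensorFlow resources cannot be pickled and handed over to it. For these two reasons, using multiprocessing with TF 2.0 is quite painful, and almost every example you can find online is written for TF 1.x or PyTorch.

This post focuses on implementing A3C in TensorFlow, and especially on the odd pitfalls along the way. If you want to dig into the theory behind Actor-Critic and A3C, I recommend these articles: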

Deriving Policy Gradients and Implementing REINFORCE
当我们在谈论 DRL：从AC、PG 到 A3C、DDPG (When We Talk About DRL: From AC and PG to A3C and DDPG)

Problems

After a long journey of falling into one pit after another, I finally got A3C working with TF 2.0 + multiprocessing. The lessons boil down to three points:

Functional Process instead of Inherited Process Class
Use with tf.device() to specify the wanted device
Limit the CUDA_VISIBLE_DEVICES

Each point is covered in detail below.

Problem1: Functional Process instead of Inherited Process Class

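As everyone knows, there are two ways to start a new process in Python: subclass `multiprocessing.Process` and override its `run()` method, or pass a function directly to `Process` as the target. If you go the subclassing route, you get a "Cannot pickle" error. The common explanation online is that TensorFlow's internals are implemented in C++, so many of its objects cannot be serialized with Python's pickle, which is what triggers the error.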
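A minimal sketch of the function-based approach, using only the standard library so it runs anywhere. The names `worker` and `run_workers` are hypothetical, and a plain dict stands in for the per-worker model. The idea, as I understand it, is that everything TensorFlow-related gets created inside the target function, so nothing unpicklable has to travel from the parent to the child.

```python
import multiprocessing as mp


def worker(worker_id, result_queue, num_steps=3):
    """Entry point of one worker process: a plain function, not a Process subclass."""
    # In the A3C case, `import tensorflow` and the creation of the local model
    # and optimizer would go here, so they are built inside the child process
    # and never need to be pickled. A dict stands in for the model (assumption).
    local_state = {"steps": 0}
    for _ in range(num_steps):
        local_state["steps"] += 1  # placeholder for one training step
    # Send only plain Python data back to the parent.
    result_queue.put((worker_id, local_state["steps"]))


def run_workers(num_workers):
    """Start `num_workers` processes and collect one result from each."""
    result_queue = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(i, result_queue))
        for i in range(num_workers)
    ]
    for p in procs:
        p.start()
    results = [result_queue.get() for _ in procs]
    for p in procs:
        p.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_workers(2))
```

Only picklable arguments (an integer and a queue) cross the process boundary here, which is the property the subclassing approach tends to lose once TensorFlow objects are stored on the instance.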

Problem2: Use `with tf.device()` to specify the wanted device

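Many TF 1.x A3C implementations create a `tf.train.Server` and then have each worker open its session with `tf.Session(target=server.target)`, which assigns different compute resources to different workers. TF 2.0 removed `tf.Session`, and if you call TensorFlow APIs directly in a new process you get a "Blas GEMM launch failed" error. To assign compute resources using only TF 2.0 APIs, use `tf.device()`.

The Blas GEMM error can also be worked around by limiting how much GPU memory TensorFlow takes: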
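A rough sketch of what this can look like inside a worker process. The helper `device_for_worker`, the round-robin policy, and the `num_gpus` argument are all assumptions for illustration and not part of the original demo. The matrix product is only a stand-in for the worker's real forward pass.

```python
import multiprocessing as mp


def device_for_worker(worker_id, num_gpus):
    """Pick a TensorFlow device string for a worker (round-robin over GPUs)."""
    if num_gpus <= 0:
        return "/CPU:0"
    return f"/GPU:{worker_id % num_gpus}"


def worker(worker_id, num_gpus, result_queue):
    """Target function of one worker process."""
    # Import TensorFlow inside the child so each process sets up its own runtime.
    import tensorflow as tf

    # Every op created in this block is placed on the chosen device.
    with tf.device(device_for_worker(worker_id, num_gpus)):
        x = tf.random.normal((4, 4))
        y = tf.matmul(x, x)  # stand-in for the worker's forward pass
    result_queue.put((worker_id, float(tf.reduce_sum(y))))


def launch(num_workers, num_gpus):
    """Start the workers; the caller supplies num_gpus (hypothetical parameter)."""
    # The parent deliberately does not import TensorFlow, so it does not
    # touch the GPU before the children start.
    result_queue = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(i, num_gpus, result_queue))
        for i in range(num_workers)
    ]
    for p in procs:
        p.start()
    results = [result_queue.get() for _ in procs]
    for p in procs:
        p.join()
    return results
```

Calling `launch(4, 1)` under an `if __name__ == "__main__":` guard would put all four workers on `/GPU:0`, while `launch(4, 0)` keeps everything on the CPU.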

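import tensorflow as tf

# In TF 2.x, ConfigProto is only available under tf.compat.v1.
tf_config = tf.compat.v1.ConfigProto()
tf_config.gpu_options.allow_growth = True  # allocate GPU memory on demand
tf_config.gpu_options.per_process_gpu_memory_fraction = 0.9  # cap at 90% of GPU memory
tf_config.allow_soft_placement = True  # fall back to another device if needed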
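For code that avoids the 1.x-style config object altogether, a comparable effect can probably be had with the TF 2.x `tf.config` API. The sketch below assumes TF 2.1 or later (where `tf.config.list_physical_devices` is available), and the function name `enable_memory_growth` is hypothetical. It has to run before any GPU op is executed in the process.

```python
def enable_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand instead of grabbing it all."""
    # Imported lazily so this can be called at the very start of a worker process.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be set before the GPU has been initialized in this process.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

In a multiprocessing setup this would be the first call inside each worker function, before any model is built.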

Reference:

Github Issue: Eager Execution error: Blas GEMM launch failed #25403

keras 或 tensorflow 调用GPU报错：Blas GEMM launch failed (Keras or TensorFlow GPU error: Blas GEMM launch failed)

Problem3: Limit the `CUDA_VISIBLE_DEVICES`

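This avoids running out of memory. By default, TensorFlow tries to allocate as much of the available GPU memory as it can to speed up execution. If another job is using the same GPU at the same time, TF cannot allocate enough memory and fails. On a shared machine with multiple GPUs, it is best to explicitly pick a GPU that nobody else is using. If you only have one GPU and nothing else is running on it, you usually do not need to set this.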
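One common way to do this from Python is to set the environment variable before TensorFlow is imported. A small sketch follows. The helper name `restrict_visible_gpus` and the GPU index `1` are assumptions for illustration, so pick whichever index is actually free on your machine (for example by checking `nvidia-smi`).

```python
import os


def restrict_visible_gpus(gpu_ids):
    """Expose only the given GPU indices to CUDA libraries loaded afterwards."""
    value = ",".join(str(i) for i in gpu_ids)
    # Must be set before TensorFlow initializes CUDA in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Hypothetical choice: use only the second physical GPU.
restrict_visible_gpus([1])

# import tensorflow as tf  # import only after the variable is set
```

Inside the process, the selected GPU then shows up as device 0, so `tf.device("/GPU:0")` refers to it. Setting the variable on the command line (`CUDA_VISIBLE_DEVICES=1 python your_script.py`, with a placeholder script name) should work the same way.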

Run The Demo

Talk is cheap. Show me the code.

Tensorflow2.0 + Multiprocessing DEMO on Cart Pole
