README.md (6 additions & 5 deletions)
@@ -18,12 +18,13 @@ There are so many code-level tricks in the Multi-agent Reinforcement Learning (MARL)
 - Reward scaling
 - Orthogonal initialization and layer scaling
 - **Adam**
+- **Neural networks hidden size**
 - learning rate annealing
 - Reward Clipping
 - Observation Normalization
 - Gradient Clipping
 - **Large Batch Size**
-- **N-step Returns (including GAE($\lambda$) and Q($\lambda$))**
+- **N-step Returns (including GAE($\lambda$) and Q($\lambda$) ...)**
 - **Rollout Process Number**
 - **$\epsilon$-greedy annealing steps**
 - Death Agent Masking
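
Among the bolded tricks above, the $\lambda$-based returns (GAE($\lambda$), Q($\lambda$)) are the easiest to get wrong. The following is a minimal sketch of a $\lambda$-return target computed by backward recursion; the function and variable names (`lambda_returns`, `next_values`) are illustrative assumptions, not this repository's implementation.

```python
# Illustrative sketch of the "N-step Returns" trick: blend one-step bootstrapped
# targets with longer returns via a backward lambda-recursion. Not repo code.
import numpy as np

def lambda_returns(rewards, next_values, terminals, gamma=0.99, lam=0.3):
    """Backward recursion G_t = r_t + gamma * [(1 - lam) * V(s_{t+1}) + lam * G_{t+1}],
    with bootstrapping cut off at terminal steps."""
    T = len(rewards)
    returns = np.zeros(T, dtype=np.float64)
    g = next_values[-1]  # bootstrap from the last value estimate
    for t in reversed(range(T)):
        nonterminal = 1.0 - terminals[t]
        g = rewards[t] + gamma * nonterminal * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns

# Example: 4-step rollout ending in a terminal state.
# lam = 0 recovers one-step TD targets; lam = 1 gives the full discounted return.
r = np.array([1.0, 0.0, 0.0, 1.0])
v_next = np.array([0.5, 0.4, 0.3, 0.0])
done = np.array([0.0, 0.0, 0.0, 1.0])
print(lambda_returns(r, v_next, done))
```

Read this way, the $\lambda$ = 0.3 annotation in the table below is an interpolation between a pure one-step target and the full return.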
@@ -50,17 +51,17 @@ Using a few of the tricks above (bold text), we enabled QMIX (qmix.yaml) to solve a ...
 | 2c_vs_64zg | Hard | **100\%** | **100\%** |
 | corridor | Super Hard | 0% | **100\%** |
 | MMM2 | Super Hard | 98% | **100\%** |
-| 3s5z_vs_3s6z | Super Hard | 3% | **85\%** (Number of Envs = 4) |
+| 3s5z_vs_3s6z | Super Hard | 3% | **93\%** (hidden_size = 256, qmix_large.yaml) |
 | 27m_vs_30m | Super Hard | 56% | **100\%** |
 | 6h_vs_8z | Super Hard | 0% | **93\%** ($\lambda$ = 0.3) |

 ## Re-Evaluation
-Afterwards, we re-evaluate numerous QMIX variants with normalized the tricks (a **genaral** set of hyperparameters), and find that QMIX achieves the SOTA.
+Afterwards, we re-evaluate numerous QMIX variants with the tricks normalized (a **general** set of hyperparameters), and find that QMIX achieves SOTA performance.
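
The table annotates which setting made the difference (e.g. hidden_size = 256 in qmix_large.yaml, $\lambda$ = 0.3). As a rough illustration of two of the listed tricks, orthogonal initialization with layer scaling and a configurable hidden size, here is a small PyTorch sketch; the class and helper names are hypothetical and this is not the agent network defined in this repository.

```python
# Illustrative sketch: orthogonal init with per-layer gain ("layer scaling")
# and a configurable hidden size. A stand-in example, not repo code.
import torch
import torch.nn as nn

def orthogonal_init(layer, gain=1.0):
    """Orthogonal weight init with a per-layer gain; biases start at zero."""
    nn.init.orthogonal_(layer.weight, gain=gain)
    nn.init.constant_(layer.bias, 0.0)
    return layer

class ToyAgent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden_size: int = 256):
        super().__init__()
        self.fc1 = orthogonal_init(nn.Linear(obs_dim, hidden_size), gain=2 ** 0.5)
        self.fc2 = orthogonal_init(nn.Linear(hidden_size, hidden_size), gain=2 ** 0.5)
        # Smaller gain on the output layer keeps the initial Q-values near zero.
        self.q_out = orthogonal_init(nn.Linear(hidden_size, n_actions), gain=0.01)

    def forward(self, obs):
        x = torch.relu(self.fc1(obs))
        x = torch.relu(self.fc2(x))
        return self.q_out(x)

# Example: Q-values for a batch of 4 observations with a 32-dim observation space.
agent = ToyAgent(obs_dim=32, n_actions=6)
print(agent(torch.randn(4, 32)).shape)  # torch.Size([4, 6])
```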
-[**RMC**: Revisiting the Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning](https://arxiv.org/abs/2102.03479)
+[**RIIT**: Rethinking the Implementation Tricks and Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning](https://arxiv.org/abs/2102.03479)