# RMC
Open-source code for [Revisiting the Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning](https://arxiv.org/abs/2102.03479).
This repository is fine-tuned for the StarCraft Multi-Agent Challenge (SMAC). For other multi-agent tasks, we also recommend an optimized implementation of QMIX: https://github.com/marlbenchmark/off-policy.
```
2021.10.4 update: add QMIX with attention (qmix_att.yaml) as a baseline for communication tasks.
```

## Finetuned-QMIX
There are many code-level tricks in Multi-Agent Reinforcement Learning (MARL), such as:
- Value function clipping (clip max Q values for QMIX)
- Value Normalization
- What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
- The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games
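Two of the tricks above can be sketched in a few lines. This is a minimal illustration only, assuming a list-of-floats interface; the function and class names (`clip_target_q`, `RunningNorm`) are illustrative and not the repository's actual API:

```python
def clip_target_q(target_qs, clip_max=10.0):
    """Value function clipping: cap the max target Q values used in the TD target."""
    return [min(q, clip_max) for q in target_qs]


class RunningNorm:
    """Value normalization: track a running mean/variance of returns
    and normalize TD targets with it (Welford-style batch updates)."""

    def __init__(self, eps=1e-8):
        self.mean, self.var, self.count, self.eps = 0.0, 0.0, 0, eps

    def update(self, xs):
        batch_mean = sum(xs) / len(xs)
        batch_var = sum((x - batch_mean) ** 2 for x in xs) / len(xs)
        batch_count = len(xs)
        total = self.count + batch_count
        delta = batch_mean - self.mean
        # Merge batch statistics into the running statistics.
        self.mean += delta * batch_count / total
        self.var = (self.var * self.count + batch_var * batch_count
                    + delta ** 2 * self.count * batch_count / total) / total
        self.count = total

    def normalize(self, xs):
        std = self.var ** 0.5 + self.eps
        return [(x - self.mean) / std for x in xs]
```

In practice the normalizer is updated with the returns of each training batch, and Q-learning targets are computed in the normalized space.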
Using a few of the tricks above (in bold), we enable QMIX to solve almost all hard scenarios of SMAC (hyperparameters fine-tuned for each scenario; StarCraft 2 version: SC2.4.10).
## Re-Evaluation
Afterwards, we re-evaluate numerous QMIX variants with the tricks normalized (a **general** set of hyperparameters), and find that QMIX achieves the SOTA (StarCraft 2 version: SC2.4.10).
We also tested our QMIX-with-attention (qmix_att.yaml, $\lambda=0.3$, attention\_heads=4) on some maps (from [NDQ](https://github.com/TonghanWang/NDQ)) that require communication (StarCraft 2 version: SC2.4.10).
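The $\lambda$ hyperparameter above plausibly controls a $\lambda$-return target; a minimal sketch of computing such targets backwards over one episode is shown below. This is an illustration under that assumption, not the repository's implementation:

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.3):
    """Compute lambda-return targets G_t = r_t + gamma * ((1-lam) * V(s_{t+1}) + lam * G_{t+1}).

    rewards: r_0 .. r_{T-1} for one episode.
    next_values: bootstrap values V(s_1) .. V(s_T), same length as rewards.
    lam=0 gives one-step TD targets; lam=1 gives Monte Carlo returns.
    """
    G = next_values[-1]  # terminal bootstrap G_T = V(s_T)
    targets = [0.0] * len(rewards)
    for t in reversed(range(len(rewards))):
        G = rewards[t] + gamma * ((1 - lam) * next_values[t] + lam * G)
        targets[t] = G
    return targets
```

With $\lambda=0.3$ the target leans mostly on the bootstrapped one-step value, mixing in a small share of longer-horizon returns.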
PyMARL is [WhiRL](http://whirl.cs.ox.ac.uk)'s framework for deep multi-agent reinforcement learning and includes implementations of the following algorithms: