Commit 86ba5d5

Authored by kamil-kaczmarek, with aslonnie, avibasnet31, dayshah, and sampan-s-nayak
[RLlib] Add tests for the Footsies environment (#55041)
## Why are these changes needed?

* Adds RLlib tests for Footsies: a multi-agent, self-play reinforcement learning environment for two players.

## Related issue number

n.a.

## Checks

- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [x] Release tests
  - [ ] This PR is not tested :(

Signed-off-by: Kamil Kaczmarek <[email protected]>
1 parent fd63d20 commit 86ba5d5

20 files changed: +2,262 −24 lines

doc/source/rllib/rllib-examples.rst

Lines changed: 5 additions & 0 deletions
@@ -363,6 +363,11 @@ Multi-agent RL
   Uses OpenSpiel to demonstrate league-based self-play, where agents play against various
   versions of themselves, frozen or in-training, to improve through competitive interaction.
 
+- `Self-play with Footsies and PPO algorithm <https://github.com/ray-project/ray/blob/master/rllib/tuned_examples/ppo/multi_agent_footsies_ppo.py>`__:
+  Implements self-play with the Footsies environment (a two-player zero-sum game).
+  This example highlights RLlib's ability to connect to external binaries running the game engine, as well as
+  to set up a multi-agent self-play training scenario.
+
 - `Self-play with OpenSpiel <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent/self_play_with_open_spiel.py>`__:
   Similar to the league-based self-play, but simpler. This script leverages OpenSpiel for two-player games, allowing agents to improve
   through direct self-play without building a complex, structured league.
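For orientation, here is a minimal sketch of the kind of two-module self-play setup the new doc entry describes, written against RLlib's new API stack. The env ID "footsies", the module IDs, and the agent IDs "p0"/"p1" are illustrative stand-ins, not the names the example actually registers:

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    # "footsies" is a placeholder; the real example registers its own env ID.
    .environment("footsies")
    .multi_agent(
        policies={"main", "frozen_opponent"},
        # Agent "p0" maps to the learning module, "p1" to a frozen snapshot.
        policy_mapping_fn=lambda agent_id, episode, **kwargs: (
            "main" if agent_id == "p0" else "frozen_opponent"
        ),
        # Only "main" keeps learning; the opponent stays frozen.
        policies_to_train=["main"],
    )
)
```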

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,8 @@ extend-exclude = [
     "python/build/",
     "python/ray/workflow/tests/mock_server.py",
     "python/ray/serve/tests/test_config_files/syntax_error.py",
+    "rllib/examples/envs/classes/multi_agent/footsies/game/proto/footsies_service_pb2.py",
+    "rllib/examples/envs/classes/multi_agent/footsies/game/proto/footsies_service_pb2_grpc.py",
 ]
 
 [tool.ruff.lint]
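The two excluded files are machine-generated gRPC stubs, which is why they are kept out of linting. As a hedged aside, stubs named `*_pb2.py`/`*_pb2_grpc.py` are typically produced with `grpcio-tools`; a sketch, assuming a `footsies_service.proto` in the working directory (the actual proto path in the repo may differ):

```python
# Regenerate gRPC stubs from a .proto definition via grpcio-tools.
# The proto filename and paths are assumptions based on the generated names.
from grpc_tools import protoc

exit_code = protoc.main([
    "grpc_tools.protoc",
    "-I.",                  # proto search path
    "--python_out=.",       # emits footsies_service_pb2.py
    "--grpc_python_out=.",  # emits footsies_service_pb2_grpc.py
    "footsies_service.proto",
])
assert exit_code == 0, "protoc failed"
```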

rllib/BUILD

Lines changed: 117 additions & 20 deletions
@@ -1538,6 +1538,86 @@ py_test(
     ],
 )
 
+# Footsies
+py_test(
+    name = "learning_tests_multi_agent_footsies_ppo",
+    size = "large",
+    srcs = ["tuned_examples/ppo/multi_agent_footsies_ppo.py"],
+    args = [
+        "--as-test",
+        "--num-env-runners=6",
+        "--evaluation-num-env-runners=2",
+    ],
+    main = "tuned_examples/ppo/multi_agent_footsies_ppo.py",
+    tags = [
+        "exclusive",
+        "learning_tests",
+        "learning_tests_discrete",
+        "team:rllib",
+    ],
+)
+
+py_test(
+    name = "learning_tests_multi_agent_footsies_ppo_gpu",
+    size = "large",
+    srcs = ["tuned_examples/ppo/multi_agent_footsies_ppo.py"],
+    args = [
+        "--as-test",
+        "--num-env-runners=20",
+        "--evaluation-num-env-runners=3",
+        "--num-learners=1",
+        "--num-gpus-per-learner=1",
+    ],
+    main = "tuned_examples/ppo/multi_agent_footsies_ppo.py",
+    tags = [
+        "exclusive",
+        "learning_tests",
+        "learning_tests_discrete",
+        "multi_gpu",
+        "team:rllib",
+    ],
+)
+
+py_test(
+    name = "learning_tests_multi_agent_footsies_ppo_multi_cpu",
+    size = "large",
+    srcs = ["tuned_examples/ppo/multi_agent_footsies_ppo.py"],
+    args = [
+        "--as-test",
+        "--num-env-runners=6",
+        "--evaluation-num-env-runners=2",
+        "--num-learners=2",
+    ],
+    main = "tuned_examples/ppo/multi_agent_footsies_ppo.py",
+    tags = [
+        "exclusive",
+        "learning_tests",
+        "learning_tests_discrete",
+        "team:rllib",
+    ],
+)
+
+py_test(
+    name = "learning_tests_multi_agent_footsies_ppo_multi_gpu",
+    size = "large",
+    srcs = ["tuned_examples/ppo/multi_agent_footsies_ppo.py"],
+    args = [
+        "--as-test",
+        "--num-env-runners=20",
+        "--evaluation-num-env-runners=3",
+        "--num-learners=2",
+        "--num-gpus-per-learner=1",
+    ],
+    main = "tuned_examples/ppo/multi_agent_footsies_ppo.py",
+    tags = [
+        "exclusive",
+        "learning_tests",
+        "learning_tests_discrete",
+        "multi_gpu",
+        "team:rllib",
+    ],
+)
+
 # Pendulum
 py_test(
     name = "learning_tests_pendulum_ppo",
@@ -4084,14 +4164,14 @@ py_test(
 # subdirectory: envs/
 # ....................................
 py_test(
-    name = "examples/envs/agents_act_simultaneously",
+    name = "examples/envs/agents_act_in_sequence",
     size = "medium",
-    srcs = ["examples/envs/agents_act_simultaneously.py"],
+    srcs = ["examples/envs/agents_act_in_sequence.py"],
     args = [
         "--num-agents=2",
         "--stop-iters=3",
     ],
-    main = "examples/envs/agents_act_simultaneously.py",
+    main = "examples/envs/agents_act_in_sequence.py",
     tags = [
         "examples",
         "exclusive",
@@ -4100,14 +4180,14 @@ py_test(
 )
 
 py_test(
-    name = "examples/envs/agents_act_in_sequence",
+    name = "examples/envs/agents_act_simultaneously",
     size = "medium",
-    srcs = ["examples/envs/agents_act_in_sequence.py"],
+    srcs = ["examples/envs/agents_act_simultaneously.py"],
     args = [
         "--num-agents=2",
         "--stop-iters=3",
     ],
-    main = "examples/envs/agents_act_in_sequence.py",
+    main = "examples/envs/agents_act_simultaneously.py",
     tags = [
         "examples",
         "exclusive",
@@ -5014,13 +5094,34 @@ py_test(
 )
 
 py_test(
-    name = "examples/multi_agent/shared_encoder_cartpole",
-    size = "medium",
-    srcs = ["examples/multi_agent/shared_encoder_cartpole.py"],
+    name = "examples/multi_agent/self_play_footsies",
+    size = "large",
+    srcs = ["examples/multi_agent/self_play_footsies.py"],
     args = [
-        "--stop-iter=10",
+        "--as-test",
+        "--num-cpus=4",
     ],
-    main = "examples/multi_agent/shared_encoder_cartpole.py",
+    main = "examples/multi_agent/self_play_footsies.py",
+    tags = [
+        "examples",
+        "examples_use_all_core",
+        "exclusive",
+        "team:rllib",
+    ],
+)
+
+py_test(
+    name = "examples/multi_agent/self_play_league_based_with_open_spiel_connect_4_ppo_torch",
+    size = "large",
+    srcs = ["examples/multi_agent/self_play_league_based_with_open_spiel.py"],
+    args = [
+        "--framework=torch",
+        "--env=connect_four",
+        "--win-rate-threshold=0.8",
+        "--num-episodes-human-play=0",
+        "--min-league-size=8",
+    ],
+    main = "examples/multi_agent/self_play_league_based_with_open_spiel.py",
     tags = [
         "examples",
         "exclusive",
@@ -5090,17 +5191,13 @@ py_test(
 )
 
 py_test(
-    name = "examples/multi_agent/self_play_league_based_with_open_spiel_connect_4_ppo_torch",
-    size = "large",
-    srcs = ["examples/multi_agent/self_play_league_based_with_open_spiel.py"],
+    name = "examples/multi_agent/shared_encoder_cartpole",
+    size = "medium",
+    srcs = ["examples/multi_agent/shared_encoder_cartpole.py"],
     args = [
-        "--framework=torch",
-        "--env=connect_four",
-        "--win-rate-threshold=0.8",
-        "--num-episodes-human-play=0",
-        "--min-league-size=8",
+        "--stop-iter=10",
     ],
-    main = "examples/multi_agent/self_play_league_based_with_open_spiel.py",
+    main = "examples/multi_agent/shared_encoder_cartpole.py",
     tags = [
         "examples",
         "exclusive",

rllib/algorithms/algorithm.py

Lines changed: 4 additions & 4 deletions
@@ -2211,11 +2211,11 @@ def add_module(
             EnvRunnerGroup (with its o EnvRunners plus the local one).
 
         Returns:
-            The new MultiAgentRLModuleSpec (after the RLModule has been added).
+            The new MultiRLModuleSpec (after the RLModule has been added).
         """
         validate_module_id(module_id, error=True)
 
-        # The to-be-returned new MultiAgentRLModuleSpec.
+        # The to-be-returned new MultiRLModuleSpec.
         multi_rl_module_spec = None
 
         if not self.config.is_multi_agent:
@@ -2337,9 +2337,9 @@ def remove_module(
             EnvRunnerGroup (with its o EnvRunners plus the local one).
 
         Returns:
-            The new MultiAgentRLModuleSpec (after the RLModule has been removed).
+            The new MultiRLModuleSpec (after the RLModule has been removed).
         """
-        # The to-be-returned new MultiAgentRLModuleSpec.
+        # The to-be-returned new MultiRLModuleSpec.
         multi_rl_module_spec = None
 
         # Remove RLModule from the LearnerGroup.
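Both docstrings now name the actual return type, `MultiRLModuleSpec`. A hypothetical usage sketch, assuming `algo` is an already built multi-agent `Algorithm`; the module ID and the empty spec are placeholders:

```python
from ray.rllib.core.rl_module.rl_module import RLModuleSpec

# Add a new RLModule (e.g., a fresh opponent snapshot) at runtime.
# Assumes `algo` is a built multi-agent Algorithm instance.
multi_spec = algo.add_module(
    module_id="opponent_snapshot",  # illustrative ID
    module_spec=RLModuleSpec(),     # fill in module class/spaces as needed
)
# `multi_spec` is the updated MultiRLModuleSpec, per the docstring above.

# Removing the module likewise returns the new MultiRLModuleSpec.
multi_spec = algo.remove_module(module_id="opponent_snapshot")
```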

rllib/algorithms/algorithm_config.py

Lines changed: 1 addition & 0 deletions
@@ -143,6 +143,7 @@ def DEFAULT_AGENT_TO_MODULE_MAPPING_FN(agent_id, episode):
         # Map any agent ID to "default_policy".
         return DEFAULT_MODULE_ID
 
+    # @OldAPIStack
     # TODO (sven): Deprecate in new API stack.
     @staticmethod
     def DEFAULT_POLICY_MAPPING_FN(aid, episode, worker, **kwargs):
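For contrast with the now-annotated old-API-stack default above, a minimal sketch of a custom agent-to-module mapping function on the new API stack; the agent and module IDs are invented:

```python
from ray.rllib.core import DEFAULT_MODULE_ID

def agent_to_module_mapping_fn(agent_id, episode, **kwargs):
    # Route one named agent to a dedicated module, everyone else to default.
    return "main" if agent_id == "player_0" else DEFAULT_MODULE_ID

# Plugged in via: config.multi_agent(policy_mapping_fn=agent_to_module_mapping_fn)
```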
rllib/examples/envs/classes/multi_agent/footsies/README.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# Footsies Environment
+
+This environment implementation is based on the [FootsiesGym project](https://github.com/chasemcd/FootsiesGym),
+specifically the version as of **July 28, 2025**.
+
+## Notes
+
+All examples in the RLlib documentation that use the Footsies environment are self-contained.
+This means that you do not need to install anything from the FootsiesGym repository or elsewhere.
+The examples handle the game binary automatically (downloading, extracting, starting, stopping, etc.).
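The README's claim that examples manage the game binary themselves suggests a download/extract/start/stop lifecycle. A hypothetical sketch of such a lifecycle; the URL, paths, executable name, and port are all invented for illustration, and the real example code defines its own:

```python
import subprocess
import urllib.request
import zipfile
from pathlib import Path

BINARY_URL = "https://example.com/footsies_binary.zip"  # placeholder URL
WORK_DIR = Path("/tmp/footsies")

def ensure_binary() -> Path:
    """Download and extract the game binary once; reuse it afterwards."""
    WORK_DIR.mkdir(parents=True, exist_ok=True)
    archive = WORK_DIR / "footsies.zip"
    if not archive.exists():
        urllib.request.urlretrieve(BINARY_URL, archive)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(WORK_DIR)
    return WORK_DIR / "footsies.x86_64"  # assumed executable name

def start_game_server(port: int) -> subprocess.Popen:
    """Start the game engine as a child process; the caller stops it later."""
    return subprocess.Popen([str(ensure_binary()), "--port", str(port)])

if __name__ == "__main__":
    proc = start_game_server(port=50051)  # assumed gRPC-style port
    try:
        pass  # ... connect an env client and run episodes here ...
    finally:
        proc.terminate()
        proc.wait()
```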

rllib/examples/envs/classes/multi_agent/footsies/__init__.py

Whitespace-only changes.
