-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[RLlib] Add tests for the Footsies environment #55041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @kamil-kaczmarek, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces the Footsies multi-agent reinforcement learning environment to RLlib. My goal is to provide all necessary components for RLlib to interact with the Footsies game server, including the environment definition, observation encoding, and utilities for managing the game binary. This integration will enable future testing and training of reinforcement learning agents within this two-player fighting game context.
Highlights
- New Environment Integration: I've integrated the Footsies multi-agent fighting game as a new environment within RLlib. This supports two-player reinforcement learning scenarios.
- Game Server Management: I've added functionality to automatically download, extract, and launch the Footsies game server binary. This supports various operating systems (Linux, Mac) and modes (server, windowed).
- Observation and Action Spaces: I've defined the observation and action spaces for the Footsies environment, including a custom encoder for processing raw game state into structured observations and handling of special charge actions.
- gRPC Communication Layer: A gRPC communication layer has been established to interact with the Footsies game server, enabling seamless stepping of the environment, retrieval of game states, and management of game resets.
- Environment Constants and Utilities: I've introduced dedicated files for environment constants, game actions, action IDs, and observation feature indices, along with utility functions for action-to-bit conversion and observation encoding.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new multi-agent environment, Footsies, which is a great addition for RLlib examples. The implementation is thorough, including the environment itself, an encoder for observations, and logic to manage the game binary.
I've found a few critical issues related to incorrect state handling and invalid function arguments that could lead to runtime crashes. I've also identified a high-severity issue where a missing timeout could cause the application to hang. Additionally, I've provided several medium-severity suggestions to improve code maintainability, robustness, and adherence to best practices.
Please review the detailed comments. Addressing the critical and high-severity issues is highly recommended before merging.
rllib/examples/envs/classes/multi_agent/footsies/footsies_env.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/footsies_binary.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/footsies_env.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/game/footsies_game.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/game/footsies_game.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/game/proto/footsies_service.proto
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/footsies_env.py
Outdated
Show resolved
Hide resolved
@aslonnie for changes in pyproject.toml. |
rllib/examples/envs/classes/multi_agent/footsies/footsies_env.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/game/footsies_binary.py
Outdated
Show resolved
Hide resolved
rllib/examples/envs/classes/multi_agent/footsies/game/footsies_game.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
removes the incorrect genrule query and "preclean" function. the "preclean" function does not work with non-local genrules, because the `ci/run/bazel.py` preclean assumes all output files of the genrules under `//:all` are local genrules, and it is trying to temper with the output, which is unsafe for non-local, normal genrules. we are no longer building windows python wheels of different python versions any more. each wheel building for each python version runs on a clean windows CI machine now. Signed-off-by: Lonnie Liu <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
The CheckAlive functionality checks if a node is alive given a node address. However, multiple nodes can have the same address. The change substituted usages of node address for checking a node's alive status with node id which is unique. Signed-off-by: avibasnet31 <[email protected]> Co-authored-by: Dhyey Shah <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
…ag support (#54886) This pr ensures that the new metric record API behaviour matches the legacy API behaviour especially wrt tag support (only tags registered during metric definition will be recorded + changes to ensure that global tags overwrite locally specified tag values). New test cases have been added to cover these cases and some other missing cases. Signed-off-by: sampan <[email protected]> Co-authored-by: sampan <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
migrating the below files from `_private` to `_common` as part of #53478 - usage (migrating the whole folder as it only had the usage_lib and usage_constants) - three variables from ray_constants - LOGGING_ROTATE_BYTES - LOGGING_ROTATE_BACKUP_COUNT - DEFAULT_MAX_CONCURRENCY_ASYNC --------- Signed-off-by: harshit <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> - Return `SampleInfo` with `None` values for empty fragments,prevents errors in downstream processing. ## Related issue number Run the test with debug output to see the full error: ``` def test_read_parquet_with_zero_row_groups(shutdown_only, tmp_path): """Test reading a parquet file with 0 row groups.""" # Create an empty parquet file (0 row groups) empty_path = os.path.join(tmp_path, "empty.parquet") schema = pa.schema({"": pa.int64()}) with pq.ParquetWriter(empty_path, schema) as writer: pass parquet_file = pq.ParquetFile(empty_path) assert parquet_file.num_row_groups == 0 #Test reading the empty parquet file dataset = ray.data.read_parquet(empty_path) assert dataset.count() == 0 ``` ``` values = self.deserialize_objects(serialized_objects, object_refs) if not return_exceptions: # Raise exceptions instead of returning them to the user. for i, value in enumerate(values): if isinstance(value, RayError): if isinstance(value, ray.exceptions.ObjectLostError): global_worker.core_worker.log_plasma_usage() if isinstance(value, RayTaskError): > raise value.as_instanceof_cause() E ray.exceptions.RayTaskError(ArrowIndexError): ray::_sample_fragment() E File "/opt/anaconda3/envs/github_rayenv/lib/python3.10/site-packages/ray/data/_internal/datasource/parquet_datasource.py", line 536, in _sample_fragment E fragment = fragment.subset(row_group_ids=[0]) E File "pyarrow/_dataset_parquet.pyx", line 484, in pyarrow._dataset_parquet.ParquetFileFragment.subset E File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status E File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status E pyarrow.lib.ArrowIndexError: ParquetFileFragment references row group 0 but /private/var/folders/kd/vvlnk55927s6l7lty1x2szbr0000gp/T/pytest-of-jyxc-dz-0100532/pytest-11/test_read_parquet_with_zero_ro0/empty.parquet only has 0 row groups /opt/anaconda3/envs/github_rayenv/lib/python3.10/site-packages/ray/_private/worker.py:948: RayTaskError(ArrowIndexError) ``` <!-- For example: "Closes #1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( Signed-off-by: dragongu <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
…obal limit (#54986) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> If an actor pool operator uses GPUs and concurrency is set higher than the total number of GPUs in the cluster, the resource manager calculates a negative resource budget due to a bug. This PR fixes that issue. ## Related issue number <!-- For example: "Closes #1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Balaji Veeramani <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Yiwen Xiang <[email protected]> Signed-off-by: Yevet <[email protected]> Co-authored-by: Kai-Hsun Chen <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
…undle` (#54996) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? This is documented feature to allow custom `resources_per_bundle`. We used to set `num_gpus=1` which might raise exception if user provide `GPU < 1` in the config. `num_gpus < 1` can be useful if we want to use only partial of the GPU ## Related issue number <!-- For example: "Closes #1234" --> ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [x] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [x] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Linkun Chen <[email protected]> Signed-off-by: lkchen <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]> Signed-off-by: Ryan O'Leary <[email protected]> Co-authored-by: Jiajun Yao <[email protected]> Co-authored-by: Mengjin Yan <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? As title said Before <img width="1808" height="270" alt="image" src="https://github.com/user-attachments/assets/cb35f7a5-ae82-4969-aa33-cb139fd63ed5" /> After <img width="1147" height="257" alt="image" src="https://github.com/user-attachments/assets/c122004b-79b2-4e7b-b24e-53a428e430d6" /> <!-- Please give a short summary of the change and the problem this solves. --> ## Related issue number <!-- For example: "Closes #1234" --> ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [x] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Potato <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
… not load (#55017) Signed-off-by: cong.qian <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: joshlee <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
adding raydepsets to test rules (only ci tests will be run when changes are made here) --------- Signed-off-by: elliot-barn <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
refactoring expand operation for raydepsets to allow for depsets and requirement files as input. this makes depset expansion inherit the input depset's requirements --------- Signed-off-by: elliot-barn <[email protected]> Co-authored-by: Lonnie Liu <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
using fewer local genrules Signed-off-by: Lonnie Liu <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
unify bazel build file naming conventions across the repository Signed-off-by: Lonnie Liu <[email protected]> Signed-off-by: Kamil Kaczmarek <[email protected]>
…e env runner and eval env runner Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits. Awesome PR!
"multi_gpu", | ||
"team:rllib", | ||
], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in a meeting, there is probably a greater distinction in code coverage and system dynamics between [num_learners=0, num_learners=2] than [num_learners=1, num_learners=2] so we may want to use the former.
About: | ||
- Example is based on the Footsies environment (https://github.com/chasemcd/FootsiesGym). | ||
- Footsies is a two-player fighting game where each player controls a character and tries to hit the opponent while avoiding being hit. | ||
- Footsies is a zero-sum game, when one player wins (+1 reward) the other loses (-1 reward). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually true? I thought there could be situation where both players die?
:param input_buffer: The input buffer to encode | ||
:type input_buffer: list[int] | ||
:return: The encoded one-hot vector | ||
:rtype: np.ndarray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use this style of docstrings. Can we move to https://google.github.io/styleguide/pyguide.html#383-functions-and-methods , please?
return {agent: {} for agent in self.agents} | ||
|
||
def get_obs(self, game_state): | ||
return self.encoder.encode(game_state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for the future: Let's move this out into a connector!
|
||
if self.special_charge_queue[agent_id] >= 0: | ||
self.special_charge_queue[agent_id] -= 1 | ||
actions[agent_id] = self._convert_to_charge_action(actions[agent_id]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of curiosity, because I think this is weird: If we let two policies control the two agents and one casts SPECIAL_CHARGE
, we override the action that the model choses and then we just learn on that?
rllib/examples/envs/classes/multi_agent/footsies/footsies_env.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stamping to unblock, but added feedback comments in Slack: https://anyscaleteam.slack.com/archives/C01SDRHDNPP/p1756993479333839?thread_ts=1756993080.660019&cid=C01SDRHDNPP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Why are these changes needed?
Related issue number
n.a.
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.