Skip to content

Conversation

kamil-kaczmarek
Copy link
Contributor

@kamil-kaczmarek kamil-kaczmarek commented Jul 30, 2025

Why are these changes needed?

  • RLlib tests for Footsies: multi-agent / self-play reinforcement learning environment (for two-players).

Related issue number

n.a.

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@kamil-kaczmarek kamil-kaczmarek self-assigned this Jul 30, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @kamil-kaczmarek, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the Footsies multi-agent reinforcement learning environment to RLlib. My goal is to provide all necessary components for RLlib to interact with the Footsies game server, including the environment definition, observation encoding, and utilities for managing the game binary. This integration will enable future testing and training of reinforcement learning agents within this two-player fighting game context.

Highlights

  • New Environment Integration: I've integrated the Footsies multi-agent fighting game as a new environment within RLlib. This supports two-player reinforcement learning scenarios.
  • Game Server Management: I've added functionality to automatically download, extract, and launch the Footsies game server binary. This supports various operating systems (Linux, Mac) and modes (server, windowed).
  • Observation and Action Spaces: I've defined the observation and action spaces for the Footsies environment, including a custom encoder for processing raw game state into structured observations and handling of special charge actions.
  • gRPC Communication Layer: A gRPC communication layer has been established to interact with the Footsies game server, enabling seamless stepping of the environment, retrieval of game states, and management of game resets.
  • Environment Constants and Utilities: I've introduced dedicated files for environment constants, game actions, action IDs, and observation feature indices, along with utility functions for action-to-bit conversion and observation encoding.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new multi-agent environment, Footsies, which is a great addition for RLlib examples. The implementation is thorough, including the environment itself, an encoder for observations, and logic to manage the game binary.

I've found a few critical issues related to incorrect state handling and invalid function arguments that could lead to runtime crashes. I've also identified a high-severity issue where a missing timeout could cause the application to hang. Additionally, I've provided several medium-severity suggestions to improve code maintainability, robustness, and adherence to best practices.

Please review the detailed comments. Addressing the critical and high-severity issues is highly recommended before merging.

@kamil-kaczmarek kamil-kaczmarek changed the title [RLlib] - add tests for the Footsies environment [RLlib] Add tests for the Footsies environment Jul 31, 2025
@kamil-kaczmarek kamil-kaczmarek requested a review from aslonnie July 31, 2025 06:19
@kamil-kaczmarek
Copy link
Contributor Author

@aslonnie for changes in pyproject.toml.

kamil-kaczmarek and others added 20 commits August 4, 2025 16:56
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
removes the incorrect genrule query and "preclean" function. the
"preclean" function does not work with non-local genrules, because the
`ci/run/bazel.py` preclean assumes all output files of the genrules
under `//:all` are local genrules, and it is trying to temper with the
output, which is unsafe for non-local, normal genrules.

we are no longer building windows python wheels of different python
versions any more. each wheel building for each python version runs on a
clean windows CI machine now.

Signed-off-by: Lonnie Liu <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
The CheckAlive functionality checks if a node is alive given a node address. However, multiple nodes can have the same address. The change substituted usages of node address for checking a node's alive status with node id which is unique.

Signed-off-by: avibasnet31 <[email protected]>
Co-authored-by: Dhyey Shah <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
…ag support (#54886)

This pr ensures that the new metric record API behaviour matches the legacy API behaviour especially wrt tag support (only tags registered during metric definition will be recorded + changes to ensure that global tags overwrite locally specified tag values). New test cases have been added to cover these cases and some other missing cases.

Signed-off-by: sampan <[email protected]>
Co-authored-by: sampan <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
migrating the below files from `_private` to `_common` as part of
#53478
- usage (migrating the whole folder as it only had the usage_lib and
usage_constants)
- three variables from ray_constants
  - LOGGING_ROTATE_BYTES
  - LOGGING_ROTATE_BACKUP_COUNT
  - DEFAULT_MAX_CONCURRENCY_ASYNC

---------

Signed-off-by: harshit <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

- Return `SampleInfo` with `None` values for empty fragments,prevents
errors in downstream processing.

## Related issue number

Run the test with debug output to see the full error:
```
def test_read_parquet_with_zero_row_groups(shutdown_only, tmp_path):
    """Test reading a parquet file with 0 row groups."""
    # Create an empty parquet file (0 row groups)
    empty_path = os.path.join(tmp_path, "empty.parquet")
    schema = pa.schema({"": pa.int64()})
    with pq.ParquetWriter(empty_path, schema) as writer:
        pass

    parquet_file = pq.ParquetFile(empty_path)
    assert parquet_file.num_row_groups == 0

    #Test reading the empty parquet file
    dataset = ray.data.read_parquet(empty_path)
    assert dataset.count() == 0
```

```
        values = self.deserialize_objects(serialized_objects, object_refs)
        if not return_exceptions:
            # Raise exceptions instead of returning them to the user.
            for i, value in enumerate(values):
                if isinstance(value, RayError):
                    if isinstance(value, ray.exceptions.ObjectLostError):
                        global_worker.core_worker.log_plasma_usage()
                    if isinstance(value, RayTaskError):
>                       raise value.as_instanceof_cause()
E                       ray.exceptions.RayTaskError(ArrowIndexError): ray::_sample_fragment()
E                         File "/opt/anaconda3/envs/github_rayenv/lib/python3.10/site-packages/ray/data/_internal/datasource/parquet_datasource.py", line 536, in _sample_fragment
E                           fragment = fragment.subset(row_group_ids=[0])
E                         File "pyarrow/_dataset_parquet.pyx", line 484, in pyarrow._dataset_parquet.ParquetFileFragment.subset
E                         File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
E                         File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
E                       pyarrow.lib.ArrowIndexError: ParquetFileFragment references row group 0 but /private/var/folders/kd/vvlnk55927s6l7lty1x2szbr0000gp/T/pytest-of-jyxc-dz-0100532/pytest-11/test_read_parquet_with_zero_ro0/empty.parquet only has 0 row groups

/opt/anaconda3/envs/github_rayenv/lib/python3.10/site-packages/ray/_private/worker.py:948: RayTaskError(ArrowIndexError)
```

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: dragongu <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
…obal limit (#54986)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this
solves. -->

If an actor pool operator uses GPUs and concurrency is set higher than
the total number of GPUs in the cluster, the resource manager calculates
a negative resource budget due to a bug. This PR fixes that issue.

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Balaji Veeramani <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Yiwen Xiang <[email protected]>
Signed-off-by: Yevet <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
…undle` (#54996)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

This is documented feature to allow custom `resources_per_bundle`. We
used to set `num_gpus=1` which might raise exception if user provide
`GPU < 1` in the config.

`num_gpus < 1` can be useful if we want to use only partial of the GPU

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Linkun Chen <[email protected]>
Signed-off-by: lkchen <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
Co-authored-by: Jiajun Yao <[email protected]>
Co-authored-by: Mengjin Yan <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

As title said

Before
<img width="1808" height="270" alt="image"
src="https://github.com/user-attachments/assets/cb35f7a5-ae82-4969-aa33-cb139fd63ed5"
/>

After
<img width="1147" height="257" alt="image"
src="https://github.com/user-attachments/assets/c122004b-79b2-4e7b-b24e-53a428e430d6"
/>

<!-- Please give a short summary of the change and the problem this
solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Potato <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
… not load (#55017)

Signed-off-by: cong.qian <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: joshlee <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
adding raydepsets to test rules (only ci tests will be run when changes
are made here)

---------

Signed-off-by: elliot-barn <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
refactoring expand operation for raydepsets to allow for depsets and
requirement files as input. this makes depset expansion inherit the
input depset's requirements

---------

Signed-off-by: elliot-barn <[email protected]>
Co-authored-by: Lonnie Liu <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
using fewer local genrules

Signed-off-by: Lonnie Liu <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
unify bazel build file naming conventions across the repository

Signed-off-by: Lonnie Liu <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Copy link
Contributor

@ArturNiederfahrenhorst ArturNiederfahrenhorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits. Awesome PR!

"multi_gpu",
"team:rllib",
],
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in a meeting, there is probably a greater distinction in code coverage and system dynamics between [num_learners=0, num_learners=2] than [num_learners=1, num_learners=2] so we may want to use the former.

About:
- Example is based on the Footsies environment (https://github.com/chasemcd/FootsiesGym).
- Footsies is a two-player fighting game where each player controls a character and tries to hit the opponent while avoiding being hit.
- Footsies is a zero-sum game, when one player wins (+1 reward) the other loses (-1 reward).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually true? I thought there could be situation where both players die?

:param input_buffer: The input buffer to encode
:type input_buffer: list[int]
:return: The encoded one-hot vector
:rtype: np.ndarray
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use this style of docstrings. Can we move to https://google.github.io/styleguide/pyguide.html#383-functions-and-methods , please?

return {agent: {} for agent in self.agents}

def get_obs(self, game_state):
return self.encoder.encode(game_state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for the future: Let's move this out into a connector!


if self.special_charge_queue[agent_id] >= 0:
self.special_charge_queue[agent_id] -= 1
actions[agent_id] = self._convert_to_charge_action(actions[agent_id])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of curiosity, because I think this is weird: If we let two policies control the two agents and one casts SPECIAL_CHARGE, we override the action that the model choses and then we just learn on that?

kamil-kaczmarek and others added 11 commits September 2, 2025 23:20
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
@simonsays1980 simonsays1980 enabled auto-merge (squash) September 4, 2025 13:36
Copy link
Contributor

@dstrodtman dstrodtman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonsays1980 simonsays1980 enabled auto-merge (squash) September 4, 2025 15:39
@github-actions github-actions bot disabled auto-merge September 4, 2025 15:43
@simonsays1980 simonsays1980 enabled auto-merge (squash) September 4, 2025 16:36
Copy link
Contributor

@simonsays1980 simonsays1980 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot disabled auto-merge September 4, 2025 19:23
@pcmoritz pcmoritz merged commit 86ba5d5 into master Sep 4, 2025
5 checks passed
@pcmoritz pcmoritz deleted the kk-self-play branch September 4, 2025 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests rllib RLlib related issues testing topics about testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.