
Gymnasium Search Race


Gymnasium environments for the Search Race CodinGame optimization puzzle and Mad Pod Racing CodinGame bot programming game.

(Demo video: search_race_v2_demo.mp4)

Action Space       Box([-1, 0], [1, 1], float64)
Observation Space  Box(-1, 1, shape=(10,), float64)
Import             gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v3")

Installation

To install gymnasium-search-race with pip, execute:

pip install gymnasium_search_race

From source:

git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .

Environment

Action Space

The action is an ndarray with 2 continuous variables:

  • The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
  • The thrust between 0 and 200, normalized between 0 and 1.
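
Based on the ranges stated above, the mapping from a normalized action back to game units can be sketched as follows (the function name and exact scaling are illustrative assumptions, not the environment's internal API):

```python
def denormalize_action(action):
    """Map a normalized action to game units (illustrative sketch)."""
    angle_norm, thrust_norm = action
    angle = angle_norm * 18.0     # [-1, 1] -> [-18, 18] degrees
    thrust = thrust_norm * 200.0  # [0, 1]  -> [0, 200]
    return angle, thrust

print(denormalize_action((0.5, 0.75)))  # (9.0, 150.0)
```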

Observation Space

The observation is an ndarray of 10 continuous variables:

  • The relative x and y coordinates of the next two checkpoints in the car's frame.
  • The sine and cosine of the relative angle to the next two checkpoints in the car's frame.
  • The longitudinal and lateral speed in the car's frame.

The values are normalized between -1 and 1.
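
Expressing positions in the car's frame means rotating each checkpoint's world-frame offset by the car's heading. A minimal sketch of that transform (the function name is hypothetical; it is not part of the environment's API, and the actual implementation may differ in details such as angle conventions):

```python
import math

def to_car_frame(car_x, car_y, car_angle_deg, cp_x, cp_y):
    """Rotate a checkpoint's world-frame offset into the car's frame.

    rel_x points along the car's heading, rel_y to its side.
    """
    dx, dy = cp_x - car_x, cp_y - car_y
    a = math.radians(car_angle_deg)
    rel_x = dx * math.cos(a) + dy * math.sin(a)
    rel_y = -dx * math.sin(a) + dy * math.cos(a)
    return rel_x, rel_y

# A car at the origin heading at 90 degrees sees a checkpoint at
# (0, 100) directly ahead of it.
print(to_car_frame(0, 0, 90, 0, 100))
```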

Reward

  • +1 when a checkpoint is visited.
  • 0 otherwise.

Starting State

The starting state is generated by choosing a random CodinGame test case.

Episode End

The episode ends if either of the following happens:

  1. Termination: The car visits all checkpoints before time runs out.
  2. Truncation: Episode length is greater than 600.

Arguments

  • laps: number of laps. The default value is 3.
  • car_max_thrust: maximum thrust. The default value is 200.
  • test_id: test case id to generate the checkpoints (see choices here). The default value is None which selects a test case randomly when the reset method is called.
  • sequential_maps: if True, the maps are generated sequentially. The default value is False.

import gymnasium as gym

gym.make(
    "gymnasium_search_race:gymnasium_search_race/SearchRace-v3",
    laps=3,
    car_max_thrust=200,
    test_id=1,
    sequential_maps=False,
)

Version History

  • v3: Update observation with relative positions and angles in car's frame
  • v2: Update observation with relative positions and angles
  • v1: Add boolean to indicate if the next checkpoint is the last checkpoint in observation
  • v0: Initial version

Discrete environment

The SearchRaceDiscrete environment is similar to the SearchRace environment except that the action space is discrete.

import gymnasium as gym

gym.make(
    "gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v3",
    laps=3,
    car_max_thrust=200,
    test_id=1,
    sequential_maps=False,
)

Action Space

There are 74 discrete actions, corresponding to the combinations of the 37 integer angles from -18 to 18 degrees with the two thrust values 0 and 200 (37 × 2 = 74).
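
One way to enumerate those 74 combinations is sketched below. The index ordering here is an assumption for illustration; the environment may enumerate its actions differently:

```python
ANGLES = list(range(-18, 19))  # 37 integer angles in degrees
THRUSTS = [0, 200]

def decode_action(index):
    """Map a discrete action index to an (angle, thrust) pair.

    Hypothetical ordering: thrust varies fastest, angle slowest.
    """
    angle = ANGLES[index // len(THRUSTS)]
    thrust = THRUSTS[index % len(THRUSTS)]
    return angle, thrust

print(len(ANGLES) * len(THRUSTS))  # 74
print(decode_action(0))            # (-18, 0)
print(decode_action(73))           # (18, 200)
```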

Version History

  • v3: Update observation with relative positions and angles in car's frame
  • v2: Update observation with relative positions and angles
  • v1: Add all angles in action space
  • v0: Initial version

Mad Pod Racing

Runner

The MadPodRacing and MadPodRacingDiscrete environments can be used to train a runner for the Mad Pod Racing CodinGame bot programming game. They are similar to the SearchRace and SearchRaceDiscrete environments, with the following differences:

  • The maps are generated the same way CodinGame generates them.
  • The car position is rounded, not truncated.

import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacing-v2")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v2")
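
The rounding difference mentioned above amounts to the distinction between `round` and `math.trunc` on a coordinate (a small sketch, not the environment's actual code):

```python
import math

x = 1234.6

# Mad Pod Racing: round to the nearest integer.
print(round(x))       # 1235

# Search Race: truncate toward zero.
print(math.trunc(x))  # 1234
```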

(Demo video: mad_pod_racing_v1_demo.mp4)

Blocker

The MadPodRacingBlocker and MadPodRacingBlockerDiscrete environments can be used to train a blocker for the Mad Pod Racing CodinGame bot programming game.

import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v2")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v2")

(Demo video: mad_pod_racing_blocker_v1_demo.mp4)

Arguments

  • opponent_path: path to the opponent PPO model. The default value is None which means there is no opponent.
  • boost_on_first_move: if True, the car is boosted on the first move. The default value is False.
  • boost_opponent_on_first_move: if True, the opponent is boosted on the first move. The default value is False.

Version History

  • v2: Update observation with relative positions and angles in car's frame and add boost options
  • v1: Update observation with relative positions and angles and update maximum thrust
  • v0: Initial version

Usage

You can use RL Baselines3 Zoo to train and evaluate agents:

pip install rl_zoo3

Train an Agent

The hyperparameters are defined in hyperparams/ppo.yml.

To train a PPO agent for the Search Race game, execute:

python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/SearchRaceDiscrete-v3 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 50 \
  --gym-packages gymnasium_search_race \
  --env-kwargs "laps:1000" "sequential_maps:True" \
  --conf-file hyperparams/ppo.yml \
  --progress

Important

The agent is evaluated once per test case with --eval-episodes 50 and --env-kwargs "sequential_maps:True" (there are 50 different test cases).

For the Mad Pod Racing game, you can add an opponent with the opponent_path argument:

python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/MadPodRacingBlockerDiscrete-v2 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 52 \
  --gym-packages gymnasium_search_race \
  --env-kwargs \
  "opponent_path:'rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip'" \
  "laps:1000" \
  "sequential_maps:True" \
  "boost_opponent_on_first_move:True" \
  --conf-file hyperparams/ppo.yml \
  --progress

Important

The agent is evaluated four times per test case with --eval-episodes 52 and --env-kwargs "sequential_maps:True" (there are 13 different test cases).

Enjoy a Trained Agent

To see a trained agent in action on random test cases, execute:

python -m rl_zoo3.enjoy \
  --algo ppo \
  --env gymnasium_search_race/SearchRaceDiscrete-v3 \
  --n-timesteps 1000 \
  --deterministic \
  --gym-packages gymnasium_search_race \
  --load-best \
  --progress

Run Test Cases

To run test cases with a trained agent, execute:

python -m scripts.run_test_cases \
  --path rl-trained-agents/ppo/gymnasium_search_race-SearchRaceDiscrete-v3_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v3 \
  --record-video \
  --record-metrics

Record a Video of a Trained Agent

To record a video of a trained agent on Mad Pod Racing, execute:

python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v2

For Mad Pod Racing Blocker, execute:

python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingBlockerDiscrete-v2_1/best_model.zip \
  --opponent-path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v2

Tests

To run tests, execute:

pytest

Citing

To cite the repository in publications:

@misc{gymnasium-search-race,
  author = {Quentin Deschamps},
  title = {Gymnasium Search Race},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}

Author

Quentin Deschamps
