feat: support interface of environment customization (#310)
Showing 37 changed files with 5,253 additions and 114 deletions.
@@ -8,3 +8,4 @@ sphinx-autoapi
sphinx-autobuild
sphinx-autodoc-typehints
furo
sphinxcontrib-spelling
@@ -0,0 +1,22 @@
OmniSafe Customization Interface of Environments
================================================

.. currentmodule:: omnisafe.envs.custom_env

.. autosummary::

    CustomEnv

CustomEnv
---------

.. card::
    :class-header: sd-bg-success sd-text-white
    :class-card: sd-outline-success sd-rounded-1

    Documentation
    ^^^

    .. autoclass:: CustomEnv
        :members:
        :private-members:
@@ -485,3 +485,4 @@ UpdateActorCritic
UpdateDynamics
mathbb
meger
Jupyter
@@ -0,0 +1,73 @@
Environments Customization
===========================

OmniSafe provides a flexible interface for environment customization. Starting from the minimal
template that OmniSafe supplies, users only need to adapt a few interface methods to integrate
their own environment.

.. note::
    The key benefit of OmniSafe's environment customization is that **users only need to modify code at the environment layer** to use OmniSafe's complete set of training, saving, and data logging mechanisms. Users who install from PyPI can therefore customize environments easily and focus only on the dynamics of the environment.


Get Started with the Simplest Template
--------------------------------------

OmniSafe offers a minimal environment template as an example of a customized
environment; see :doc:`../envs/custom`.
We recommend reading this template in detail and using it as the basis for your own environment.

.. card::
    :class-header: sd-bg-success sd-text-white
    :class-card: sd-outline-success sd-rounded-1

    Frequently Asked Questions
    ^^^
    1. What changes are necessary to embed the environment into OmniSafe?
    2. My environment requires specific parameters; can these be integrated into OmniSafe's parameter mechanism?
    3. I need to log information during training; how can I achieve this?
    4. After embedding the environment, how do I run the algorithms in OmniSafe for training?

To answer the questions above, we provide a complete Jupyter Notebook example (please see the
tutorial on our GitHub page). It demonstrates how to start from a common
`Gymnasium <https://gymnasium.farama.org/>`_ style environment, implement
the customization, and complete the training process.


Customization of Your Environments
-----------------------------------

From Source Code
^^^^^^^^^^^^^^^^

If you installed OmniSafe from the source code, follow the steps below:

.. card::
    :class-header: sd-bg-success sd-text-white
    :class-card: sd-outline-success sd-rounded-1

    Build from Source Code
    ^^^
    1. Create a new file under `omnisafe/envs/`, for example, `omnisafe/envs/my_env.py`.
    2. Customize the environment in `omnisafe/envs/my_env.py`. The following steps assume that the class name is `MyEnv` and the environment name is `MyEnv-v0`.
    3. Add `from .my_env import MyEnv` in `omnisafe/envs/__init__.py`.
    4. Run the following command in the `omnisafe/examples` folder:

    .. code-block:: bash
        :linenos:

        python train_policy.py --algo PPOLag --env MyEnv-v0

From PyPI
^^^^^^^^^

.. card::
    :class-header: sd-bg-success sd-text-white
    :class-card: sd-outline-success sd-rounded-1

    Build from PyPI
    ^^^
    1. Customize the environment in any folder. The following steps assume that the class name is `MyEnv` and the environment name is `MyEnv-v0`.
    2. Import OmniSafe and the environment registration decorator.
    3. Run the training.

For a short but detailed example, please see `examples/train_from_custom_env.py`.
@@ -0,0 +1,90 @@
# Copyright 2024 OmniSafe Team. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Example and template for environment customization."""

from __future__ import annotations

import random
from typing import Any, ClassVar

import torch
from gymnasium import spaces

import omnisafe
from omnisafe.envs.core import CMDP, env_register


# First, define the environment class.
# The most important step is to add the `env_register` decorator.
@env_register
class CustomExampleEnv(CMDP):

    # Define which tasks the environment supports.
    _support_envs: ClassVar[list[str]] = ['Custom-v0']

    # Automatically reset when `terminated` or `truncated`.
    need_auto_reset_wrapper = True
    # Set `truncated=True` when the total steps exceed the time limit.
    need_time_limit_wrapper = True

    def __init__(self, env_id: str, **kwargs: dict[str, Any]) -> None:
        self._count = 0
        self._num_envs = 1
        self._observation_space = spaces.Box(low=-1.0, high=1.0, shape=(3,))
        self._action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,))

    def step(
        self,
        action: torch.Tensor,
    ) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, dict]:
        self._count += 1
        obs = torch.as_tensor(self._observation_space.sample())
        reward = 2 * torch.as_tensor(random.random())  # noqa
        cost = 2 * torch.as_tensor(random.random())  # noqa
        terminated = torch.as_tensor(random.random() > 0.9)  # noqa
        truncated = torch.as_tensor(self._count > self.max_episode_steps)
        return obs, reward, cost, terminated, truncated, {'final_observation': obs}

    @property
    def max_episode_steps(self) -> int:
        """The max steps per episode."""
        return 10

    def reset(
        self,
        seed: int | None = None,
        options: dict[str, Any] | None = None,
    ) -> tuple[torch.Tensor, dict]:
        self.set_seed(seed)
        obs = torch.as_tensor(self._observation_space.sample())
        self._count = 0
        return obs, {}

    def set_seed(self, seed: int) -> None:
        random.seed(seed)

    def close(self) -> None:
        pass

    def render(self) -> Any:
        pass


# Then you can use it like this:
agent = omnisafe.Agent(
    'PPOLag',
    'Custom-v0',
)
agent.learn()
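
To see how the episode bookkeeping in `step` and `reset` behaves, the following stdlib-only sketch mirrors just the counter, the random termination test, and the strict `>` comparison against `max_episode_steps`. It is a simplified illustration under stated assumptions: `ToyEpisode` and `episode_length` are hypothetical names, tensors and observations are left out, and the wrappers enabled by `need_auto_reset_wrapper` and `need_time_limit_wrapper` are not modeled.

```python
import random


class ToyEpisode:
    """Stdlib-only mirror of the counting logic in the example environment."""

    max_episode_steps = 10

    def __init__(self) -> None:
        self._count = 0

    def reset(self, seed=None) -> None:
        # Same order as the example: seed the RNG, then zero the step counter.
        random.seed(seed)
        self._count = 0

    def step(self):
        self._count += 1
        terminated = random.random() > 0.9  # roughly a 10% chance per step
        truncated = self._count > self.max_episode_steps  # strict comparison
        return terminated, truncated


def episode_length(env: ToyEpisode, seed: int) -> int:
    """Run one episode and return the number of steps taken before it ended."""
    env.reset(seed=seed)
    while True:
        terminated, truncated = env.step()
        if terminated or truncated:
            return env._count


lengths = [episode_length(ToyEpisode(), seed) for seed in range(100)]
print('longest episode:', max(lengths))
```

Because the comparison is strict, the `truncated` flag computed this way can first become true on step 11 rather than step 10, so no episode in the sketch runs longer than `max_episode_steps + 1` steps.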