
Adds Anymal-C navigation rsl_rl example with obstacle and camera in the scene #2027

Open · wants to merge 12 commits into main from fea/navigation-obstacle-camera
Conversation


@KingsleyLiu-NV commented Mar 6, 2025

Description


This PR adds the Anymal-C navigation rsl_rl example with an obstacle and a camera in the scene:

  • Extends the rsl_rl libraries to support convolutional networks and a PPO runner
  • Extends the Anymal-C navigation environment with a rigid-object obstacle and a tiled camera sensor
    • Adds an event reset function that samples positions within a sector of a ring area
    • Adds an observation function for flattened, normalized RGB data
    • Adds a pose3 command resample function that determines the target based on the obstacle position
  • Tunes the RL reward terms so that training converges
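The ring-sector reset sampling mentioned above can be sketched as follows. This is a minimal illustration, not the PR's actual implementation; the function name, argument names, and ranges are all assumptions.

```python
import numpy as np

def sample_ring_sector(num, r_min, r_max, angle_min, angle_max, rng=None):
    """Uniformly sample (x, y) positions in an annular sector around the origin.

    Sketch of the reset-event idea (assumed interface): radii are drawn with a
    square-root transform so samples are uniform over the sector's area, not
    biased toward the inner radius.
    """
    rng = np.random.default_rng() if rng is None else rng
    # sqrt of a uniform draw over [r_min^2, r_max^2] gives area-uniform radii
    r = np.sqrt(rng.uniform(r_min**2, r_max**2, size=num))
    theta = rng.uniform(angle_min, angle_max, size=num)
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)
```

A reset event would then add these offsets to the environment origins to place the obstacle.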

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

It can be run with the following commands:

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Navigation-Flat-Obstacle-Anymal-C-v0 --headless --num_envs 4096 --max_iterations 1501 --enable_cameras

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Navigation-Flat-Obstacle-Anymal-C-Play-v0 --headless --num_envs 16 --video

Screenshots

The training results are shown below:
[image: training results]

Memory footprint (num_envs=4096, image_input_shape=(3, 64, 64)):
[image: memory footprint]

Training speed:
[image: training speed]

Play with model_1300.pt:

e68abcd97722111e219462e88599bcba.mp4

Checklist

  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
    • isaaclab changelog & extension.toml
    • isaaclab_rl changelog & extension.toml
    • isaaclab_tasks changelog & extension.toml
    • environments.rst
    • ??? More
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@KingsleyLiu-NV force-pushed the fea/navigation-obstacle-camera branch from ac9d8b0 to 6f78579 on March 6, 2025 09:37
base_lin_vel = ObsTerm(func=mdp.base_lin_vel)
projected_gravity = ObsTerm(func=mdp.projected_gravity)
pose_command = ObsTerm(func=mdp.generated_commands, params={"command_name": "pose_command"})
camera_data = ObsTerm(func=mdp.image_flatten, params={"sensor_cfg": SceneEntityCfg("tiled_camera")})
@Toni-SM (Contributor) commented Mar 6, 2025

This implementation flattens and concatenates all the observation components (proprio and image) into one vector (a flattened observation_space), leaving the definition of the image shape used in the model to the agent configuration image_input_shape, right?

Agent configuration must be agnostic to any environment information except that specified by the environment spaces (observation_space, state_space and action_space). Currently, this is only possible with the direct workflow using composite spaces.

@KingsleyLiu-NV (Author) replied:

Yes, that understanding is correct. I use a check to ensure that image_input_shape is consistent with the tiled camera configuration: train.py#L132-L140
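The consistency check referenced here might look like the following sketch; the function and parameter names are assumptions for illustration, not the actual code in train.py.

```python
def check_image_input_shape(agent_image_shape, camera_channels, camera_height, camera_width):
    """Raise if the agent's configured image shape disagrees with the camera sensor.

    Hypothetical helper (names assumed): compares the agent configuration's
    image_input_shape against the (C, H, W) output of the tiled camera.
    """
    expected = (camera_channels, camera_height, camera_width)
    if tuple(agent_image_shape) != expected:
        raise ValueError(
            f"Agent image_input_shape {tuple(agent_image_shape)} does not match "
            f"camera output shape {expected}."
        )
```

Failing fast here avoids a silent mismatch between the flattened observation vector and the shape the conv network un-flattens it into.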

from rsl_rl.modules.actor_critic import get_activation


class ResidualBlock(nn.Module):
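The quoted snippet shows only the class header. A minimal residual block of the kind typically used in such conv encoders might look like the sketch below; this is an assumed structure, not the PR's actual code.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block sketch: two 3x3 convolutions with a skip connection.

    Channel count and activation choice are assumptions; padding=1 keeps the
    spatial dimensions unchanged so the skip connection can be a plain addition.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.activation = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.activation(self.conv1(x))
        out = self.conv2(out)
        # skip connection: input and output shapes match by construction
        return self.activation(out + x)
```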
A Contributor commented:
Since this is RSL-RL library related, I would recommend making an MR there for this actor critic structure :)

@KingsleyLiu-NV (Author) commented Mar 7, 2025
Yes, I can do that. Then I suppose this PR can only be merged after RSL-RL accepts my changes.

@Mayankm96 (Contributor) left a comment

A lot of the modifications in this MR should be handled by RSL-RL library and not IsaacLab. It would be good to keep the two separated to keep IsaacLab mainly as simulation infrastructure.

Would it be possible to make an MR to RSL-RL instead? We can review it there and deal with this usecase accordingly. It shouldn't be hard for us to support a dictionary of observations and avoid an ugly hack here to deal with it.

@KingsleyLiu-NV (Author) replied:

A lot of the modifications in this MR should be handled by RSL-RL library and not IsaacLab. It would be good to keep the two separated to keep IsaacLab mainly as simulation infrastructure.

Would it be possible to make an MR to RSL-RL instead? We can review it there and deal with this usecase accordingly. It shouldn't be hard for us to support a dictionary of observations and avoid an ugly hack here to deal with it.

@Mayankm96 thanks for the suggestions. Basically, only actor_critic_conv2d.py and on_policy_runner_conv2d.py should go to the RSL-RL library, and I can file a PR there.

Regarding "support a dictionary of observations and avoid an ugly hack here to deal with it": can you elaborate on what can be improved in this PR after the conv2d-related code goes to the RSL-RL library?


Toni-SM commented Mar 7, 2025

Regarding support a dictionary of observations and avoid an ugly hack here to deal with it, can you elaborate what can be improved in this PR after conv2d-related code goes to RSL-RL library?

As I commented previously, agent configuration must be agnostic to any environment information except that specified by the environment spaces (observation_space, state_space and action_space); currently, this is only possible with the direct workflow using composite spaces (Tuple or Dict).

So, the current task implementation is a hacky, hard-coded solution particular to one RL library; it doesn't follow the convention for expressing highly structured space specifications, and it is not generalizable or reusable for other RL libraries. Since only the Direct workflow currently supports heterogeneous observations, the task implementation should be migrated to that workflow.


Mayankm96 commented Mar 8, 2025

Asking the user here to migrate their environment to the direct workflow isn't really the solution. We need to properly support heterogeneous observation/action spaces in the manager-based environments. This is in the release plan for IsaacLab 2.2. We should probably prioritize it more, as it is doable.

@KingsleyLiu-NV An intermediate solution is to define a separate observation group for the images, similar to how we define the group called 'policy'. If this group is called 'images', you can access it from the RSL-RL side through the dictionary extras["observations"]["images"]. This is how we handle asymmetric actor-critic over there. More information here: https://github.com/leggedrobotics/rsl_rl/blob/main/rsl_rl/runners/on_policy_runner.py#L32-L41
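The suggested separate observation group might be configured as in the sketch below, following the usual manager-based config pattern. The import paths, the mdp functions, and the group split are assumptions for illustration (the ObsTerm definitions echo the snippet quoted earlier in this thread); this fragment is not runnable outside an IsaacLab installation.

```python
# Sketch (assumed names and import paths) of a separate 'images' observation
# group, so image data arrives via extras["observations"]["images"] in RSL-RL.
from isaaclab.managers import ObservationGroupCfg as ObsGroup
from isaaclab.managers import ObservationTermCfg as ObsTerm
from isaaclab.managers import SceneEntityCfg
from isaaclab.utils import configclass

@configclass
class ObservationsCfg:
    @configclass
    class PolicyCfg(ObsGroup):
        # proprioceptive terms stay in the standard 'policy' group
        base_lin_vel = ObsTerm(func=mdp.base_lin_vel)
        projected_gravity = ObsTerm(func=mdp.projected_gravity)
        pose_command = ObsTerm(func=mdp.generated_commands, params={"command_name": "pose_command"})

    @configclass
    class ImagesCfg(ObsGroup):
        # image terms move into their own group
        camera_data = ObsTerm(func=mdp.image_flatten, params={"sensor_cfg": SceneEntityCfg("tiled_camera")})

    policy: PolicyCfg = PolicyCfg()
    images: ImagesCfg = ImagesCfg()
```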

@kellyguo11 for visibility.


Toni-SM commented Mar 8, 2025

This is in the release plan for 2.2 of IsaacLab. We should probably prioritize it more as it is something doable.

For sure, adding the ability to work with different space specifications to the manager-based workflow is something to be prioritized.

However, while that happens, we should point users to the appropriate workflows for creating well-formed tasks according to the agreed interface (which allows the project to be used with any RL library), and stop proposing or encouraging hard-coded solutions (which, although not as "ugly" as the previous one, are still hacky) tied to specific implementations; such solutions do not promote code reuse and compromise the robustness, maintainability, and flexibility of the project's codebase.

Therefore, for the time being, the way forward should be the Direct workflow with a composite observation_space.


KingsleyLiu-NV commented Mar 10, 2025

Hi @Mayankm96, I get your point about using extras["observations"]. However, I think this still requires a new branch in the exporter: https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_rl/isaaclab_rl/rsl_rl/exporter.py#L123-L151. If this kind of modification is acceptable, I will make the changes accordingly.
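One way such an exporter branch could handle dict observations is to wrap the policy so the export sees a fixed flattening order. This is a hypothetical sketch, not the actual exporter code; the class name, key handling, and policy interface are all assumptions.

```python
import torch
import torch.nn as nn

class DictObsPolicyWrapper(nn.Module):
    """Hypothetical export wrapper: flattens a dict of observation tensors
    into the single vector a flat policy expects, in a fixed key order.
    """
    def __init__(self, policy: nn.Module, obs_keys: list):
        super().__init__()
        self.policy = policy
        self.obs_keys = obs_keys  # fixed ordering baked into the exported graph

    def forward(self, obs: dict) -> torch.Tensor:
        # flatten each component past the batch dim, then concatenate
        flat = torch.cat([obs[k].flatten(start_dim=1) for k in self.obs_keys], dim=1)
        return self.policy(flat)

# usage sketch: a toy linear policy over proprio (13) + rgb (3*64*64) inputs
policy = nn.Linear(13 + 3 * 64 * 64, 4)
wrapper = DictObsPolicyWrapper(policy, ["proprio", "rgb"])
```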
