Description
The observation manager has a couple of issues when `history_length != None` is provided. In particular:
1. `env.observation_shape` is computed incorrectly for these groups when `flatten_history_dim=False` (see below for a concrete example).
2. When `flatten_history_dim=True`, the observation is not returned in a way that makes it easy to reshape and recover the common history dimension.
While 2 may be a feature request (I thought this was the expected behavior, but realized in hindsight that it is unclear), 1 is certainly a bug in how observations are currently handled.
Steps to Reproduce
Create an observation group with multiple history-enabled terms. For example:

```python
@configclass
class ObservationManagerCfg:
    @configclass
    class PolicyCfg(ObservationGroupCfg):
        history_length = H
        flatten_history_dim = False
        term1 = ObservationTermCfg(func=f1)  # f1 returns shape (D1,)
        term2 = ObservationTermCfg(func=f2)  # f2 returns shape (D2,)

    policy = PolicyCfg()
```
Expected: Upon instantiating the env, `env.observation_space["policy"].shape` is `(num_envs, H, D1 + D2)`.
Actual: Upon instantiating the env, `env.observation_space["policy"].shape` is `(num_envs, 2H, D1 + D2)`.
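The expected shape computation can be sketched in plain PyTorch (hypothetical shapes, not Isaac Lab code): each term carries a history buffer of shape `(num_envs, H, D_i)`, and terms within a group are concatenated along the feature dimension, so the group shape should be `(num_envs, H, D1 + D2)` rather than `(num_envs, 2H, D1 + D2)`.

```python
import torch

# Hypothetical sizes for illustration only.
num_envs, H, D1, D2 = 4, 3, 5, 7

# Per-term history buffers: (num_envs, H, D_i).
term1 = torch.zeros(num_envs, H, D1)  # history of f1 outputs
term2 = torch.zeros(num_envs, H, D2)  # history of f2 outputs

# Concatenating along the feature dim keeps the shared history dim intact.
policy_obs = torch.cat([term1, term2], dim=-1)
assert policy_obs.shape == (num_envs, H, D1 + D2)
```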
Create an observation group with `flatten_history_dim=True` (e.g., change this parameter in the example above).
When reading an observation from the observation manager,
Expected: `policy_obs.reshape(-1, H, D1 + D2) == torch.concatenate([term1, term2], dim=-1)`. Note this is loose syntax, but ideally we could reconstruct the observations with a coherent history dimension.
Actual: `policy_obs = torch.concatenate([term1.reshape(num_envs, -1), term2.reshape(num_envs, -1)], dim=-1)`. This layout does not preserve a history dimension under reshape.
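A short sketch of the layout problem (assumed behavior based on the report above, not the actual manager implementation): flattening each term's history before concatenation puts all of term1's history steps in one block and all of term2's in another, so no single reshape recovers a coherent `(num_envs, H, D1 + D2)` view. Concatenating per history step first, then flattening, makes the reshape work.

```python
import torch

# Hypothetical sizes for illustration only.
num_envs, H, D1, D2 = 2, 3, 4, 5
term1 = torch.arange(num_envs * H * D1, dtype=torch.float32).reshape(num_envs, H, D1)
term2 = -torch.arange(num_envs * H * D2, dtype=torch.float32).reshape(num_envs, H, D2)

# Reported current behavior: flatten each term's history, then concatenate.
current = torch.cat([term1.reshape(num_envs, -1), term2.reshape(num_envs, -1)], dim=-1)

# Desired behavior: concatenate per history step, then flatten, so a single
# reshape recovers (num_envs, H, D1 + D2).
desired = torch.cat([term1, term2], dim=-1).reshape(num_envs, -1)

recovered = desired.reshape(num_envs, H, D1 + D2)
assert torch.equal(recovered[..., :D1], term1)
assert torch.equal(recovered[..., D1:], term2)

# The same reshape on the current layout mixes history steps of both terms.
mixed = current.reshape(num_envs, H, D1 + D2)
assert not torch.equal(mixed[..., :D1], term1)
```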
System Info
Describe the characteristics of your environment:
Additional context
Will open a PR with suggested changes to fix these issues.
Checklist
Acceptance Criteria
`env.observation_space`