The reference generators seem to always start at zero (cf. the Wiener process initialization). If we roll out multiple short episodes, this is likely to bias the sampled references towards zero, so the performance of learning-based controllers may deteriorate for states/references far away from the origin. Hence, it would be nice to allow a configurable initial value (e.g., set manually by the user or drawn automatically from a uniform distribution).
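A minimal sketch of what this could look like (class and parameter names are hypothetical, not the project's actual API): a Wiener-process reference generator whose initial value is either fixed by the user or, if left unset, drawn uniformly from the admissible reference band on every reset.

```python
import numpy as np

class WienerReferenceGenerator:
    """Sketch: Wiener-process reference with a configurable initial value."""

    def __init__(self, sigma=0.05, limits=(-1.0, 1.0), initial_value=None, rng=None):
        self._sigma = sigma
        self._low, self._high = limits
        self._initial_value = initial_value  # None -> sample uniformly on reset
        self._rng = rng or np.random.default_rng()
        self._value = 0.0

    def reset(self):
        if self._initial_value is None:
            # Uniform initialization removes the bias towards zero references.
            self._value = self._rng.uniform(self._low, self._high)
        else:
            self._value = float(self._initial_value)
        return self._value

    def step(self):
        # Standard Wiener increment, clipped to the admissible reference band.
        self._value = float(np.clip(self._value + self._rng.normal(0.0, self._sigma),
                                    self._low, self._high))
        return self._value
```

With short episodes, the uniform reset means every region of the reference band is visited with equal probability regardless of episode length.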
However, even if this functionality is added, we do not have any reference generator at hand that ensures the sampled references follow a (multivariate) uniform distribution (e.g., uniform coverage of the left id-iq half-plane within the current limit of a PMSM). From my perspective, this would be the most natural choice to ensure that a controller learns an optimal policy over the relevant reference space in a bias-free fashion.
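For the PMSM example, area-uniform sampling over the left id-iq half-disc can be done in closed form with a polar transform (the square root on the radius is what makes the samples uniform in area, not just in radius). The function name and signature below are assumptions for illustration:

```python
import numpy as np

def sample_idq_uniform(i_lim, size=1, rng=None):
    """Draw (i_d, i_q) pairs uniformly over the half-disc
    i_d <= 0, i_d**2 + i_q**2 <= i_lim**2 (motoring region of a PMSM)."""
    rng = rng or np.random.default_rng()
    # sqrt of a uniform variate yields an area-uniform radius on the disc.
    r = i_lim * np.sqrt(rng.uniform(0.0, 1.0, size))
    # Angles in [pi/2, 3*pi/2] cover exactly the half-plane i_d <= 0.
    theta = rng.uniform(np.pi / 2, 3 * np.pi / 2, size)
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)
```

An equivalent alternative is rejection sampling from the bounding rectangle, which generalizes more easily to irregular admissible regions (e.g., voltage-limit ellipses) at the cost of a variable number of draws.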
This issue is also loosely linked to issue #137: the argumentation above does not address the transient behavior of reference generators, i.e., how fast reference values can change over time. Hence, transient reference behavior is another dimension of the problem. For example, if we used an Ornstein-Uhlenbeck process for PMSM current control, its stiffness and the id/iq means would span a three-dimensional parameter space over which we would also like a multivariate uniform distribution, such that fast and slow reference changes are equally likely throughout the entire reference space.
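The idea of randomizing the transient behavior can be sketched as follows: an Ornstein-Uhlenbeck reference whose stiffness theta, mean mu, and initial value are each drawn uniformly at reset, so that slow and fast reference dynamics are equally represented across episodes. All class names and parameter ranges here are assumptions, not part of any existing generator:

```python
import numpy as np

class OUReferenceGenerator:
    """Sketch: OU-process reference with uniformly randomized parameters."""

    def __init__(self, theta_range=(0.1, 10.0), mu_range=(-1.0, 1.0),
                 sigma=0.2, dt=1e-3, rng=None):
        self._theta_range = theta_range  # stiffness bounds (assumed values)
        self._mu_range = mu_range        # reference-mean bounds
        self._sigma = sigma
        self._dt = dt
        self._rng = rng or np.random.default_rng()
        self.reset()

    def reset(self):
        # Uniform draws over (theta, mu, x0) target a multivariate uniform
        # distribution over both reference values and reference dynamics.
        self._theta = self._rng.uniform(*self._theta_range)
        self._mu = self._rng.uniform(*self._mu_range)
        self._x = self._rng.uniform(*self._mu_range)
        return self._x

    def step(self):
        # Euler-Maruyama step of dx = theta * (mu - x) dt + sigma dW.
        self._x += (self._theta * (self._mu - self._x) * self._dt
                    + self._sigma * np.sqrt(self._dt) * self._rng.normal())
        return self._x
```

For the two-current PMSM case, mu would become the (id, iq) pair sampled from the admissible half-disc, giving exactly the three-dimensional (stiffness, id-mean, iq-mean) space described above.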