Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
cartpole.json	cartpole.json
cartpole.png	cartpole.png
cartpolepp.png	cartpolepp.png
phase1.png	phase1.png

CartPole Domain

Task Description
Feature Vector Format
Action Label
Performance
Novelty Indicator
Novelty Characterization
Sample (Mock) Novelty
Phase 1 Revealed Novelty
Frequently Asked Questions

Task Description

CartPole World

For the Phase 2 novelty-level-0 CartPole task, the agent must keep the pole balanced by pushing the cart forward, backward, left or right. The cart is constrained to move in a 2D plane. The pole is affixed to the cart and can move around that fixed point in any direction. There are other objects in the environment that move in 3D and may collide with cart, pole, walls, or each other. The agent receives sensor data about the time, the position and velocity of the cart, the angles and angular velocity of the pole, and the position and velocity of the objects. The agent also receives feedback about their current performance and an optional image. The cart, pole and objects have different random starting states each episode. An episode ends if the cart moves too far from center, the absolute pole angle is too large, or a time limit is exceeded. The time limit is 200 ticks of the game, where each tick corresponds to 0.02 seconds. The agent's final score is T/200, where T the number of ticks they keep the pole balanced.

The graphic below provides more information about the geometry used in CartPole.

CartPole Geometry

See the cartpole.json file for a precise specification of the domain, including ranges on sensor values.

Feature Vector Format

The feature vector provides a value for each sensor and is sent in JSON format. For example,

{
    "cart": {
        "x_position": -0.885514,
        "y_position": -1.605059,
        "z_position": 0.1,
        "x_velocity": -11.970514,
        "y_velocity": 1.005748,
        "z_velocity": 0.0
    },
    "pole": {
        "x_quaternion": 0.064607,
        "y_quaternion": -0.065981,
        "z_quaternion": 0.67565,
        "w_quaternion": 0.731416,
        "x_velocity": 2.085099,
        "y_velocity": -0.892353,
        "z_velocity": 11.80254
    },
    "blocks": [
        {
            "id": 2,
            "x_position": -1.86226,
            "y_position": 2.550503,
            "z_position": 6.339546,
            "x_velocity": 7.707002,
            "y_velocity": -8.587865,
            "z_velocity": 6.575934
        },
        {
            "id": 3,
            "x_position": -1.275671,
            "y_position": 1.31081,
            "z_position": 2.180347,
            "x_velocity": 9.403932,
            "y_velocity": -6.88735,
            "z_velocity": 9.195959
        }
    ],

    "time_stamp": 1627668897.4486113,
    "image": null
}

The first call will have additional feature vector defining the corners of the cube world.

    "walls": [
        [-5,-5, 0],
        [5,-5, 0],
        [5,5, 0],
        [-5,5, 0],
        [-5,-5, 10],
        [5,-5, 10],
        [5,5, 10],
        [-5,5, 10],
    ]

Action Label

Each turn, the agent provides an action to be performed, which is one of ['nothing', 'left', 'right', 'forward', 'backward']. The returned action is referred to as the "label", which is an artifact of other domains in which the task is classification.

Performance

The current performance of the agent on the current episode is provided as feedback after each agent response and is sent in JSON format. For CartPole, performance is defined as the amount of time the pole is kept balanced divided by the maximum time for the episode. The performance at the end of the episode is recorded as the performance for that entire episode.

Novelty Indicator

After each sensor fecture vector, the novelty generator sends a novelty indicator, which indicates if the current episode is novel "true", not novel "false" (i.e., novelty level 0), or unknown "null". The novelty indicator will be the same for every turn during an episode.

Novelty Characterization

At the end of each episode, the agent provides a novelty characterization for the episode, which includes a probability of novelty, probability threshold, novelty level, and a characterization string.

Sample (Mock) Novelty

The CartPole novelty generator includes sample Phase 2 novelties for levels 1-5, also called Mock novelties. These are described below.

Level 1: The mass of the cart changes.
Level 2: The objects are immovable.
Level 3: The objects initially move toward some random point.
Level 4: The objects change size.
Level 5: The objects are attracted to each other.

The implementations of these mock novelties can be found in the folder WSU-Portable-Generator/source/partial_env_generator/envs/cartpolepp.

Phase 1 Revealed Novelty

All:
- Action space limited to two actions ['left', 'right', 'nothing']. Other actions get mapped to 'nothing'.
- Most sensors are set to a fixed value since the information is no longer available.
Level 1 (Class): Increase length of pole
- No Novelty: length=0.5
- Easy: length=2
- Medium: length=4
- Hard: length=7
Level 2 (Attribute): Increase friction between cart and track
- No Novelty: friction=0
- Easy: friction=0.0125
- Medium: friction varies randomly from 0.0125 to 0.1 based on current zone (x pos)
- Hard: friction=0.0125, but works in reverse, effectively increasing push force
Level 3 (Representation): Decrease sensor precision
- No Novelty: Maximum floating point precision
- Easy: Each sensor value mapped to 1 of 10 buckets over sensor's range
- Medium: Each sensor value mapped to 1 of 6 buckets over sensor's range
- Hard: Each sensor value mapped to 1 of 3 buckets over sensor's range

Frequently Asked Questions

Coming soon...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cartpole

cartpole

README.md

CartPole Domain

Table of Contents

Task Description

Feature Vector Format

Action Label

Performance

Novelty Indicator

Novelty Characterization

Sample (Mock) Novelty

Phase 1 Revealed Novelty

Frequently Asked Questions

Files

cartpole

Directory actions

More options

Directory actions

More options

Latest commit

History

cartpole

Folders and files

parent directory

README.md

CartPole Domain

Table of Contents

Task Description

Feature Vector Format

Action Label

Performance

Novelty Indicator

Novelty Characterization

Sample (Mock) Novelty

Phase 1 Revealed Novelty

Frequently Asked Questions