
generalised-rl-environment

The Env class

The main feature is the Env class, which removes the hassle of building custom environments from scratch and instead asks: what is unique about your RL project?

The following snippet, taken from the full code example provided in the repo, shows just the initialisation of the Env object:

import numpy as np

# Block, X, and Y are defined in the full example in the repo.

def take_action(state, action):
    # Move the player block, then shift both relative-position pairs
    # in the observation by the player's displacement.
    oldx, oldy = state['blocks'][0].x, state['blocks'][0].y
    state['blocks'][0].action(action)
    newx, newy = state['blocks'][0].x, state['blocks'][0].y
    state['obs'] += [oldx - newx, oldy - newy, oldx - newx, oldy - newy]

    # Returns (state, reward, done): +10 for reaching the target block,
    # -10 for reaching the enemy block, -1 per step otherwise.
    if np.count_nonzero(state['obs'][0, :2]) == 0:
        return state, 10, True
    elif np.count_nonzero(state['obs'][0, 2:]) == 0:
        return state, -10, True
    else:
        return state, -1, False

def get_observation(state):
    # The observation is the (f - p, e - p) relative-position vector.
    return state['obs']

def get_start_state():
    # Place the player (p), enemy (e), and target (f) on distinct cells.
    p, e, f = (Block(X, Y), Block(X, Y), Block(X, Y))
    while (e.x, e.y) == (f.x, f.y):
        e = Block(X, Y)
    while (p.x, p.y) == (f.x, f.y) or (p.x, p.y) == (e.x, e.y):
        p = Block(X, Y)
    return {'blocks': (p, e, f),
            'obs': np.asarray([(*(f - p), *(e - p))], dtype=float)}

def display_env(state):
    # Render the state as a colour image: one coloured pixel per block,
    # with the player drawn last so it stays visible on overlap.
    env = np.zeros((X, Y, 3), dtype=np.uint8)
    env[state['blocks'][1].y, state['blocks'][1].x] = (0, 0, 255)
    env[state['blocks'][2].y, state['blocks'][2].x] = (0, 255, 0)
    env[state['blocks'][0].y, state['blocks'][0].x] = (255, 0, 0)
    return env

env = Env(
    take_action=take_action,
    get_action=None,  # provided by the Agent class
    get_start_state=get_start_state,
    display_env=display_env,
    get_observation=get_observation,
    steps_per_ep=30,  # maximum steps per episode
    stat_every=10,    # how often (in episodes) to report stats
    printing=False
)

Now, calling env.run() runs the usual loop of episodes and steps, calling the methods you provided at the relevant points.
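
Under the hood, run() essentially drives a loop like the following. This is a simplified sketch using only the hooks defined above, not the actual implementation; the real method also handles stats tracking and optional rendering, and n_episodes here is a hypothetical placeholder:

n_episodes = 100  # hypothetical; how episodes are driven is up to the Env class

for episode in range(n_episodes):
    state = get_start_state()
    for step in range(30):  # steps_per_ep
        action = get_action(get_observation(state))  # e.g. an Agent's get_action
        state, reward, done = take_action(state, action)
        if done:
            break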

The basic philosophy of this class is to provide hooks only for the moments that vary from project to project, such as taking an action or getting the starting state. Each of these methods is passed the appropriate parameters, so it has enough information to draw on.

To get details on the Env constructor, you can call

Env.help()

The flexibility of the Env class means that it can be used in pretty much any RL scenario, be it DQN, neuroevolution, or something completely new!

Block and Snake classes

Block and Snake are two starter classes for simple RL scenarios on a 2D grid, so you can use them out of the box without having to reinvent the wheel. Snake also has a get_observation method implemented, so it pairs nicely with the Env class.
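
For example, judging purely from how Block is used in the snippet above (its exact API lives in the repo), a minimal interaction might look like this:

a = Block(10, 10)  # places a block on a 10x10 grid (randomly, per the snippet)
b = Block(10, 10)
print(a.x, a.y)    # the block's grid coordinates
print(*(b - a))    # relative (dx, dy) offset, as used in get_start_state above
a.action(0)        # move the block; the action encoding is defined by Block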

Agent, Population, and PopulationTakeTop classes

These classes are specific to neuroevolution and are closely connected both to each other and to the Env class.

As seen in the code above, Agent has a get_action method (which just calls Keras' predict() on the observation), and Population expects an array of Agent objects.
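
Putting that together, wiring up a neuroevolution run might look roughly like this. This is a hypothetical sketch; the exact constructor arguments are shown in the full example in the repo:

agents = [Agent(...) for _ in range(20)]  # each Agent wraps a Keras model;
                                          # constructor args elided here
population = Population(agents)           # Population expects an array of Agents
# an Agent's get_action(observation) just calls its Keras model's predict()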

The full code example included in this repo makes use of these classes, so check it out to see them in action.
