DRL - Multi-Agent DDPG Algorithm - Tennis Collaboration

Introduction

In this environment, two agents control rackets to bounce a ball over a net. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. Thus, the goal of each agent is to keep the ball in play.

The observation space consists of 24 continuous variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation.

Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.

Using the Unity agent/environment "Tennis", this deep reinforcement learning task trains two AI agents to play tennis with each other in a cooperative way. The agents are rewarded for keeping the ball in play as long as possible. The task is considered solved when the average of the higher of the two agents' scores per episode reaches +0.5 over 100 consecutive episodes.

The agents receive feedback in the form of a reward after taking each action. Each agent decides whether to move its racket toward or away from the net, at what velocity, and whether to jump. A +0.1 reward is given when an agent hits the ball over the net, and a -0.01 penalty when it misses the ball or hits it out of bounds. The environment provides each agent with the position and velocity of both rackets and the ball, but each agent only sees its own reward, not the reward of the other agent.
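
For context, a single episode of interaction with the Tennis environment looks roughly like the sketch below. It assumes the Udacity unityagents package and a local build of the environment; the file name Tennis.app and the random placeholder policy are assumptions, not the trained agents.

    import numpy as np
    from unityagents import UnityEnvironment

    # The path to the downloaded Tennis build is an assumption (macOS example).
    env = UnityEnvironment(file_name="Tennis.app")
    brain_name = env.brain_names[0]

    env_info = env.reset(train_mode=False)[brain_name]
    num_agents = len(env_info.agents)                               # 2 agents
    state_size = env_info.vector_observations.shape[1]              # 24 variables per agent
    action_size = env.brains[brain_name].vector_action_space_size   # 2 continuous actions

    scores = np.zeros(num_agents)
    while True:
        # Random actions in [-1, 1] stand in for the trained actors here.
        actions = np.clip(np.random.randn(num_agents, action_size), -1, 1)
        env_info = env.step(actions)[brain_name]
        scores += env_info.rewards          # +0.1 for a hit over the net, -0.01 for a miss
        if np.any(env_info.local_done):
            break

    # The per-episode score used for the solved criterion is the higher of the two agents' scores.
    episode_score = np.max(scores)
    env.close()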

MADDPG Algorithm

As seen in the code, the MADDPG algorithm is used to train the two agents. MADDPG is a multi-agent variant of DDPG: a model-free, off-policy, policy-gradient algorithm (off-policy means an experience replay buffer can be used) in which each agent maintains two separate deep neural networks, an actor and a critic. In the standard MADDPG recipe, each actor acts on its own local observation, while each critic is trained on the observations and actions of both agents (centralized training, decentralized execution).
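
As a rough illustration of this actor-critic split, the sketch below shows one possible pair of networks. The layer sizes and the exact critic input are assumptions based on the standard MADDPG formulation, not necessarily the architecture used in this repository.

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        """Maps one agent's local 24-value observation to 2 continuous actions in [-1, 1]."""
        def __init__(self, state_size=24, action_size=2, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_size, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, action_size), nn.Tanh(),
            )

        def forward(self, state):
            return self.net(state)

    class CentralizedCritic(nn.Module):
        """Scores a joint state-action pair: it sees both agents' observations and actions."""
        def __init__(self, state_size=24, action_size=2, num_agents=2, hidden=128):
            super().__init__()
            joint_size = num_agents * (state_size + action_size)   # 2 * (24 + 2) = 52 inputs
            self.net = nn.Sequential(
                nn.Linear(joint_size, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),                               # a single Q-value
            )

        def forward(self, joint_states, joint_actions):
            return self.net(torch.cat([joint_states, joint_actions], dim=-1))

In the usual MADDPG setup, each agent also keeps target copies of both networks, and a replay buffer of joint experience tuples drives the off-policy updates.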

The attached code, written in Python using PyTorch and presented in a Jupyter Notebook, demonstrates how the agents learn and eventually reach the target average score of 0.5 in 384 episodes. The attached Report describes the algorithm and methodology in detail, including the introduction of Exploratory Boost.
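
The solved check itself is just a rolling 100-episode window over those per-episode max scores, roughly as in the sketch below (variable names are illustrative):

    from collections import deque

    import numpy as np

    recent_scores = deque(maxlen=100)   # per-episode max score over the two agents

    def is_solved():
        """True once the average over the last 100 episodes reaches +0.5."""
        return len(recent_scores) == 100 and np.mean(recent_scores) >= 0.5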

Setup Instructions

To reproduce this model on a Mac:

  1. Install the Anaconda distribution of Python 3

  2. Install PyTorch, Jupyter Notebook, and NumPy in the Python 3 environment.

  3. To install the required components, run the following command:

pip install -r requirements.txt
  4. Clone the Udacity DRLND repo and install the dependencies by typing the following:

    git clone https://github.com/udacity/deep-reinforcement-learning.git

    cd deep-reinforcement-learning/python

    pip install .

  5. Download the environment from one of the links below. You need only select the environment that matches your operating system:

  • Linux: click here

  • Mac OSX: click here

  • Windows (32-bit): click here

  • Windows (64-bit): click here

    (For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

  6. Open Jupyter Notebook and run the Multiagent.ipynb file to train the agents.

  7. To watch the agents I trained play tennis, execute the following command:

    python3 Testing.py
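
Watching trained agents amounts to loading the saved actor weights and stepping the environment without training. The sketch below shows the general idea only; the model module, Actor class, and checkpoint file names are hypothetical placeholders, so check Testing.py for the actual details.

    import numpy as np
    import torch
    from unityagents import UnityEnvironment

    from model import Actor   # hypothetical module holding the actor network definition

    env = UnityEnvironment(file_name="Tennis.app")   # path to your downloaded build
    brain_name = env.brain_names[0]
    env_info = env.reset(train_mode=False)[brain_name]

    # Load one trained actor per agent (checkpoint names are placeholders).
    actors = [Actor(), Actor()]
    for i, actor in enumerate(actors):
        actor.load_state_dict(torch.load(f"checkpoint_actor_{i}.pth"))
        actor.eval()

    states = env_info.vector_observations
    while True:
        with torch.no_grad():
            actions = np.vstack([
                actor(torch.from_numpy(state).float()).numpy()
                for actor, state in zip(actors, states)
            ])
        env_info = env.step(np.clip(actions, -1, 1))[brain_name]
        states = env_info.vector_observations
        if np.any(env_info.local_done):
            break

    env.close()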
    
