Reinforcement learning for traffic signal control in hybrid action space

H-PPO

In this repository, we use Hybrid Proximal Policy Optimization¹ (H-PPO) to jointly optimize the signal stage (a discrete action) and its duration (a continuous parameter).
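To make the hybrid action concrete, here is a minimal sketch of sampling a (stage, duration) pair: a categorical distribution picks the stage, and a Gaussian attached to that stage gives the duration. It uses only the Python standard library and is not taken from src\hppo\HPPO.py; the function names, the four-stage layout, and the 5 to 30 second green-time bounds are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return (
        -0.5 * ((x - mean) / std) ** 2
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


def sample_hybrid_action(stage_logits, duration_means, duration_stds, rng):
    """Sample a discrete stage, then a duration from that stage's Gaussian.

    Hypothetical helper: the two log-probabilities are returned separately
    so that each policy head can be trained with its own objective.
    """
    probs = softmax(stage_logits)
    stage = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    mean, std = duration_means[stage], duration_stds[stage]
    duration = rng.gauss(mean, std)
    return {
        "stage": stage,
        "duration": duration,
        "log_prob_stage": math.log(probs[stage]),
        "log_prob_duration": gaussian_log_prob(duration, mean, std),
    }


rng = random.Random(0)
action = sample_hybrid_action(
    stage_logits=[0.1, 0.4, -0.2, 0.0],        # assumed: 4 signal stages
    duration_means=[15.0, 20.0, 10.0, 12.0],   # assumed: seconds of green
    duration_stds=[2.0, 2.0, 2.0, 2.0],
    rng=rng,
)
# Clip to assumed green-time bounds before sending the action to a simulator.
green_time = min(max(action["duration"], 5.0), 30.0)
```

The log-probabilities are computed from the unclipped sample; clipping is applied only to the value passed to the environment.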

The rollout buffer, network architectures, and (H)PPO classes are defined in src\hppo\HPPO.py. For the implementation, we referred to repo:OpenAI and repo:ikostrikov. We also adopted the implementation tricks discussed in the revisiting papers²³⁴.
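For readers new to PPO, the clipped surrogate objective at the core of the (H)PPO classes can be sketched as follows. This is a plain-Python illustration of the standard PPO formula, not the code in src\hppo\HPPO.py; the function name and the default clip_eps=0.2 are assumptions. In a hybrid setting, one such objective could be evaluated for the discrete head and one for the continuous head, each with its own log-probabilities.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    For each sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r = exp(new_log_prob - old_log_prob).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# Unchanged policy: ratio is 1, so the objective equals the mean advantage.
baseline = clipped_surrogate([0.0, 0.0], [0.0, 0.0], [1.0, -1.0])

# Large ratio with positive advantage: the gain is capped at (1 + eps) * A.
capped = clipped_surrogate([1.0], [0.0], [1.0])
```

A training loop would minimize the negative of this value, usually together with a value-function loss and an entropy bonus.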

In envs\ , we define the traffic demand and the simulation configuration (for example, 4_4.rou.xml and 4_4.add.xml) that SUMO needs to run. The environment class SUMOEnv is also included; it is encapsulated to provide an interactive environment (please make sure you have registered it locally). If you are not interested in traffic signal control, you can ignore this environment and write your own. A high-performance, vectorized environment is under development.
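If you do write your own environment, the sketch below shows the general shape of one that accepts a hybrid (stage, duration) action. It is a deterministic toy with no SUMO dependency and is not SUMOEnv; the class name, the queue dynamics, the rates, and the classic reset/step return convention are all assumptions chosen to keep the example self-contained.

```python
class ToyIntersectionEnv:
    """Hypothetical toy intersection: one queue per signal stage."""

    def __init__(self, num_stages=4, arrival_rate=0.25, discharge_rate=0.5,
                 min_green=5.0, max_green=30.0, horizon=20):
        self.num_stages = num_stages
        self.arrival_rate = arrival_rate      # vehicles per second, per queue
        self.discharge_rate = discharge_rate  # vehicles per second when green
        self.min_green = min_green
        self.max_green = max_green
        self.horizon = horizon
        self.queues = [0.0] * num_stages
        self.steps = 0

    def reset(self):
        """Empty all queues and return the initial observation."""
        self.queues = [0.0] * self.num_stages
        self.steps = 0
        return list(self.queues)

    def step(self, action):
        """Apply a (stage, duration) action; return obs, reward, done, info."""
        stage, duration = action
        duration = min(max(duration, self.min_green), self.max_green)
        # Every queue grows while the chosen stage is served.
        for i in range(self.num_stages):
            self.queues[i] += self.arrival_rate * duration
        # Only the served queue discharges.
        served = self.discharge_rate * duration
        self.queues[stage] = max(0.0, self.queues[stage] - served)
        self.steps += 1
        reward = -sum(self.queues)  # shorter total queue is better
        done = self.steps >= self.horizon
        return list(self.queues), reward, done, {}


env = ToyIntersectionEnv()
obs = env.reset()
obs, reward, done, info = env.step((0, 10.0))
```

The observation is simply the list of queue lengths, and the reward is the negative total queue, which loosely mirrors the average-queue-length metric reported in the training results.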

Finally, we present an overview of the H-PPO architecture and partial training results. You can also cite our paper⁵.

[Figures: average queue length of Env #9; average delay of Env #9]

HyAR(TODO)

Metro1998/hppo-in-traffic-signal-control
