Reinforcement learning has recently been adopted to optimize traditional traffic signal control systems. Existing methods are based on either a single scenario or multiple independent scenarios, where each scenario has a separate simulation environment with a predefined road network topology and traffic signal settings. These models are trained and tested in the same scenario, so they are tightly coupled to that specific setting and generalize poorly. Although a few recent models can be trained on multiple scenarios, they require a large amount of manual labor to label the intersection structure, which hinders the model’s generalization. In this work, we aim for a general framework that eliminates heavy labeling and models a variety of scenarios simultaneously. To this end, we propose a GEneral Scenario-Agnostic (GESA) reinforcement learning framework for traffic signal control with: (1) a general plug-in module that maps all different intersections into a unified structure, freeing us from the heavy manual labor of specifying the structure of intersections; (2) a unified state and action space that keeps the model input and output consistently structured; (3) large-scale co-training on multiple scenarios, leading to a generic traffic signal control algorithm. In experiments, we demonstrate that our algorithm is the first that can be co-trained on seven different scenarios without manual annotation, and that it achieves 17.20% higher rewards than benchmarks. When dealing with a new scenario, our model still achieves 10.36% higher rewards.
TL;DR: OK, this title is heavy on buzzwords. We propose a general agent that is trained at scale on data from various cities, so that it can control traffic lights in cities it has never seen during training.
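
To make the idea of a unified intersection structure more concrete, here is a minimal Python sketch. It is not the GESA plug-in module itself, whose details are not given here. It only illustrates one plausible way to map intersections with different numbers of approaches onto a fixed layout. The four compass slots, the function names `angular_gap` and `unify_intersection`, and the use of queue length as the only state feature are all assumptions made for illustration.

```python
# Illustrative sketch only: the slot layout and all names are hypothetical.
SLOTS = ("N", "E", "S", "W")
# Assumed canonical heading of each slot, in degrees (0 = east, counterclockwise).
SLOT_ANGLES = {"N": 90.0, "E": 0.0, "S": 270.0, "W": 180.0}


def angular_gap(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def unify_intersection(approaches):
    """Map an intersection onto the fixed four-slot layout.

    approaches: dict of road id -> (approach angle in degrees, queue length).
    Returns (state, mask): fixed-length lists where mask[i] == 1 means
    slot i is backed by a real approach and 0 means it is padding.
    """
    state = [0.0] * len(SLOTS)
    mask = [0] * len(SLOTS)
    # Consider every (approach, slot) pair, closest angular match first.
    pairs = sorted(
        (angular_gap(angle, SLOT_ANGLES[slot]), road, i)
        for road, (angle, _queue) in approaches.items()
        for i, slot in enumerate(SLOTS)
    )
    assigned = set()
    for _gap, road, i in pairs:
        if road in assigned or mask[i]:
            continue  # approach already placed, or slot already taken
        state[i] = float(approaches[road][1])
        mask[i] = 1
        assigned.add(road)
    return state, mask


# A T-junction with no southern approach: the missing slot is zero-padded.
t_junction = {"a": (85.0, 3), "b": (5.0, 7), "c": (182.0, 2)}
state, mask = unify_intersection(t_junction)
print(state, mask)
```

With a fixed-length state and a validity mask, a single policy network can in principle consume three-way and four-way intersections alike, and the same mask can be used to rule out actions that refer to padded slots. In this sketch, an intersection with more than four approaches would have its extra approaches dropped, which a real system would need to handle explicitly.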