Developing Multi-Agent Reinforcement Learning in Adaptive Traffic Signal Control

Abstract:
Severe traffic congestion in urban areas is now practically unavoidable, and it brings undesirable socio-economic and environmental consequences. Mitigating these impacts calls for infrastructure improvement. Integrating intelligent transportation systems (ITS) into the existing transportation infrastructure enables more efficient operation without building new roads, by using electronic, sensing, information and communication technologies together with advanced control techniques.
The main focus of this article is the development of multi-agent reinforcement learning for traffic signal control. Two types of agents are employed: (1) Learning traffic signal agents (LTSAs), which interact with the traffic environment to find the optimal traffic signal parameters (traffic signal timing) in response to traffic fluctuations. (2) Vehicle agents, which are purely reactive. They can detect their forward direction, their current driving lane, other vehicles, and the current phase of the traffic signal they are approaching. Vehicles can also change lanes to reach a better driving speed. Unlike the vehicles, which are reactive and cannot learn, LTSAs learn over time through reinforcement learning.
Reinforcement learning originally stems from the study of animal intelligence and has developed into a major branch of machine learning for solving sequential decision-making problems. It is a useful approach to stochastic optimization problems. The agent learns an optimal policy by interacting with the environment so as to maximize a numerical value that represents a long-term objective. Reinforcement learning allows traffic signals to determine automatically the behavior that best achieves their objectives. It enables traffic signals to learn and to react flexibly to different traffic situations without a predefined model of the environment and without human intervention. Each time the traffic signal performs an action, it receives a reward signal indicating whether that action has brought it closer to its objectives. The traffic signal tries to learn a control policy, that is, a mapping from states to actions, that maximizes the expected sum of the received rewards.
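One common way to write the objective described above is sketched below. The discount factor \gamma and the time indexing are assumptions made for illustration; the abstract only states that the expected sum of received rewards is maximized and does not say whether rewards are discounted.

```latex
% Assumed notation: s_t = state, a_t = action, r_{t+1} = reward received after a_t,
% \pi = policy (mapping from states to actions), \gamma = discount factor (assumption).
\pi^{*} \;=\; \arg\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \right],
\qquad 0 \le \gamma < 1 .
```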
Two scenarios were studied: single-agent traffic signal control and multi-agent traffic signal control. In the first scenario, a learning agent controls an isolated intersection using two reinforcement learning methods, Q-Learning and State-Action-Reward-State-Action (SARSA). Q-Learning is an off-policy method that updates action-values based on a hypothetical action (the best action available in the next state) rather than the action actually taken. In Q-Learning, as long as the traffic signal continues to visit all state-action pairs, the estimates converge to the optimal action-values. SARSA is an on-policy algorithm that updates action-values on the basis of the experience gained from following some policy. With SARSA, the traffic signal must explore at first and then stop exploring after a number of steps. The results of the first scenario indicate that Q-Learning outperforms SARSA. In the second scenario, four learning agents control a main street composed of four intersections using indirect cooperative Q-Learning. The results of the second scenario show that the indirect cooperative Q-Learning controller reduces queue length by 81%, travel time by 78%, fuel consumption by 57% and air pollution by 73% compared with the optimized pre-timed controller.
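The off-policy versus on-policy distinction described above can be illustrated with the standard tabular update rules. This is a minimal sketch, not the article's implementation: the state labels, action names, reward value, learning rate `alpha` and discount factor `gamma` are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Hypothetical action set for a single traffic signal (assumption for illustration).
ACTIONS = ("keep_phase", "switch_phase")


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in s_next,
    whether or not the agent will actually take it."""
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action a_next that the
    current (possibly exploring) policy actually selected in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Two identical tables, one per method (values are made up).
Q_off = defaultdict(float)
Q_on = defaultdict(float)
for Q in (Q_off, Q_on):
    Q[("long_queue", "switch_phase")] = 2.0  # greedy choice in the next state
    Q[("long_queue", "keep_phase")] = 0.0    # non-greedy choice

# One transition: the agent kept the phase, received a negative reward,
# and then (exploring) chose the non-greedy action in the next state.
s, a, r, s_next, a_next = "short_queue", "keep_phase", -1.0, "long_queue", "keep_phase"

q_learning_update(Q_off, s, a, r, s_next)
sarsa_update(Q_on, s, a, r, s_next, a_next)

print("Q-Learning estimate:", Q_off[(s, a)])
print("SARSA estimate:     ", Q_on[(s, a)])
```

On this single transition the Q-Learning estimate moves up, because it assumes the greedy action will follow, while the SARSA estimate moves down, because it accounts for the exploratory action that was actually taken. This is why exploration has to be reduced over time for SARSA to approach the greedy policy's values.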
Language:
Persian
Published:
Journal of Geomatics Science and Technology, Volume:7 Issue: 1, 2017
Page:
85
magiran.com/p1745154  