Graduate student: | 鄭東濬 Cheng, Tung-Chun |
---|---|
Thesis title: | 基於強化學習之高速公路路肩流量管制策略 Reinforcement Learning Approach for Adaptive Road Shoulder Traffic Control |
Advisor: | 賀耀華 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2020 |
Academic year of graduation: | 108 (ROC calendar, 2019-2020) |
Language: | Chinese |
Number of pages: | 50 |
Chinese keywords: | 交通堵塞、流量管制 (Traffic Control)、強化學習 (Reinforcement Learning)、路肩通行、SUMO |
English keywords: | Congestion, Traffic Control, Reinforcement Learning, Road Shoulder, SUMO |
DOI URL: | http://doi.org/10.6345/NTNU202001219 |
Thesis type: | Academic thesis |
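To relieve traffic congestion on freeways, current traffic control methods regulate vehicle speed, traffic throughput, and traffic signals. When congestion occurs, these external interventions are used to keep the overall situation under control and prevent the congestion from worsening. As vehicular networks mature and become more stable, Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication can deliver congestion-relief strategies more quickly to all vehicles operating within range, so that they can react in time and help ease overall traffic.
This study proposes a reinforcement-learning-based strategy for controlling traffic flow on the road shoulder, the Reinforcement Learning Approach for Adaptive Road Shoulder Traffic Control (ARSTC). Unlike the traditional approach of opening the shoulder at fixed times, the proposed shoulder control strategy is applicable under, and complies with, the current Freeway Bureau regulations. By incorporating reinforcement learning, it can recommend different control strategies for different traffic conditions. Experimental results in a simulation environment, Simulation of Urban Mobility (SUMO), show that ARSTC can decide whether to open the shoulder based on changes in overall traffic flow, keep the shoulder traffic volume within a safe range, and minimize the difference in congestion time relative to the original uncontrolled traffic, achieving a safe and efficient shoulder-running environment.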
To reduce traffic congestion on highways, current traffic control systems use variable speed limits, flow control, and traffic lights. With these approaches, traffic can be kept in an acceptable condition when congestion occurs. With the development of vehicular networks, i.e., Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) techniques, drivers can receive updated traffic information that allows them to change their route plans immediately.
In this research, we propose a Reinforcement Learning Approach for Adaptive Road Shoulder Traffic Control (ARSTC) that dynamically changes the opening and closing times of the hard shoulder. Using reinforcement learning, ARSTC can adjust to different traffic situations and make suitable decisions, unlike the traditional static scheduling approach for the hard shoulder. The proposed technique is simulated in Simulation of Urban Mobility (SUMO). The results show that ARSTC can reduce traffic congestion time by adaptively controlling the hard shoulder's opening time while keeping the shoulder traffic flow within the safety range set by the policy of the Freeway Bureau. ARSTC thus provides safer and more efficient driving conditions while using the hard shoulder to ease traffic congestion.
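The abstract does not specify the learning algorithm, state encoding, or reward used by ARSTC, so the following is only a minimal sketch of the general idea, assuming tabular Q-learning with epsilon-greedy exploration. The congestion levels, the `OPEN_COST` safety penalty, and the deterministic `step` dynamics are all hypothetical stand-ins for a traffic simulator; none of them come from the thesis or from SUMO.

```python
import random

ACTIONS = (0, 1)   # hypothetical encoding: 0 = shoulder closed, 1 = shoulder open
N_LEVELS = 5       # hypothetical congestion levels: 0 (free flow) .. 4 (jammed)
OPEN_COST = 0.5    # assumed per-step safety cost of running traffic on the shoulder


def step(level, action):
    """Toy deterministic dynamics; a stand-in for a simulator, not SUMO."""
    # An open shoulder drains one congestion level per step; closed holds it.
    nxt = max(0, level - 1) if action == 1 else level
    # Penalise remaining congestion plus the cost of keeping the shoulder open.
    reward = -float(nxt) - (OPEN_COST if action == 1 else 0.0)
    return nxt, reward


def greedy(q, level):
    """Best-valued action for a level; ties resolve to 'closed'."""
    return max(ACTIONS, key=lambda a: q[level][a])


def train(episodes=2000, horizon=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_LEVELS)]
    for _ in range(episodes):
        level = rng.randrange(N_LEVELS)  # each episode starts at a random level
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)   # explore
            else:
                action = greedy(q, level)      # exploit
            nxt, reward = step(level, action)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[nxt])
            q[level][action] += alpha * (target - q[level][action])
            level = nxt
    return q


q_table = train()
policy = [greedy(q_table, level) for level in range(N_LEVELS)]
print(policy)
```

In this toy setting the learned policy keeps the shoulder closed in free flow and opens it once congestion appears, which mirrors the adaptive open/close decision the abstract describes. A realistic version would replace `step` with a traffic simulator and a reward that reflects the regulatory limits on shoulder flow.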