Applications of deep reinforcement learning for advanced driving assistance systems

Date
2023-07-23
Authors
Yavaş, Muharrem Uğur
Journal Title
Journal ISSN
Volume Title
Publisher
Graduate School
Abstract
Advanced driver assistance systems are becoming more prevalent every day. Adaptive cruise control, for instance, has been present in some mass-produced vehicles since 1980, yet it is now available in almost every new vehicle model and, with the help of developing technology, is usable even in congested traffic. Likewise, the autonomous lane-centering function developed for highway environments reduces the driving load on drivers. One of the main reasons for this progress is the advancement of environmental perception sensors: by fusing data from intelligent camera and radar sensors, decision-making algorithms can obtain high-accuracy positions of lanes and the speeds and positions of other vehicles on the road. Building on advances in artificial intelligence research, the main topic of this thesis is to use deep reinforcement learning to evaluate the states of surrounding vehicles and decide the cruise following speed, the amount of gas or brake applied, and the lane-changing maneuver. Deep reinforcement learning integrates reinforcement learning theory with the new generation of artificial neural networks that emerged from the deep learning revolution. In the proposed methods, both the adaptive cruise control and autonomous lane-changing functions designed with deep reinforcement learning make more optimal decisions than classical algorithms, and the similarity between their decisions and those of human drivers is demonstrated. Adaptive cruise control systems typically calculate the acceleration required to maintain a safe following distance using only the distance to the closest vehicle ahead. This approach, however, is not fully compatible with human driving behavior, which involves scanning the whole traffic scene and taking the dynamic elements surrounding the driven vehicle into account.
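The classical single-leader approach the abstract contrasts against is exemplified by the intelligent driver model (IDM), which maps the ego speed, the gap, and the lead vehicle's speed to an acceleration command. A minimal sketch follows; the parameter values are illustrative defaults, not the ones used in the thesis:

```python
import math

def idm_acceleration(v, gap, v_lead,
                     v0=30.0,    # desired free-flow speed [m/s]
                     T=1.5,      # desired time headway [s]
                     a_max=1.5,  # maximum acceleration [m/s^2]
                     b=2.0,      # comfortable deceleration [m/s^2]
                     s0=2.0,     # minimum standstill gap [m]
                     delta=4.0): # free-acceleration exponent
    """Intelligent Driver Model: acceleration from a single lead vehicle."""
    dv = v - v_lead  # closing speed (positive when approaching the leader)
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

# Ego at 25 m/s, 40 m behind a leader doing 20 m/s: the model commands braking.
acc = idm_acceleration(v=25.0, gap=40.0, v_lead=20.0)
```

Because the model sees only one leader, it cannot anticipate cut-ins or traffic further ahead, which is exactly the limitation the multi-vehicle deep reinforcement learning inputs described below address.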
In one of the proposed solutions, the adaptive cruise control function is designed with a model-based deep reinforcement learning method. In model-based reinforcement learning, the decision-making policy uses its own internal model during training to minimize interaction with the real system: one artificial neural network implements the decision-making policy, while a second network implements the internal model. The two networks are trained in a closed loop with the proposed meta-learning approach, and the algorithm takes as input the data of two leader vehicles instead of a single one. In the simulation environment, the model-based algorithm performed better than the classical intelligent driver model. In addition, a hybrid method is proposed that adds a fallback mechanism to the system's internal model: if the internal model and real-world observations disagree for a certain period of time, control switches back to the classical driver model. In the second study on adaptive cruise control, a discrete driver model is proposed that, inspired by how human drivers use the gas and brake pedals, manipulates them directly. Analysis of data collected from real driving showed that drivers hold certain gas and brake pedal positions in steady state and cope with dynamic conditions by applying delta gas or brake inputs. Different gas and brake delta levels were determined through statistical inference on this dataset. Here, the inputs of the algorithm are the positions and speeds of all vehicles ahead of the ego vehicle on a multi-lane highway. Given the superiority observed on a single lane of algorithms using two leader vehicles over those using one, information about the vehicles in adjacent lanes helps when the ego vehicle's leader changes.
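The fallback mechanism described above can be sketched as a supervisor that tracks the mismatch between the internal model's predicted observation and the actual one, and latches onto the classical controller once the error persists. The class name, error metric, and thresholds below are illustrative assumptions, not the thesis implementation:

```python
class FallbackSupervisor:
    """Switch from a learned policy to a classical fallback controller when
    the internal model's predictions diverge from reality for too long."""

    def __init__(self, error_threshold=0.5, patience=10):
        self.error_threshold = error_threshold  # max tolerated prediction error
        self.patience = patience                # consecutive bad steps allowed
        self.bad_steps = 0
        self.use_fallback = False

    def step(self, predicted_obs, actual_obs):
        # 1-D observations for illustration; a real system would use a
        # vector norm over the predicted state.
        error = abs(predicted_obs - actual_obs)
        if error > self.error_threshold:
            self.bad_steps += 1
        else:
            self.bad_steps = 0  # mismatch must be sustained, not momentary
        if self.bad_steps >= self.patience:
            self.use_fallback = True  # latch: hand control to the classical model
        return self.use_fallback
```

At each control step, the ego vehicle would then apply the classical driver model's command whenever `step(...)` returns `True`, and the learned policy's command otherwise.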
The deep Q-learning algorithm, which gives the best results for discrete outputs, was used as the decision-making algorithm. In evaluations on both simulation and real test data, the proposed algorithm obtained the highest score. Notably, the designed algorithm frequently preferred the tactical decision of outputting zero, pressing neither the gas nor the brake pedal and letting the vehicle slow down under its own friction. The other advanced driver assistance system studied in the thesis is the autonomous lane-changing function. In the first original study, autonomous lane changing was designed with a deep reinforcement learning method, and the normally long training process was accelerated five-fold by the proposed safety reward feedback. In the autonomous lane-changing problem, the critical task is to process the position and speed information of all vehicles ahead and behind in traffic and to make safe maneuvers, at the right time, that increase speed. Especially in the complex traffic scenarios created in simulated environments, classical algorithms are adversely affected by sensor uncertainty and noise and cannot perform optimally amid the dynamic driving of multiple vehicles. With the uncertainty estimate built into the designed deep reinforcement learning algorithm, the confidence level of each decision can be observed, contributing to the important research area of explainable artificial intelligence. Although deep reinforcement learning techniques have achieved significant successes, they still face integration issues in real-world applications. One of the main problems is the lengthy training process, which can take millions of steps, together with the fact that policies are optimized by trial and error, making training on real systems infeasible.
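At decision time, a deep Q-network over such a discrete pedal action set simply picks the action with the highest predicted Q-value; the zero action (neither pedal pressed) is just one more entry in the set. A minimal sketch with an illustrative action set and a stand-in for the trained network's output (the real network, its inputs, and the delta levels are assumptions here):

```python
# Illustrative discrete action set of pedal deltas:
# negative = delta brake, 0.0 = no pedal (coast on friction), positive = delta gas.
ACTIONS = [-0.4, -0.2, 0.0, 0.2, 0.4]

def greedy_action(q_values):
    """Greedy policy of deep Q-learning: choose the action whose Q-value is highest."""
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return ACTIONS[best]

# Stand-in for the Q-network's output on one traffic state: the 'do nothing'
# action scores highest, so the vehicle coasts rather than braking or accelerating.
q = [0.1, 0.3, 0.9, 0.2, -0.1]
chosen = greedy_action(q)
```

This makes the abstract's observation concrete: coasting is never hard-coded, it is selected only when the network values it above every gas or brake delta.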
One promising area of research is sim2real transfer, in which policies trained in simulation are transferred directly to real-world applications. In the second original study on autonomous lane changing, a new approach was introduced to measure transferability between two simulators with different resolutions. Transferability was evaluated with a human-likeness score computed from the traffic situations at the moments when lane-changing decisions were made. An adjusted reward function was used during training, and the proposed method outperformed the reference methods in both efficiency and safety, achieving the highest human-like lane-changing score.
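A reward shaped along the lines described above can be pictured as a weighted sum that pays for speed gains, charges for risky maneuvers (the safety feedback term), and credits human-like decisions. The terms and weights below are purely illustrative assumptions, not the thesis formulation:

```python
def lane_change_reward(speed_gain, collision_risk, human_like_score,
                       w_speed=1.0, w_safety=2.0, w_human=0.5):
    """Illustrative shaped reward for a lane-change decision:
    reward speed gains, penalize estimated collision risk,
    and add a bonus for human-like behavior."""
    return (w_speed * speed_gain
            - w_safety * collision_risk
            + w_human * human_like_score)

# A safe, human-like overtake scores higher than a risky one with the same speed gain:
safe = lane_change_reward(speed_gain=2.0, collision_risk=0.1, human_like_score=0.8)
risky = lane_change_reward(speed_gain=2.0, collision_risk=0.9, human_like_score=0.2)
```

Weighting the safety term heavily is what lets such feedback prune unsafe exploration early, which is one plausible mechanism behind the reported training speed-up.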
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2023
Keywords
artificial intelligence, deep learning, reinforcement learning, advanced driving assistance systems
Citation