LEE - Mechatronics Engineering - Doctorate

Permanent URI for this collection

http://hdl.handle.net/11527/19387

Now showing 1 - 5 of 18

Applications of deep reinforcement learning for advanced driving assistance systems

(Graduate School, 2023-07-23) Yavaş, Muharrem Uğur ; Kumbasar, Tufan ; 518162005 ; Mechatronics Engineering

Advanced driving assistance systems are becoming more prevalent every day. For instance, although adaptive cruise control has been present in some mass-produced vehicles since 1980, it is now available in almost every new vehicle model and, with the help of developing technology, is becoming usable even in congested traffic. Similarly, the autonomous lane centering function developed for highway environments reduces the driving load on drivers. One of the main reasons for the advancement and prevalence of this technology is the progress in environmental perception sensors. By blending data from intelligent camera and radar sensors, decision-making algorithms can obtain high-accuracy lane positions as well as the speeds and positions of other vehicles on the road. Building on advancements in artificial intelligence research, this thesis uses deep reinforcement learning to evaluate the conditions of surrounding vehicles and determine the cruise following speed, the amount of gas or brake applied, and finally the lane-changing decision. Deep reinforcement learning is the integration of reinforcement learning theory with the new generation of artificial neural networks that emerged with the deep learning revolution. In the proposed methods, both the adaptive cruise control and the autonomous lane-changing functions designed with deep reinforcement learning made better decisions than classical algorithms, and the similarity between these decisions and those of human drivers was demonstrated. Adaptive cruise control systems typically calculate the acceleration required to maintain a safe following distance using only the distance to the closest vehicle. However, this method is not compatible with human driving behavior, which involves scanning the entire traffic scene and taking into account the dynamic elements surrounding the ego vehicle.
In one of our proposed solutions, we designed the adaptive cruise control function using a model-based deep reinforcement learning method. In model-based reinforcement learning, the decision-making policy uses its own internal model during training to minimize interaction with the system. Therefore, one artificial neural network represents the decision-making policy, while a second network represents the internal model. The two neural networks are trained in a closed-loop fashion with the proposed meta-learning approach, and data from two leader vehicles, rather than a single one, are used as inputs to the algorithm. In our simulation environment, the model-based artificial intelligence algorithm performed better than the classical intelligent driver model. Additionally, we proposed a hybrid method with a fallback mechanism that switches to the classical driver model if the internal model and real-world observations do not match for a certain period of time. In the second proposed study on adaptive cruise control, we suggested a discrete driver model, inspired by how human drivers use the gas and brake pedals, that manipulates the pedals directly. An analysis of data collected from real-life driving showed that drivers hold certain gas and brake pedal positions in steady state and cope with dynamic conditions by applying delta changes to the gas or brake pedal. Different gas and brake delta levels were determined through statistical inference based on this dataset. In this case, the positions and speeds of all vehicles in front of the ego vehicle on a multi-lane highway were used as the inputs of the artificial intelligence algorithm. Given the difference observed between algorithms that work with a single leader vehicle and those that work with two leader vehicles on a single lane, information about the vehicles in the adjacent lanes is expected to help when the leading vehicle of the ego vehicle changes.
The deep Q-learning algorithm, which provides the best results for discrete outputs, was used as the decision-making algorithm. In the evaluations performed on both simulation and real test data, the proposed algorithm obtained the highest score. In particular, the designed algorithm frequently preferred a zero output, pressing neither the gas nor the brake pedal, so that the vehicle slows down under its own friction; this can be regarded as tactical decision-making. The other advanced driver assistance system studied in the thesis is the autonomous lane-changing function. In the first original study, autonomous lane changing was designed using a deep reinforcement learning method, and the normally long training process was accelerated by a factor of five with the proposed safety reward feedback. In the autonomous lane-changing problem, the critical task is to process the position and speed information of all vehicles in front and behind in traffic and to make safe maneuvers at the right time that increase speed. Especially in complex traffic scenarios created in simulated environments, classical algorithms are adversely affected by sensor uncertainties and noise, and they cannot perform optimally among multiple dynamically driven vehicles. With the uncertainty calculation in the designed deep reinforcement learning algorithm, the confidence level of the decisions made can be observed, which contributes to the important research area of explainable artificial intelligence. Although deep reinforcement learning techniques have achieved significant successes, they still face integration issues in real-world applications. One of the main problems is the lengthy training process, which can take millions of steps; because policies are optimized through trial and error, training on real systems is impossible.
One promising area of research is sim2real transfer, which involves transferring policies trained in simulation directly to real-world applications. In the second original study on autonomous lane changing, a new approach was introduced to measure the transferability between two simulators with different resolutions. The transferability was evaluated using a human-like usage score generated from the traffic situations when lane-changing decisions were made. In the training process, an adjusted reward function was used, and the proposed method outperformed reference methods in terms of both efficiency and safety, achieving the highest human-like lane-changing score.
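The abstract above uses the classical intelligent driver model (IDM) as its baseline and fallback. As a rough illustration of what such a baseline computes, here is a minimal sketch of the standard IDM acceleration law. The parameter values (`v0`, `T`, `s0`, `a_max`, `b`, `delta`) are illustrative assumptions and are not taken from the thesis.

```python
import math

def idm_acceleration(v, gap, dv, v0=33.3, T=1.5, s0=2.0,
                     a_max=1.0, b=2.0, delta=4.0):
    """Intelligent Driver Model acceleration in m/s^2.

    v   : ego speed (m/s)
    gap : bumper-to-bumper distance to the leader (m), must be > 0
    dv  : approach rate, ego speed minus leader speed (m/s)
    All keyword parameters are assumed, illustrative values.
    """
    # Desired dynamic gap: standstill distance + time-headway term
    # + braking term that grows when closing in on the leader.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term (v / v0) and interaction term (s_star / gap).
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

Note that this law looks only at the single closest leader, which is the limitation the abstract contrasts with human drivers and with the proposed multi-vehicle inputs.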
Developing mobile robot obstacle avoidance methods with model-based and learning-based methods

(Graduate School, 2023-07-19) Özdemir, Aykut ; Bogosyan, Seta O. ; 518162010 ; Mechatronics Engineering

Mobile robot navigation is a crucial area of research and development in robotics that focuses on enabling robots to move autonomously in their environments. Mobile robots are increasingly being used in a wide range of applications, including manufacturing, healthcare, transportation, and search and rescue missions. These robots have the potential to improve efficiency, reduce costs, and enhance safety in a variety of industries. However, for mobile robots to be effective, they must be able to navigate their surroundings with accuracy and reliability. Navigation involves the robot's ability to perceive its environment, plan a path, and execute that path while avoiding obstacles and other hazards. The development of mobile robot navigation systems has been a major area of focus in robotics research for several decades, and it continues to evolve rapidly. Advances in technologies such as sensors, computing, and machine learning have enabled mobile robots to navigate more complex environments and perform increasingly sophisticated tasks. As such, mobile robot navigation is a critical area of study for researchers and engineers who seek to develop intelligent and autonomous systems that can operate in real-world environments. Path planning and obstacle avoidance are two important topics in robotics that are closely related. Path planning refers to the process of determining a safe and efficient path for a robot to travel from its current location to a desired destination. This process takes into account the robot's movement capabilities, the environment it is operating in, and any obstacles that may be present. Obstacle avoidance, on the other hand, involves the robot's ability to detect and avoid obstacles as it navigates its environment. This is an essential component of path planning, as the robot must be able to react to changes in its environment and modify its path accordingly in order to avoid collisions and ensure safety. 
Both path planning and obstacle avoidance are critical for the development of autonomous robots that can navigate complex environments and perform tasks without human intervention. These topics are the focus of ongoing research in the field of robotics, and advances in technologies such as sensors, mapping algorithms, and machine learning are enabling robots to navigate increasingly complex environments with greater efficiency and safety. This study makes three novel contributions in the field of robotics. The first is a model-based obstacle avoidance method that plans local trajectories by passing through gaps between obstacles. The second is a learning-based sampling method that improves the efficiency of trajectory planning for path planning algorithms. The third is a non-holonomic local planner that uses a CNN-based sampling technique. These contributions aim to improve the navigation and path planning capabilities of robots, allowing them to operate more efficiently and safely in complex environments. Overall, this thesis demonstrates the potential of using advanced techniques and technologies, such as machine learning and local planning, to enhance the performance and capabilities of mobile robots.
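The idea of planning through gaps between obstacles can be illustrated with a generic gap search over a range scan. This is a simplified sketch and not the method proposed in the thesis: the function names, the clearance `threshold`, and the `min_width` in beams are all hypothetical.

```python
def find_gaps(ranges, threshold, min_width):
    """Return (start, end) beam-index pairs of runs whose range exceeds threshold."""
    gaps, start = [], None
    for i, r in enumerate(ranges):
        if r > threshold:
            if start is None:
                start = i          # a free run begins here
        elif start is not None:
            if i - start >= min_width:
                gaps.append((start, i - 1))
            start = None           # the free run ended at an obstacle
    # Close a run that extends to the last beam.
    if start is not None and len(ranges) - start >= min_width:
        gaps.append((start, len(ranges) - 1))
    return gaps

def widest_gap_center(ranges, threshold=1.0, min_width=2):
    """Index of the middle beam of the widest gap, or None if no gap exists."""
    gaps = find_gaps(ranges, threshold, min_width)
    if not gaps:
        return None
    start, end = max(gaps, key=lambda g: g[1] - g[0])
    return (start + end) // 2
```

A local planner could steer toward the returned beam index; a real method would also account for the robot's footprint and kinematic constraints.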
A novel gripper design based on series elastic actuator for object recognition and manipulation

(Graduate School, 2023-03-03) Kaya, Ozan ; Ertuğrul, Şeniz ; 518162009 ; Mechatronics Engineering

With Industry 4.0 and its successors, robotic applications are becoming more significant. The goal of using robots is to automate industrial processes and increase production yield. However, open research topics remain, such as safety and cooperation. Sensor technologies are another important subject for automation. In general, sensors such as encoders, cameras, lidar, and proximity sensors are chosen to provide feedback to the control algorithm. When only one sensor is used, it is often insufficient to identify or describe incidental obstacles, so two or more sensors may be required for continuity and safety. Alternatively, a gripper design that is sensitive to external effects may both reduce the number of sensors and inherently sense those effects. For this purpose, a novel gripper mechanism based on a series elastic actuator (SEA) is designed for object recognition and manipulation. As a low-cost solution, a single actuator with a ball-screw mechanism is used as a linear actuator to position the fingers. Because the design is based on an SEA, a spring is placed between the linear actuator and the fingers. With this arrangement, the fingers can be actuated by one motor, yet they can still be rotated independently by external effects. To estimate the external force, the length of the spring is computed using absolute encoders. As a result, the proposed gripper mechanism is sensitive to external effects and can estimate force without any force/torque sensor or tactile sensor. For object recognition, the proposed gripper interacts with objects placed in the workspace. However, this interaction alone is not enough to recognize an object; a deep neural network (DNN) model is needed to interpret it. Therefore, a DNN model is created to achieve recognition using points on the surfaces of the defined objects.
For training the DNN, a synthetic data set is generated via CloudCompare. After examining the effects of different hyperparameters on the DNN model, the best model is obtained for the recognition of 11 objects. The experiments are conducted in the MEAM laboratory with the gripper mounted on the Staubli Rx160 robot arm. For object manipulation, it is proposed that the gripper can compensate for position faults caused by controller error, an inaccurate model, and similar sources. The proposed gripper can successfully perform common industrial tasks such as peg-in-hole and surface following in collaborative applications. To validate this approach, experiments are conducted with untrained operators using a haptic device and the gripper mounted on the Staubli Rx160 robot arm. The results are compared across control strategies: the user performs the tasks with no guidance and a rigid gripper mechanism, with guidance and a rigid gripper mechanism, and with guidance and a series elastic gripper mechanism.
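The force estimation described above, computing the spring length from encoder readings instead of using a force/torque sensor, can be sketched with Hooke's law. This is a minimal illustration assuming a linear spring; the stiffness, rest length, and contact threshold are hypothetical values, not the gripper's actual parameters.

```python
def estimate_external_force(spring_length_m, rest_length_m, stiffness_n_per_m):
    """Estimate force on a finger from spring deflection (Hooke's law).

    A positive result means the spring is compressed relative to its
    rest length. Assumes a linear spring, which is an idealization.
    """
    return stiffness_n_per_m * (rest_length_m - spring_length_m)

def in_contact(force_n, threshold_n=0.5):
    """Flag contact when the estimated force magnitude exceeds a threshold."""
    return abs(force_n) > threshold_n
```

For example, a spring of assumed stiffness 2000 N/m compressed from 50 mm to 45 mm corresponds to an estimated force of about 10 N.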
Cooperative control of multi-agent system under time delay

(Graduate School, 2023-09-07) Akkaya, Şirin ; Ergenç, Ali Fuat ; 518142009 ; Mechatronic Engineering

In this Ph.D. dissertation, multi-agent systems are studied in detail using two of the most common examples in practice: vehicle platooning systems and formation control of unmanned aerial vehicles. For a better understanding of the study, background on graph theory, matrix theory, and time-delayed systems is given. Then, the "Cluster Treatment of Characteristic Roots" (CTCR) paradigm, which forms the backbone of the study, is explained, and the existing methods in the literature are reviewed. A new Bezout Resultant matrix-based CTCR method is proposed, and the steps of the algorithm are explained in detail via simulation examples. The main advantage of the proposed method is that it simplifies the computation of the stability posture with respect to time delay for systems whose characteristic equation has a relatively high degree and cannot be decomposed into factors. First, a state feedback controller is selected as the distributed control algorithm. The closed-loop system matrix is constructed for the cases with and without time delay. For the case where the time delay is neglected, the controller coefficients that make the system stable are obtained using the Routh table and Lyapunov-based methods. In the presence of delay, however, the system becomes a retarded time-delay system, and the stability posture is obtained with CTCR methods for single and multiple time delays. Moreover, the formation geometry between vehicles is considered under both the constant policy and the constant time headway policy. For the constant policy, the characteristic equation of the system in the delay-free and single-delay cases can be decomposed into factors, which makes the stability analysis easier. This is not possible for the characteristic equation involving multiple time delays, which leads us to use the Bezout Resultant matrix-based CTCR method.
For the constant time headway policy, the characteristic equation cannot be decomposed into factors in any of the cases. Therefore, a sufficient condition for the stability of the multi-agent system in the delay-free case is derived by converting the system matrix to block companion form and block Schwarz form. Then, a distributed control protocol based on a PID controller is proposed. In the absence of time delay, the cooperative control problem of the multi-agent system with the distributed PID controller is converted into an asymptotic stability problem through matrix and state transformations. A Lyapunov function is then constructed, and the controller parameters are chosen with the help of linear matrix inequalities. In the presence of time delay, the closed-loop system becomes a neutral-type time-delay system, and the stability posture of the multi-agent system is obtained with a CTCR method based on Kronecker multiplication and elementary transformations. Finally, all the theoretical studies and simulation results are evaluated with a real-time experimental study. An industrial controller-based real-time simulation is proposed for a platoon of five connected vehicles including a virtual leader. The constant time headway policy is selected to model the desired inter-vehicle distance, and a distributed control strategy based on the vehicles' dynamic states is used to make the vehicles converge to their desired velocities and inter-vehicle distances. The multi-agent platooning control problem is then converted into a stability analysis problem for an LTI system. The delay-based stability analysis is carried out with the Bezout Resultant matrix-based CTCR method. Numerical simulations are provided to verify the validity of the proposed method, and real-time experiments are carried out on industrial computers to show its applicability in real-time systems. The study concludes with an evaluation of the results and recommendations.
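To make the constant time headway policy concrete, the sketch below simulates one follower behind a leader travelling at constant speed, using a simple state feedback law on the spacing error and relative speed. It covers only the delay-free case and is not the controller from the dissertation; the gains `kp` and `kv`, the standstill gap, the headway, and the initial conditions are assumed values.

```python
def cth_spacing_error(gap, v_ego, standstill_gap=5.0, headway=1.0):
    """Spacing error under constant time headway: desired gap = d0 + h * v."""
    return gap - (standstill_gap + headway * v_ego)

def follower_acceleration(gap, v_ego, v_lead, kp=0.5, kv=1.0,
                          standstill_gap=5.0, headway=1.0):
    """State feedback on spacing error and relative speed (assumed gains)."""
    e = cth_spacing_error(gap, v_ego, standstill_gap, headway)
    return kp * e + kv * (v_lead - v_ego)

def simulate(t_end=60.0, dt=0.01):
    """Forward-Euler simulation; returns the final gap and follower speed."""
    x_lead, v_lead = 50.0, 20.0   # leader drives at constant speed
    x_ego, v_ego = 0.0, 15.0
    for _ in range(int(t_end / dt)):
        a = follower_acceleration(x_lead - x_ego, v_ego, v_lead)
        x_lead += v_lead * dt
        x_ego += v_ego * dt
        v_ego += a * dt
    return x_lead - x_ego, v_ego
```

With these assumed values the follower settles at the leader's speed of 20 m/s and a gap of 5 + 1.0 * 20 = 25 m. Introducing a communication or actuation delay into the feedback terms changes the characteristic equation, which is where a delay-dependent stability analysis such as CTCR becomes necessary.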
A control-theoretic approach for vision based quality aware autonomous navigation and mapping toward drone landing

(Graduate School, 2023-12-15) Sözer, Onuralp ; Kumbasar, Tufan ; 518172009 ; Mechatronics Engineering

This thesis presents a novel autonomous navigation approach that is capable of increasing map exploration and accuracy while minimizing the distance traveled for autonomous drone landings. For terrain mapping, a probabilistic sparse elevation map is proposed to represent measurement accuracy and enable the increasing of map quality by continuously applying new measurements with Bayes inference. For exploration, the Quality-Aware Best View (QABV) planner is proposed for autonomous navigation with a dual focus: map exploration and quality. Generated paths allow for visiting viewpoints that provide new measurements for exploring the proposed map and increasing its quality. To reduce the distance traveled, we handle the path-cost information in the framework of control theory to dynamically adjust the path cost of visiting a viewpoint. The proposed methods handle the QABV planner as a system to be controlled and regulate the information contribution of the generated paths. As a result, the path cost is increased to reduce the distance traveled or decreased to escape from a low-information area and avoid getting stuck. The usefulness of the proposed mapping and exploration approach is evaluated in detailed simulation studies including a real-world scenario for a packet delivery drone.
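The idea of raising map quality by repeatedly applying new measurements with Bayes inference can be illustrated for a single elevation cell. The sketch below assumes each cell stores a Gaussian height estimate (mean and variance) and that each measurement has Gaussian noise; that representation is an assumption made for illustration and may differ from the map proposed in the thesis.

```python
def fuse_elevation(mean, var, z, z_var):
    """Bayesian update of one cell's height estimate with a new measurement.

    mean, var : current Gaussian estimate of the cell height
    z, z_var  : new height measurement and its noise variance
    Returns the posterior (mean, variance).
    """
    k = var / (var + z_var)            # weight given to the new measurement
    new_mean = mean + k * (z - mean)
    new_var = (1.0 - k) * var          # variance never increases
    return new_mean, new_var
```

For instance, fusing a cell estimate of 10.0 m (variance 4.0) with a measurement of 12.0 m (variance 4.0) yields 11.0 m with variance 2.0. The shrinking variance is one possible per-cell quality signal that a planner could use when choosing viewpoints.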
