LEE - Control and Automation Engineering - Master's Degree
Recent Submissions
-
Robust attitude control of the F-16 aircraft using incremental nonlinear dynamic inversion (Graduate School, 2025-06-13)
Contemporary high-performance aircraft like the F-16 Fighting Falcon require flight control systems that are both highly responsive and robust to external disturbances, modeling uncertainties, and time-varying dynamics. The F-16's inherently unstable aerodynamic configuration, which is manageable only through its fly-by-wire (FBW) system, presents considerable difficulties for traditional control methods. Traditional Proportional-Integral-Derivative (PID) controllers are frequently inadequate owing to the aircraft's pronounced nonlinear dynamics and the need for real-time adaptation in rapidly changing flight conditions. This work examines and compares two nonlinear control strategies, Nonlinear Dynamic Inversion (NDI) and Incremental Nonlinear Dynamic Inversion (INDI), for precise and robust attitude control of the F-16 aircraft under these constraints. A comprehensive nonlinear flight dynamics model of the F-16 is built, integrating aerodynamic forces and moments, actuator dynamics, and environmental perturbations including wind gusts and sensor noise. In this modeling context, NDI inverts the nonlinear system dynamics to obtain a specified linear behavior. However, its effectiveness depends heavily on the accuracy of the underlying model and on the assumption that the control effectiveness matrices are invertible. This sensitivity frequently degrades performance in the presence of unmodeled dynamics, parameter uncertainty, or actuator saturation. INDI, in contrast, aims to reduce dependence on precise model information by using real-time sensor feedback to adjust the control input incrementally.
Rather than directly inverting the complete nonlinear dynamics, INDI calculates incremental adjustments to the control command based on measured changes in angular acceleration and applied moments. This approach offers an inherently adaptive control mechanism that accommodates disturbances and uncertainty without requiring detailed modeling of the entire system. INDI is especially advantageous for practical flight applications, where ideal models are seldom available and external disturbances are unavoidable. Simulation results obtained with the constructed F-16 model indicate that INDI outperforms NDI across various metrics. INDI achieves more accurate attitude tracking, produces smoother control inputs with reduced actuator workload, and shows significantly greater robustness to wind disturbances and sensor noise. In contrast to NDI, which often becomes unstable under model deviations, INDI maintains a stable response by continually adjusting its control effort to match the aircraft's actual dynamic behavior. Moreover, INDI's robustness extends beyond nominal conditions; it also performs well in degraded scenarios, including abrupt parameter changes or sensor delays. This thesis concludes that Incremental Nonlinear Dynamic Inversion provides a significantly more robust and adaptable method for flight control than traditional model-based inversion techniques. Although NDI may suffice in controlled or well-defined settings, its suitability decreases under unknown conditions. INDI, by contrast, better matches the requirements of contemporary aerial platforms that must operate in complex, unpredictable, and frequently hostile environments.
Despite the INDI technique imposing extra computing requirements due to its dependence on real-time estimation, these expenses are warranted by the significant improvements in stability, control authority, and overall mission reliability. Consequently, INDI emerges as a highly attractive contender for next-generation flight control systems, especially in applications requiring quick maneuvering and dynamic threat response.
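The incremental update described in this abstract can be illustrated with a minimal single-axis sketch. It is not the thesis's F-16 controller: the scalar plant, its coefficients, the gain, and the names `indi_command`, `plant_accel`, and `simulate` are all hypothetical, and the angular-acceleration measurement is assumed to be ideal and delay-free. The point is only that the control law uses the measured acceleration and an estimate of the control effectiveness, not the model of the remaining dynamics.

```python
def indi_command(u_prev, nu, omega_dot_meas, g_est):
    """Incremental law: previous command plus the scaled acceleration error."""
    return u_prev + (nu - omega_dot_meas) / g_est


def plant_accel(omega, u):
    """Hypothetical plant (assumption): nonlinear damping, a bias, and gain 4.0.

    The controller never evaluates the damping or bias terms directly;
    it only sees their effect through the measured acceleration.
    """
    return -0.5 * omega * abs(omega) + 2.0 + 4.0 * u


def simulate(omega_ref=1.0, steps=1000, dt=0.01, gain=5.0, g_est=4.0):
    """Track a constant rate reference with the incremental law."""
    omega, u = 0.0, 0.0
    for _ in range(steps):
        # Idealized sensor: acceleration produced by the previous command.
        omega_dot_meas = plant_accel(omega, u)
        # Desired angular acceleration from a simple proportional outer loop.
        nu = gain * (omega_ref - omega)
        u = indi_command(u, nu, omega_dot_meas, g_est)
        # Forward-Euler integration of the true plant.
        omega += dt * plant_accel(omega, u)
    return omega


if __name__ == "__main__":
    print(simulate())
```

In this idealized setting, with `g_est` equal to the true control effectiveness, the commanded acceleration is reproduced at every step. A real implementation would additionally have to handle sensor filtering, actuator dynamics, and synchronization delays.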
-
Capturing aerodynamic characteristics of ATTAS aircraft with evolving intelligent system (Graduate School, 2025-04-28)
Many studies in the literature aim to obtain an accurate mathematical model. Early modeling studies relied on differential equations, but this approach could not fully express nonlinear characteristics in some cases. Later, with the development of artificial intelligence and fuzzy systems, it was shown that nonlinear systems can be modeled successfully. Especially in the aviation industry, where safety and security are of paramount importance, it is critical to represent aircraft models accurately. Mathematical models that accurately represent aircraft dynamics are essential in many activities such as aircraft control system design and development, certification, and flight mechanics analysis. Aerodynamic modeling of the aircraft is therefore very important. Either wind tunnel testing or a parameter estimation method is used for aerodynamic modeling. However, wind tunnel testing is quite costly because it requires an experimental setup. For this reason, many statistics-based system identification algorithms have been developed in the literature to estimate aerodynamic control and stability derivatives from measured flight test data. The Ordinary Least Squares (OLS) method is the most widely used system identification algorithm. In this method, which belongs to the light gray box model category, an aerodynamic mathematical model is developed that best fits the flight dataset by minimizing the squared differences between the estimated and actual values. However, this method may not fully capture the nonlinear characteristics of the aircraft. In the neural network (NN) algorithm, which is in the black box model category, the weight parameters are trained using the input and output data of the postulated aerodynamic model.
The model obtained with the NN algorithm can successfully represent the nonlinear characteristics of the aircraft. However, NN-based models cannot be interpreted because they lack a rule base in their structure. Models obtained with fuzzy logic algorithms, on the other hand, are interpretable because they have a rule-base structure; these models are in the dark gray box model category. Moreover, fuzzy logic algorithms are very successful in modeling complex and nonlinear systems. Given these advantages of fuzzy systems, many aerodynamic modeling studies in the literature have used the Adaptive Network Based Fuzzy Inference System (ANFIS). Based on these observations, Evolving Type 1 Quantum Fuzzy Neural Network (eT1QFNN) and Evolving Type 2 Quantum Fuzzy Neural Network (eT2QFNN) structures have been developed in this study. These evolving structures can better capture the nonlinear aerodynamic characteristics of the aircraft. They are also interpretable and robust to model uncertainties. The aerodynamic postulate model obtained with this methodology is compared with the aerodynamic postulate models obtained with the OLS, NN, and ANFIS structures, and the accuracy of the resulting aerodynamic models is analyzed. First, flight data from a flight test campaign previously conducted with the ATTAS aircraft are used to obtain the aerodynamic model of the ATTAS aircraft. When judging the suitability of flight data, attention should be paid to whether the maneuvers excite the longitudinal, lateral, and directional modes of the aircraft. In this study, short period, bank-to-bank, and Dutch roll maneuvers were used to excite the longitudinal, lateral, and directional modes of the ATTAS aircraft. With these maneuvers, the responses of the aircraft obtained from the sensors were analyzed and the parameters to be used in system identification were recorded.
A low-pass filter was used to remove noise from the recorded flight data. This reduced the noise in the parameters used to identify the aerodynamic model of the ATTAS aircraft and made them more suitable for identification. After low-pass filtering, the flight data were preprocessed. For preprocessing, force and moment equations were generated in MATLAB using the weight, moment of inertia, and thrust values of the ATTAS aircraft. The linear accelerations and angular rates obtained from the measured flight data are then substituted into these equations, and the aerodynamic force and moment coefficients are calculated. In this way, reference aerodynamic coefficients expressing the characteristics of the ATTAS aircraft are computed from the flight data. After the reference aerodynamic coefficients are obtained, the aerodynamic postulate model of the ATTAS aircraft is derived. Both the aerodynamic postulate models available in the literature and the stepwise regression algorithm are used to construct this postulate model. The stepwise regression algorithm determined which stability and control derivative coefficients can be used in the aerodynamic postulate model, which avoided the over-parameterization problem. As a result of these analyses, postulate models were obtained for 6 aerodynamic coefficients. The next step is to obtain, by means of system identification algorithms, aerodynamic postulate models that represent the aerodynamic characteristics of the ATTAS aircraft well. These models are compared with the reference models obtained from the force and moment equations to analyze whether they accurately represent the aerodynamic characteristics of the ATTAS aircraft. In this study, eT1QFNN and eT2QFNN are proposed to model the aerodynamic characteristics of the ATTAS aircraft.
These evolving structures, which contain quantum fuzzy sets and neural network structures, have multiple inputs and a single output. In these evolving structures, the learning process starts with an empty rule base and the structure is continuously updated as a new data sample arrives. With each new data sample, these evolving structures generate a hypothetical rule that drives the autonomous evolution of the fuzzy rules. The generated hypothetical rules need to evolve significantly before they are incorporated into the network structure. The significance is evaluated using the Gaussian Mixture Model to predict complex changes in the data. If the generated hypothetical rules provide more contribution and meaning than the existing rules, they are added to this structure as new rules. On the other hand, when the hypothetical rules do not provide more meaning than the existing rules, the parameters of the quantum membership function and the consequent weight parameters in the rule base are updated by a decoupled extended Kalman filter. To do this, a winning rule is developed that depends on the maximum spatial firing power. In other words, the antecedent membership function and consequent weight parameters of the rule with maximum spatial firing power are updated. Thus, the performance of the evolving structures is preserved. These evolving structures are robust to uncertainties and data noise thanks to quantum membership functions as well as automatic rule learning and parameter tuning capabilities. They can also represent the nonlinear aircraft model by creating multiple linear sub-models with a rule-based structure through an incremental learning strategy instead of the traditional batch learning approach. In the next step of the study, the aerodynamic postulate models obtained from the proposed eT1QFNN and eT2QFNN are compared with the aerodynamic postulate models obtained from the OLS, NN, and ANFIS structures. 
Thus, the proposed methodology can be compared with previously existing methodologies in the literature in terms of modeling performance. In order to examine whether the system identification algorithms can successfully represent the aerodynamic characteristics of the ATTAS aircraft, two different settings were made. In the first one, training was performed with 80% of the flight data and testing with 20% of the flight data, while in the second one, training was performed with 50% of the flight data and testing with 50% of the flight data. Thus, models trained with both large and small data sets were analyzed. Furthermore, it was questioned whether the aerodynamic characteristics of the ATTAS aircraft could be captured with less flight data. In addition, during the training process of ANFIS and NN based aerodynamic models, overfitting was checked using test data. In contrast, no such overfitting check was performed for the OLS, eT1QFNN, and eT2QFNN models. This distinction arises from the fact that ANFIS and NN models are trained through multiple iterations, whereas OLS, eT1QFNN and eT2QFNN models are trained in a single iteration. In the next phase of the study, the Delta method was applied to the aerodynamic models estimated with the eT1QFNN and eT2QFNN with more training data, since more training data included short period, bank to bank, and dutch roll maneuvers. Thus, all longitudinal, lateral, and directional modes of the ATTAS aircraft could be triggered. As a result of the application of this method, the control and stability derivative parameters of the aerodynamic model were obtained. The dynamics, stability and controllability of the aircraft could be analyzed using these parameters. In this study, the control and stability derivative parameters are obtained by perturbing each of the input variables by about 1% in each direction. While one input variable is perturbed, the others should remain constant. 
The values of the control and stability derivative parameters during the flight time are shown and analyzed in histogram plots. The structure and sensitivity of the evolving structures in the rule bases could be interpreted by looking at the changes of these parameters in the histogram plots. The parameters obtained from this evolving structure with the Delta method were compared with the parameters obtained from the OLS method. Thus, it was analyzed whether the control and stability derivative parameters obtained from the evolving structure consistently represent the aerodynamic characteristics of the ATTAS aircraft. As a result, when the aerodynamic models obtained with the eT1QFNN and eT2QFNN are compared with the aerodynamic models obtained with other system identification algorithms, it is seen that the eT2QFNN better represents the aerodynamic characteristics of the ATTAS aircraft. In making this comparison, the closeness of the obtained aerodynamic postulate model to the reference aerodynamic model obtained in the flight test was considered. In addition, the accuracy of the values of the control and stability derivative parameters of the aerodynamic postulate model was also analyzed.
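The Delta method step described in this abstract (perturb each input by about 1% in each direction while holding the others constant) can be sketched as a central finite difference. This is a minimal illustration under stated assumptions, not the thesis's implementation: `delta_method` is a hypothetical helper, and the linear pitching-moment model with its coefficients is invented purely so the result can be checked by hand.

```python
def delta_method(model, inputs, rel_step=0.01):
    """Estimate d(model)/d(input_i) by perturbing one input at a time.

    Each input is moved by rel_step (about 1%) up and down while the
    other inputs stay fixed; the slope is the central difference.
    """
    derivatives = []
    for i, value in enumerate(inputs):
        # Fall back to an absolute step when the input is exactly zero.
        step = rel_step * abs(value) if value != 0 else rel_step
        upper = list(inputs)
        lower = list(inputs)
        upper[i] = value + step
        lower[i] = value - step
        derivatives.append((model(upper) - model(lower)) / (2.0 * step))
    return derivatives


def example_cm(inputs):
    """Hypothetical pitching-moment model (assumption, not ATTAS data)."""
    alpha, q_hat, delta_e = inputs
    return 0.05 - 1.2 * alpha - 15.0 * q_hat - 1.5 * delta_e


if __name__ == "__main__":
    # For a linear model the estimates should match its coefficients.
    print(delta_method(example_cm, [0.1, 0.02, -0.05]))
```

Applied to a trained nonlinear model at every flight sample, the same loop would yield a time history of each derivative, which is what the histogram plots in the abstract summarize.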
-
Path planning algorithm development for unmanned aerial and ground vehicles (Graduate School, 2023)
The application areas of robots are growing day by day and include mobile robots, robotic arms, warehouse robots, and cooking robots. With this growth, more optimal and robust robots are needed across various technical subjects. These needs can be grouped as follows: localization, mapping, path planning, trajectory tracking, and dynamic and static obstacle avoidance. Briefly, localization is the robot's estimation of its own position under changing environmental conditions using various sensor data. A map is created by detecting the objects in the environment and keeping their locations in memory while the vehicle moves. Path planning is defined as finding the set of locations needed to travel a safe route between the start and goal points. Path tracking provides the control signals required to follow the route with the minimum possible error. Finally, dynamic and static obstacle avoidance can be defined as predicting the obstacles that a robot may collide with during its movement and updating the route to be followed. Among these areas, path planning was chosen as the main thesis topic. In this context, path planning algorithms can be grouped into geometry-based algorithms, tree search algorithms, machine learning algorithms, and sampling-based algorithms. Each of these groups has its own advantages and disadvantages. This thesis focuses on sampling-based algorithms. The RRT algorithm, the most basic sampling-based algorithm, was examined first. This algorithm connects the points to each other without any cost optimization.
Afterwards, the RRT* algorithm was proposed; it connects points optimally through cost optimization, a step known as tree rewiring. The biggest disadvantage of this algorithm is that it examines all possible points of the map. The Informed RRT* algorithm showed that the result can be reached much faster when path optimization is restricted to elliptical regions of interest. However, this algorithm falls short when the ellipse created has a high eccentricity, so that the ellipse area loses its meaning and covers the whole map. In the method we propose, the path is divided into n slices. These slices are randomly optimized with the help of tree rewiring, which is a feature of the RRT* algorithm. Thus, even on paths with high eccentricity, the optimal result can be reached in a shorter time. Various maps were defined in simulation, and a consistent comparison was made by keeping the simulation parameters constant. As a result, a high rate of success was achieved.
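The elliptical region used by Informed RRT* can be made concrete with a short sampling sketch. It shows only the standard 2D informed-sampling idea (draw points uniformly from the ellipse whose foci are the start and goal and whose major axis equals the best path cost found so far); it does not reproduce the slice-based method proposed in the thesis. The function name `sample_informed` and the example coordinates are hypothetical.

```python
import math
import random


def sample_informed(start, goal, c_best, rng=random):
    """Draw one point uniformly from the 2D informed ellipse.

    start, goal: (x, y) foci of the ellipse.
    c_best: cost (length) of the best path found so far.
    """
    c_min = math.dist(start, goal)
    if c_best < c_min:
        raise ValueError("c_best cannot be shorter than the straight line")
    # Ellipse geometry: semi-major axis a, semi-minor axis b.
    a = c_best / 2.0
    b = math.sqrt(c_best ** 2 - c_min ** 2) / 2.0
    center_x = (start[0] + goal[0]) / 2.0
    center_y = (start[1] + goal[1]) / 2.0
    theta = math.atan2(goal[1] - start[1], goal[0] - start[0])
    # Uniform sample in the unit disk, then stretch to the ellipse axes.
    radius = math.sqrt(rng.random())
    angle = 2.0 * math.pi * rng.random()
    ex = a * radius * math.cos(angle)
    ey = b * radius * math.sin(angle)
    # Rotate into the world frame and translate to the ellipse center.
    x = center_x + ex * math.cos(theta) - ey * math.sin(theta)
    y = center_y + ex * math.sin(theta) + ey * math.cos(theta)
    return (x, y)


if __name__ == "__main__":
    random.seed(0)
    print(sample_informed((0.0, 0.0), (10.0, 0.0), 12.0))
```

Every sampled point has a combined distance to the two foci of at most `c_best`, so a path through it could in principle improve on the current solution. As `c_best` grows relative to the straight-line distance, the ellipse widens and the sampling region approaches the whole map.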
-
Gemi manevra modeli ve sapma açısının kontrolü [Ship maneuvering model and yaw angle control] (Graduate School, 2022)
Ship dynamic motion models have been proposed in order to study ship maneuverability, design motion control systems, and bring the ship into a simulation environment. Examples of the proposed models include the Abkowitz model, the Nomoto model, and the Maneuvering Modeling Group model. Among these, the Abkowitz model is one of the most widely used. The Abkowitz model is based on modeling the forces and moments acting on the ship with polynomial equations that depend on ship motion, propeller speed, and rudder angle. Two kinds of methods are used to build a mathematical model that reproduces the ship's motions: simulation-independent and simulation-based. Simulation-independent methods consist of collecting data from full-scale ships, performing model tests with scaled model ships, or drawing on databases belonging to different ships. Simulation-independent methods are costly and time-consuming because a test infrastructure suited to each ship's own structure has to be built. Simulation-based methods can be divided into two groups: system-based methods and computational fluid dynamics (CFD) based methods. In CFD applications, obtaining highly accurate results usually requires analyses with long computation times. System identification techniques, which are among the system-based methods, are more convenient than CFD applications for studies that require repeated analyses. System identification techniques, which are also frequently used in control engineering, are easier to access and apply than model tests and CFD applications. System identification experiments are frequently used to derive ship dynamic motion models.
Examples of system identification methods frequently encountered in the literature in this field include least squares regression (LS), the extended Kalman filter (EKF), and maximum likelihood estimation (MLE). The flow of a system identification experiment carried out for ship dynamic model estimation can be summarized as follows: A mathematical model structure suitable for the planned ship model analyses is selected. The parameters to be estimated are chosen from the parameter sets of the mathematical models. From among the system identification methods, one suited to the model structure and the parameters to be estimated is selected. Measurements that make the estimation studies possible are collected from a full-scale ship, a scaled ship, or, depending on the study, a different subsystem. Suitable portions of the measurements are selected for the estimation study and the model validation tests. The system identification experiment is concluded with the model validation tests. If the accuracy of the estimated model meets the targeted operating conditions, subsequent stages such as ship maneuvering analysis and control system design can begin. For the simulation studies carried out in this thesis, the Abkowitz model structure was chosen both as the ship model from which data are collected and as the estimation model. A "mariner" type ship model was used as the ship model to be estimated. The state-augmented extended Kalman filter (SAEKF), an EKF-based method, was used in the estimation studies. Since its development, the EKF has been one of the fundamental methods frequently used in system identification applications and taken as the basis for specialized methods developed for different systems and applications. SAEKF is derived by adding the parameters to be estimated to the state vector of the EKF. After the estimation studies, the independence and whiteness tests, which are used as validation methods in system identification experiments, were applied.
Different scenarios were created for these studies, and both unsuccessful and successful results were examined. Following the estimation studies, a controller was designed for ship yaw angle control. The state-dependent Riccati equation (SDRE) method was selected as the control method. This method builds on the constant-coefficient Riccati equation method used to compute the optimal feedback controller for linear systems. In the SDRE method, the state-space model can be designed in more than one way, depending on the relationships between the states of the nonlinear system. A state-space model suitable for the SDRE method was designed for the Mariner ship maneuvering model, and the controller design was carried out on this model. Controller performance was examined by setting up yaw angle reference and route following tests.
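The state-augmentation idea behind SAEKF (append the unknown parameters to the EKF state vector and estimate them alongside the states) can be sketched on a toy problem. The example below is an assumption-laden illustration, not the thesis's Abkowitz/Mariner setup: it uses a hypothetical scalar model x[k+1] = x[k] + dt·(−a·x[k] + u[k]) with one unknown parameter a, noise-free measurements of x, and hand-picked covariance values. The names `saekf_step` and `run_demo` are invented for this sketch.

```python
def saekf_step(x_est, a_est, cov, u, y, dt, q, r):
    """One predict/update cycle for the augmented state z = [x, a].

    cov holds the symmetric 2x2 covariance as (p_xx, p_xa, p_aa).
    q = (q_x, q_a) is the process noise, r the measurement noise.
    """
    p_xx, p_xa, p_aa = cov
    # Predict: the parameter is modeled as constant (random walk via q_a).
    x_pred = x_est + dt * (-a_est * x_est + u)
    f11 = 1.0 - dt * a_est   # d x_next / d x
    f12 = -dt * x_est        # d x_next / d a
    pxx_p = f11 * f11 * p_xx + 2.0 * f11 * f12 * p_xa + f12 * f12 * p_aa + q[0]
    pxa_p = f11 * p_xa + f12 * p_aa
    paa_p = p_aa + q[1]
    # Update with measurement y = x (H = [1, 0]).
    s = pxx_p + r
    k_x = pxx_p / s
    k_a = pxa_p / s
    innovation = y - x_pred
    x_new = x_pred + k_x * innovation
    a_new = a_est + k_a * innovation
    cov_new = ((1.0 - k_x) * pxx_p, (1.0 - k_x) * pxa_p, paa_p - k_a * pxa_p)
    return x_new, a_new, cov_new


def run_demo(a_true=0.5, dt=0.1, steps=600):
    """Estimate the hypothetical parameter a from simulated data."""
    x_true = 0.0
    x_est, a_est = 0.0, 0.2            # deliberately wrong initial guess for a
    cov = (1.0, 0.0, 1.0)
    for k in range(steps):
        u = 1.0 if (k // 50) % 2 == 0 else -1.0   # square-wave excitation
        x_true = x_true + dt * (-a_true * x_true + u)
        x_est, a_est, cov = saekf_step(
            x_est, a_est, cov, u, x_true, dt, q=(1e-6, 1e-6), r=1e-4
        )
    return a_est


if __name__ == "__main__":
    print(run_demo())
```

The cross-covariance term `p_xa` is what lets measurements of the state correct the parameter estimate. The same construction extends to vector states and many parameters, where the Jacobian blocks replace `f11` and `f12`.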
-
A comparative study of nonlinear model predictive control and reinforcement learning for path tracking (Graduate School, 2022)
The automotive industry is one of the most financially significant industries, both because of the benefits it brings and because it is constantly evolving. Advances in computing and sensing hardware have contributed to the evolution of this industry and led to the development of autonomous driving technology. Autonomous vehicles also offer considerable potential for improving transportation safety, energy and fuel efficiency, and traffic congestion. These benefits and the growing attention to autonomous vehicles encourage the development of advanced driving systems. In this thesis, the path tracking problem of autonomous vehicles is investigated and a comparative analysis of two path tracking methods is presented. One of the selected methods is model predictive control (MPC) and the other is a reinforcement learning algorithm, the soft actor-critic (SAC) method. The model predictive controller is applied in a wide variety of path tracking problems due to its high performance and its benefits over other control methods. The benefits of MPC are the ability to handle multi-input multi-output systems, optimize multiple objectives, work with nonlinear models, incorporate future steps into the optimization problem, reject disturbances, and deal with constraints on the inputs, outputs, and states. In essence, MPC determines optimal control inputs over a given prediction horizon by minimizing a cost function while taking the system constraints and objectives into account. The system model is used to predict future states, and these predictions enter the cost function that defines the desired behaviour of the system. The optimization problem is solved for the current time step and system state, which yields an optimal control input sequence.
Then, only the first input of the resulting optimal sequence is applied to the system. This procedure is repeated at each time step. In this thesis, the problem is handled as a nonlinear model predictive control (NMPC) problem since a nonlinear vehicle model is used. NMPC problems are expressed as optimal control problems (OCP), and the multiple shooting method is used to transform the OCP into a nonlinear optimization problem (NLP), which is solved with the optimization software package IPOPT. A vehicle model is one of the main things that MPC requires, and a vehicle may be modelled with varying degrees of complexity depending on the problem and the performance needs. There are several ways to model vehicles, such as a kinematic model, which consists solely of a geometric description of vehicle motion and ignores the forces acting on the vehicle, and a dynamic model, which includes the forces affecting motion. Additionally, vehicle models can be formulated with various tire models. In general, the kinematic model performs poorly at high speeds because it neglects lateral forces, whereas the dynamic model performs well at high speeds but cannot be used in stop-and-go situations because tire models become singular at low speeds. The system identification process is also easier for the kinematic model, since it has only two parameters. Furthermore, one of the objectives of the thesis is to show that vehicles can be controlled with minimal knowledge of the vehicle model. Therefore, a kinematic model is employed, as it requires only the distances from the center of mass to the axles. Control methods require parameters to be tuned manually or by optimization algorithms, and these approaches are not always capable of generalizing to new conditions, whereas intelligent methods stand out for their ability to generalize to new conditions.
In addition, while a vehicle model is needed for the controller, it is not always needed for intelligent methods. Intelligent methods such as machine learning and deep learning have been included in autonomous driving studies to automate the driving task. These methods enable researchers to specify the desired behavior, teach the system to perform it, and generalize that behavior. Reinforcement learning has been selected as the method for automating the driving task. A learner agent interacts with the environment and collects experiences, and the environment gives feedback through reward signals. The agent is motivated to maximize positive reward signals and therefore learns what to do from its own experiences without specific instructions. However, the reinforcement learning problem becomes intractable as the number of agent states increases. The solution was found by combining deep learning and reinforcement learning, which gave rise to deep reinforcement learning. Deep reinforcement learning methods can be classified according to whether they have a model of the environment, the way they optimize the policy, or whether they use different policies in training. Among the many types, the soft actor-critic learning method is chosen for this thesis because it outperforms many other powerful methods in both efficiency and stability. The soft actor-critic is an off-policy method that combines actor-critic and maximum entropy reinforcement learning methods. In order to generate stochastic policies with more exploration ability, an entropy term is introduced into the objective function of this algorithm. As a result, the agent learns by maximizing both expected reward and entropy rather than only expected reward, as in standard reinforcement learning algorithms.
A key difficulty in training reinforcement learning agents is that they require a lot of data and take a long time to learn. However, when experience replay, a mechanism that allows past experiences to be reused, is employed, learning is stabilized and the amount of experience required decreases. In this thesis, SAC is implemented with different buffers and their efficiencies are examined. In vanilla experience replay, experiences in the buffer are sampled uniformly during parameter updates. Prioritized experience replay (PER) is one of the experience replay methods used in this thesis; it samples highly important experiences more frequently. Emphasizing recent experience (ERE) is another strategy, which samples more aggressively from recent experiences to emphasize the importance of recently observed data. These methods were chosen because PER has been shown to be effective in numerous studies, and ERE outperforms PER in some applications in terms of efficiency. However, the performance of ERE in the path tracking problem has not been compared with that of PER, and one of the aims of this thesis is to examine their efficiency in a vehicle driving task. The CARLA simulator, which aims to be as realistic as possible in terms of control and visual elements, is chosen as the simulation environment. Several towns are available in CARLA, and two different ones have been chosen for this thesis. It is also necessary to establish the reference values that the vehicle will follow. For this purpose, paths were created for the selected towns and waypoints were produced for the vehicle to follow. Then, cubic spline interpolation was applied to the waypoints because the reference waypoints should be smooth and continuous. As a result of these operations, the reference yaw angle and the x and y positions were obtained.
In addition, the speed reference is given as a fixed value, with different values used across runs. NMPC and SAC are responsible for both lateral and longitudinal control to follow the given path. As longitudinal controllers, they control the acceleration in order to reach the target speed, and as lateral controllers, they adjust the steering to track the reference path. This means that both have two action outputs: the steering angle and the acceleration command. The states in NMPC are the states of the kinematic bicycle model, and the parameters of a Tesla Model 3 vehicle provided by CARLA are used. The states in SAC are chosen to be similar to the NMPC states and consist of the steering and acceleration commands, the target speed, and the reference tracking errors up to 10 steps ahead to reflect horizon information. For NMPC, a cost function consisting of tracking errors is constructed to minimize the error between the reference and followed paths. The best weight coefficients of the cost function are found after several experiments. Furthermore, steering angle and acceleration constraints are defined and included in the optimization problem. Then, the symbolic framework CasADi is used to formulate this NMPC problem; it also provides an interface to the IPOPT solver for solving the optimization problem. For the SAC agent to follow the path, an appropriate reward function, which the agent maximizes through its actions, is prepared after many trials. Terminal conditions are also defined so that the simulation ends if the agent leaves the lane, moves too slowly, or hits something. The networks used to train the SAC agent consist of an actor network that decides on the actions and a critic network that measures how good the actions are. These networks are implemented with the PyTorch library, and the hyperparameters for the networks and buffers are taken from the original papers of the methods.
The SAC agent is trained in CARLA on 10 and on 5 different paths over 2000 episodes. The agent trained on 10 different paths converged faster, so training with the other buffers is done on 10 different paths. In these trainings, SAC+PER and SAC+ERE converged faster than SAC with the vanilla buffer, which shows that the advanced buffer implementations enhance sampling efficiency. These trainings are done with random target velocities of 5 and 6 m/s; then, for the SAC+PER agent, which is the fastest converging agent, training is continued with target velocities from 5 to 8 m/s. Simulations are carried out on 5 different paths to investigate path tracking performance. The results are discussed for each method, and it is shown that the vehicle can follow the reference trajectory with a small margin of error for all approaches. This demonstrates that SAC agents have the ability to generalize, since they performed well on unseen tracks. Although the performances of the NMPC and SAC agents are very close to each other, SAC agents outperform NMPC in target velocity tracking, and NMPC performs better in yaw angle tracking. Also, as expected, the NMPC with the kinematic model performed worse as the speed increased. Furthermore, it is observed that SAC+ERE and SAC+PER increase sample efficiency without reducing performance.
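The kinematic bicycle model mentioned in this abstract, which needs only the distances from the center of mass to the axles, can be sketched as a single integration step. This uses a common center-of-mass formulation with forward-Euler integration; whether the thesis uses exactly this discretization is an assumption, and the function name `bicycle_step` and the axle distances in the example are placeholders, not the Tesla Model 3 values from CARLA.

```python
import math


def bicycle_step(state, steer, accel, l_f, l_r, dt):
    """Advance a kinematic bicycle model by one forward-Euler step.

    state: (x, y, yaw, speed) of the center of mass.
    steer: front steering angle in radians; accel: longitudinal acceleration.
    l_f, l_r: distances from the center of mass to the front and rear axles.
    """
    x, y, yaw, speed = state
    # Slip angle at the center of mass implied by the steering geometry.
    beta = math.atan(l_r / (l_f + l_r) * math.tan(steer))
    x += speed * math.cos(yaw + beta) * dt
    y += speed * math.sin(yaw + beta) * dt
    yaw += speed / l_r * math.sin(beta) * dt
    speed += accel * dt
    return (x, y, yaw, speed)


if __name__ == "__main__":
    # Placeholder geometry (assumption): l_f = 1.2 m, l_r = 1.6 m.
    state = (0.0, 0.0, 0.0, 5.0)
    for _ in range(10):
        state = bicycle_step(state, steer=0.05, accel=0.2, l_f=1.2, l_r=1.6, dt=0.1)
    print(state)
```

In an NMPC setting, a function like this would play the role of the prediction model that is rolled out over the horizon, with the steering angle and acceleration as the two decision variables per step. Because no tire forces appear, the model stays well defined at low speed, which matches the stop-and-go argument made in the abstract.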