Trajectory tracking control of a quadrotor with reinforcement learning

Date
2023-01-23
Authors
Çakmak, Eren
Publisher
Graduate School
Abstract
Drone control algorithms are usually organized as a set of nested loops. The innermost loops control angles and angular velocities. For both fixed-wing and rotary-wing vehicles, these loops conventionally use PID-based controllers. Although a PID controller can regulate the inner loops successfully, this alone may not bring the outer loops to the desired positions or velocities. An outer loop for that purpose can be built from conventional controllers, but such controllers are heavily model-dependent and often require tuning. Motivated by this, the present study aims to show that reinforcement learning based algorithms can control a quadrotor drone without prior knowledge of the model. The model-free reinforcement learning algorithms most often preferred in the literature are DDPG, TRPO, and PPO. Studies that apply state-of-the-art reinforcement learning methods to quadcopter control are compared, and PPO is concluded to be the best choice to begin with. An actor-critic neural network for PPO-clip, the most successful version of PPO, is built and trained in a custom Gym environment. The environment is a quadrotor model that covers the fundamental dynamics.
This study is composed of six chapters. The first chapter presents the motivation for the research and a literature review. The second chapter gives the theoretical background needed to construct a quadrotor model and an overview of reinforcement learning and model-free algorithms. The third chapter designs a custom simulation environment using the features of the Gym library. The fourth chapter designs the neural network based controller. The fifth chapter trains the agent in the custom environment and presents the simulation results of the hovering and trajectory tracking tests. The last chapter concludes that a model-free, reinforcement learning based neural network can control a quadrotor without any additional control loop, and discusses possible future work.
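As a minimal sketch of the clipped surrogate objective that gives PPO-clip its name, the Python function below evaluates it for a batch of samples. The function name, the argument names, and the clipping value of 0.2 are illustrative assumptions; the network, hyperparameters, and Gym environment used in the thesis are not reproduced here.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO-clip surrogate objective over a batch (a quantity to maximize).

    All names and the default clip_eps are illustrative assumptions,
    not values taken from the thesis.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, pushing the ratio above 1 + clip_eps earns no further gain, so the update has no incentive to move the policy far from the one that collected the data.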
Description
Thesis (M.Sc.) -- İstanbul Technical University, Graduate School, 2023
Keywords
drone, unmanned aerial vehicle, control theory, orbit control, learning algorithms
Citation