Deep reinforcement learning for partially observable Markov decision processes

dc.contributor.advisor: Temeltaş, Hakan
dc.contributor.author: Haklıdır, Mehmet
dc.contributor.authorID: 504102110
dc.contributor.department: Control and Automation Engineering
dc.date.accessioned: 2023-12-26T06:28:02Z
dc.date.available: 2023-12-26T06:28:02Z
dc.date.issued: 2022-07-19
dc.description: Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2022
dc.description.abstract: Deep reinforcement learning has recently gained popularity owing to its many successful real-world applications in robotics and games. Conventional reinforcement learning faces a substantial challenge in developing effective algorithms for high-dimensional environments; using deep learning as a function approximator in reinforcement learning is a viable way to overcome this challenge. In deep reinforcement learning, however, the environment is typically assumed to be fully observable, meaning that the agent perceives the true state of the environment and can therefore act appropriately in the current state. Most real-world problems are partially observable, and the environment models are unknown. There is therefore a significant need for reinforcement learning approaches that solve such problems, in which the agent perceives the state of the environment only partially and noisily. Guided reinforcement learning methods address this issue by providing additional state knowledge to the reinforcement learning algorithm during training, allowing it to solve a partially observable Markov decision process (POMDP) more effectively. However, these guided approaches are relatively rare in the literature, and most existing ones are model-based, meaning that an appropriate model of the environment must be learned first. In this thesis, we present a novel model-free approach that combines the soft actor-critic method with supervised learning to solve real-world problems formulated as POMDPs. We evaluated our approach on modified, partially observable MuJoCo tasks. In experiments performed on OpenAI Gym, an open-source simulation platform, our guided soft actor-critic approach outperformed the baseline algorithms, gaining 7-20% higher maximum average return on five partially observable tasks constructed from continuous control problems and simulated in MuJoCo. To address the autonomous driving problem, we focused on decision making under uncertainty, formulated as a partially observable Markov decision process, using our guided soft actor-critic approach. A self-driving car was trained in a simulation environment created in MATLAB/SIMULINK for a scenario in which it encounters a pedestrian crossing the road. Experiments demonstrate that the agent exhibits the desired control behavior and performs close to the fully observable case under various uncertainty conditions.
dc.description.degree: Ph.D.
dc.identifier.uri: http://hdl.handle.net/11527/24262
dc.language.iso: en_US
dc.publisher: Graduate School
dc.sdg.type: Goal 9: Industry, Innovation and Infrastructure
dc.subject: reinforcement learning
dc.subject: pekiştirmeli öğrenme
dc.subject: markov decision processes
dc.subject: markov karar süreçleri
dc.subject: robots
dc.subject: robotlar
dc.title: Deep reinforcement learning for partially observable Markov decision processes
dc.title.alternative: Kısmi gözlemlenebilir Markov karar süreçleri için derin pekiştirmeli öğrenme
dc.type: Doctoral Thesis
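The abstract above describes constructing partially observable variants of standard MuJoCo continuous-control tasks in OpenAI Gym. As a rough illustration of how such a task can be built, the following is a minimal Python sketch under the assumption that partial observability is induced by masking part of the observation vector; the class name PartialObservationWrapper and the chosen indices are illustrative, not necessarily the thesis's exact construction.

import numpy as np
import gym


class PartialObservationWrapper(gym.ObservationWrapper):
    """Hides selected observation dimensions (e.g., velocities) so that a
    fully observable MuJoCo control task becomes partially observable.
    Illustrative sketch only; not the exact construction used in the thesis."""

    def __init__(self, env, visible_dims):
        super().__init__(env)
        self.visible_dims = np.asarray(visible_dims)
        # Shrink the observation space to the visible dimensions only.
        low = env.observation_space.low[self.visible_dims]
        high = env.observation_space.high[self.visible_dims]
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        # The agent observes only the selected dimensions; the full state
        # remains available to a privileged, supervised guiding signal during training.
        return obs[self.visible_dims].astype(np.float32)


# Example (indices assumed for illustration): keep only the 8 joint-position
# components of HalfCheetah's 17-dimensional observation, hiding all velocities.
env = PartialObservationWrapper(gym.make("HalfCheetah-v2"), visible_dims=list(range(8)))

A guided agent could then be trained on the wrapped environment while the unwrapped full state is exposed only to an auxiliary supervised signal during learning; the precise guidance mechanism used in the thesis is defined in the full text.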

Files

Original bundle

Name: 504102110.pdf
Size: 3.68 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.58 KB
Format: Item-specific license agreed upon to submission