Reinforcement learning in fighting games

Date
2022
Authors
Uğursoy, Muhammet Sadık
Journal Title
Journal ISSN
Volume Title
Publisher
Graduate School
Abstract
Reinforcement learning is one of the most popular learning methods used in games because of its natural fit with competitive play. Winning the game, and the means of winning it, can easily be used as rewards, which lets us create a reasonable benchmark. The field has many algorithms and approaches that can solve simple Atari games and robotic problems; however, it still has many unexplored areas with difficult problems to solve. After the introduction of Deep Q-Learning (DQN) by DeepMind, learning from pixel data became popular and was applied to many other games. Agents could reach and exceed human-level play in simple games. But for more complex games like Montezuma's Revenge, different approaches such as hierarchical DQN are needed to search the game's huge search space. Furthermore, the classical strategy of a +1 reward for a win and a -1 reward for a loss is not always enough for complex games. As the complexity increases, the algorithm and model should change and adapt. Even though Atari games look simple, they are hard problems for an AI agent to solve. The most recent work on Atari games, published in 2020, claims to outperform humans on all Gym Atari games. However, there are still many difficult games to solve that require novel approaches. The work in this thesis focuses on reaching human-like play in the end-stage boss fight of the game Megaman X. Existing RL algorithms have been tested with different replay buffer types, parameters and exploration strategies, and their performances were compared. To allow a better comparison of the algorithms, a simple game, Super Mario World, and a fighting game with characteristics similar to the main game, Ultimate Mortal Kombat, have been tested as well. We proposed new game-specific methods to make the agent play better, including reward shaping and feature extraction methods. This thesis presents all the results of those training runs and analyses them.
To obtain better training results, reward shaping and feature extraction methods have been proposed and tested. For feature extraction, CNN-based methods and auto-encoder frameworks were tested, along with data read directly from RAM, such as character and enemy positions. Reward shaping was applied to the main focus of this thesis, the game Megaman X. Variables such as the weapon's charge status, the distance between the enemy and the agent, health and time are used as reward shaping parameters. Both Q-learning and policy gradient methods are tested. In addition, the latest exploration-focused methods and hierarchical methods, which are said to enhance exploration, are tested. Human players who are familiar with platform games also played the game, and their experiences were recorded in a survey. This thesis analyses all of those methods and results.
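As an illustration of the shaped-reward idea described above, the sketch below combines the variables the abstract names (enemy and agent health, weapon charge, distance, time) into a dense per-step reward on top of the sparse win/loss signal. All field names, weights and thresholds here are illustrative assumptions, not the exact values used in the thesis.

```python
def shaped_reward(prev, curr, won=None):
    """Dense reward for one step of the boss fight.

    `prev` and `curr` are dicts of values read from emulator RAM
    (health, positions, weapon charge). Weights are hypothetical.
    """
    r = 0.0
    # Reward damaging the boss; penalize taking damage.
    r += 1.0 * (prev["enemy_hp"] - curr["enemy_hp"])
    r -= 1.0 * (prev["agent_hp"] - curr["agent_hp"])
    # Small bonus for holding a charged weapon.
    r += 0.01 * curr["charge"]
    # Penalize standing too close to the enemy.
    if abs(curr["agent_x"] - curr["enemy_x"]) < 32:
        r -= 0.05
    # Small time penalty every step to encourage finishing quickly.
    r -= 0.001
    # Classical sparse terminal reward on top of the shaped signal.
    if won is True:
        r += 1.0
    elif won is False:
        r -= 1.0
    return r
```

In practice such a shaped signal is added to the environment's step function so that value-based (DQN) and policy gradient agents alike see informative feedback long before the episode ends.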
Description
Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2022
Keywords
Reinforcement learning, Fighting games, Deep learning
Citation