An affective framework for brain computer interfaces using transfer learning in virtual environments
Files
Date
2024-12-13
Authors
Sarıkaya, Mehmet Ali
Journal Title
Journal ISSN
Volume Title
Publisher
Graduate School
Abstract
The significance of emotion recognition through physiological signals is increasingly acknowledged across diverse fields such as psychology, healthcare, and human-computer interaction. Physiological signals, including electroencephalography (EEG), electromyography (EMG), electrooculography (EOG), electrodermal activity (EDA), galvanic skin response (GSR), skin temperature (SKT), respiration (RESP), blood volume pulse (BVP), heart rate (HR), and eye movements, offer a viable alternative to facial recognition systems in Virtual Reality (VR) environments, where traditional methods fall short because the VR headset occludes the face. This has led to growing interest in using these signals to infer users' emotional states, thereby enhancing their interaction within virtual environments.

Despite these promising prospects, developing physiological signal-based affective systems involves considerable challenges. One major issue is the high variability of these signals, which is influenced by individual-specific factors such as mood and stress levels. This variability necessitates collecting large amounts of data, a process that is both time-consuming and costly. Furthermore, psychological patterns are transient, so classifier performance declines over time and frequent recalibration becomes necessary.

To mitigate these issues, this thesis adopts transfer learning strategies, which have proved successful in other domains such as image recognition. Transfer learning leverages pre-existing models and datasets, reducing the need for extensive new data collection and enabling models to be adapted to new tasks with minimal additional training. This approach saves time while improving the accuracy and efficiency of emotion recognition systems (a minimal fine-tuning sketch appears below).

One focal point of the thesis is the calibration process in current Brain-Computer Interface (BCI) systems, particularly EEG-based ones. These systems typically require long calibration times because they depend heavily on data accumulated across numerous training sessions. The thesis argues for adaptive algorithms that significantly cut calibration time, making BCI systems more practical and accessible for real-world applications.

The thesis also examines the distinction between subject-specific and subject-independent models in emotion recognition. Subject-specific models offer high accuracy but tend to overfit to limited data, which severely restricts their generalization. Subject-independent models, designed to be more general, often fail to capture the individual nuances crucial for personalized emotion recognition. This dichotomy underscores the difficulty of modeling complex emotion-recognition mechanisms from subject-specific data alone.

Finally, the thesis addresses the need for specialized algorithms that can handle the unique dynamics of 3D immersive virtual environments. Traditional 2D emotion recognition systems do not account for the sense of immersion, presence, and depth integral to VR applications, necessitating algorithms tailored specifically to these environments.
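To make the transfer-learning idea concrete, here is a minimal sketch of adapting a classifier pre-trained on source subjects using one short subject-specific calibration session. The architecture, the 160-dimensional input (e.g., pre-extracted EEG band-power features), the checkpoint name, and all hyperparameters are illustrative assumptions, not the models used in the thesis.

```python
# Hedged transfer-learning sketch: adapt a pre-trained emotion
# classifier to a new user with a short calibration session.
# All names, dimensions, and hyperparameters are illustrative.
import torch
import torch.nn as nn

class EmotionClassifier(nn.Module):
    def __init__(self, n_features=160, n_classes=4):
        super().__init__()
        # Backbone assumed pre-trained on many source subjects.
        self.backbone = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
        )
        self.head = nn.Linear(32, n_classes)  # re-fit per target subject

    def forward(self, x):
        return self.head(self.backbone(x))

model = EmotionClassifier()
# model.load_state_dict(torch.load("pretrained_source.pt"))  # hypothetical checkpoint

# Freeze the shared backbone so only the small head is adapted;
# this is what lets a few minutes of calibration data suffice.
for p in model.backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

calib_x = torch.randn(32, 160)        # stand-in: one short calibration session
calib_y = torch.randint(0, 4, (32,))  # stand-in emotion labels

for _ in range(20):  # brief fine-tuning loop
    optimizer.zero_grad()
    loss = loss_fn(model(calib_x), calib_y)
    loss.backward()
    optimizer.step()
```

Freezing the backbone and re-fitting only the small head is one common way transfer learning shortens calibration; the thesis's own modules are more elaborate.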
In response to these challenges, the thesis introduces a novel Heterogeneous Adversarial Transfer Learning (HATL) module designed to synthesize EEG data from multimodal non-EEG inputs. This module significantly reduces calibration durations and enhances the adaptability and performance of the system across different VR settings, paving the way for more agile and responsive emotion recognition systems; a hedged sketch of the adversarial idea follows this summary.

Concurrently, the thesis implements a Knowledge Distillation (KD) strategy to effectively amalgamate and utilize multimodal data. This approach significantly improves the accuracy and generalization of emotion recognition models, making them suitable for both subject-specific and subject-independent applications. By leveraging the strengths of both EEG and non-EEG data, the KD method facilitates a deeper understanding of emotional states that transcends individual variance; a corresponding distillation sketch also follows.

The proposed framework integrates the HATL and KD modules to address the dual needs of rapid calibration in subject-specific scenarios and enhanced generalizability in subject-independent applications. This dual-module setup is a core contribution of the thesis and represents a significant advance in emotion recognition.

The efficacy of the framework is demonstrated through extensive empirical testing, which confirms that the models not only perform well in controlled environments but also adapt effectively to real-world VR scenarios. These results are crucial for applications that require rapid and precise emotion assessment, such as personalized therapeutic interventions and adaptive educational systems. The integration of adversarial learning and knowledge distillation in a unified framework has the potential to revolutionize emotion recognition technology, especially in VR environments, where quickly and accurately assessing emotional states improves user interaction and system responsiveness across a broad range of practical scenarios.

Furthermore, the thesis provides a comprehensive analysis of the proposed models in both 2D and 3D environments. Extensive comparisons establish their superior performance in immersive VR settings compared with traditional 2D setups, validating the proposed approaches and highlighting their potential to bridge the gap between traditional emotion recognition methods and the requirements of immersive VR technologies.

In conclusion, the thesis presents a robust and adaptable framework that sets a new benchmark for the practical application of BCIs in immersive virtual environments. By addressing the limitations of current systems and harnessing advanced machine learning techniques, the proposed framework significantly advances the field of emotion recognition, overcoming the challenges posed by high signal variability and transient psychological patterns while opening new avenues for research and development. The thesis closes with a discussion of the implications of these findings for future research, suggesting areas where further advances in technology and methodology could improve the robustness and applicability of emotion recognition systems.
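The abstract names the HATL module without detailing its internals, so the following is a hedged, GAN-style reading of the adversarial idea: a generator maps concatenated non-EEG features (e.g., EDA, BVP, SKT) to EEG-like feature vectors while a discriminator separates real from synthesized EEG features. Dimensions, modules, and training details are assumptions for illustration, not the thesis's design.

```python
# Hedged adversarial-synthesis sketch: learn to generate EEG-like
# features from peripheral (non-EEG) signal features.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Linear(64, 160))  # non-EEG -> EEG-like
D = nn.Sequential(nn.Linear(160, 64), nn.ReLU(), nn.Linear(64, 1))   # real vs. synthetic

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

non_eeg = torch.randn(32, 24)    # stand-in peripheral-signal features
real_eeg = torch.randn(32, 160)  # stand-in paired real EEG features

for _ in range(100):
    # Discriminator step: push real EEG toward 1, synthesized toward 0.
    fake_eeg = G(non_eeg).detach()
    d_loss = (bce(D(real_eeg), torch.ones(32, 1)) +
              bce(D(fake_eeg), torch.zeros(32, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: fool the discriminator, so synthesized features
    # become usable when real EEG is scarce (shorter calibration).
    g_loss = bce(D(G(non_eeg)), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```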
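The KD strategy can likewise be illustrated with a standard teacher-student sketch in the style of Hinton et al.: a multimodal teacher supervises a compact student through temperature-softened logits. The architectures, the temperature, and the loss weighting below are illustrative assumptions, not the thesis's configuration.

```python
# Hedged knowledge-distillation sketch: a multimodal teacher's soft
# predictions guide a smaller student alongside hard labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(184, 128), nn.ReLU(), nn.Linear(128, 4))  # assumed pre-trained
student = nn.Sequential(nn.Linear(184, 32), nn.ReLU(), nn.Linear(32, 4))    # compact model

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 4.0, 0.7  # softening temperature and distillation weight (assumed)

x = torch.randn(64, 184)        # stand-in fused EEG + non-EEG features
y = torch.randint(0, 4, (64,))  # stand-in emotion labels

for _ in range(50):
    with torch.no_grad():
        t_logits = teacher(x)  # soft targets (teacher weights are stand-ins here)
    s_logits = student(x)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1),
                  reduction="batchmean") * T * T  # rescale softened-logit gradients
    ce = F.cross_entropy(s_logits, y)             # hard-label supervision
    loss = alpha * kd + (1 - alpha) * ce
    opt.zero_grad(); loss.backward(); opt.step()
```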
The potential to integrate these systems with other technologies, such as mixed reality, points toward a more interconnected and responsive future in human-computer interaction. The contributions of this thesis are expected to have a lasting impact on the field of emotion recognition, particularly in the context of VR. By improving both the theoretical understanding and practical application of BCIs in virtual environments, this work paves the way for more personalized and immersive user experiences. The proposed models offer a promising direction for future research, with the potential to further refine and expand the capabilities of emotion recognition systems across a wide range of applications.
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2024
Keywords
brain computer interfaces