LEE- Bilgisayar Mühendisliği-Yüksek Lisans
Konu "artificial intelligence" ile LEE- Bilgisayar Mühendisliği-Yüksek Lisans'a göz atma
Item: A graph neural network model with adaptive weights for session-based recommendation systems (Graduate School, 2024-07-02) Özbay, Begüm ; Öğüdücü Gündüz, Şule ; Tugay, Resul ; 504211508 ; Computer Engineering

The development of artificial intelligence (AI) and machine learning (ML) models has revolutionized various industries, leading to the widespread adoption of predictive models. E-commerce platforms in particular have benefited greatly from these advances, especially through recommendation systems. These systems play a crucial role in enhancing user experience and increasing sales by personalizing the shopping journey. By suggesting products tailored to individual users' interests and preferences, recommendation systems not only improve customer satisfaction but also strengthen user loyalty to the platform.

Recommendation systems fall into several categories. Content-based systems offer new suggestions based on the characteristics of items a user has previously liked or interacted with. Collaborative filtering systems make recommendations based on users' past behavior and the preferences of similar users. Hybrid systems combine both approaches to provide more accurate and personalized recommendations. There are also more specialized systems tailored to specific needs or domains. Among these, session-based recommendation systems have proven particularly effective because they evaluate users' shopping behavior and provide timely suggestions. Unlike traditional systems that rely on long-term user history, session-based systems analyze users' actions during the current session to offer real-time recommendations. By considering users' immediate preferences and updating dynamically, these systems significantly improve recommendation quality. For instance, if a user explores a specific category or adds an item to their cart, a session-based system can suggest relevant products based on that activity. This real-time adaptability further personalizes the user experience and encourages users to spend more time on the platform, increasing the likelihood of successful recommendations even with short-term shopping histories.

In this study, we propose an adaptive weighting mechanism applied to graph neural network (GNN) item vectors for session-based recommendation. The goal is to improve the predictive accuracy of an existing session-based model, SR-GNN (Session-based Recommendation with Graph Neural Networks), by incorporating various types of contextual information obtained during the session. The traditional SR-GNN model focuses on the user's last interaction in a session and evaluates its relationships with the other items; to increase the model's effectiveness, however, the importance of each item must be determined dynamically. The adaptive weighting mechanism evaluates each interaction in the session individually and optimizes its influence on the model, assigning each item an importance level that is adjusted dynamically according to the user's immediate preferences and interactions.
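As an illustration of this per-item weighting idea, the following is a minimal PyTorch sketch of a soft-attention readout that assigns each session item a learned importance weight before pooling. The class name, layer shapes, and attention form are assumptions for exposition, not the thesis implementation.

```python
# Illustrative sketch only: per-item adaptive weights over GNN-refined
# session item embeddings, pooled into a single session representation.
import torch
import torch.nn as nn


class AdaptiveSessionReadout(nn.Module):
    """Weights each interaction in a session and pools the weighted vectors."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.w_last = nn.Linear(hidden_size, hidden_size, bias=False)  # query from last click
        self.w_item = nn.Linear(hidden_size, hidden_size, bias=False)  # key for each item
        self.score = nn.Linear(hidden_size, 1, bias=False)             # scalar importance

    def forward(self, item_vecs: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # item_vecs: (batch, seq_len, hidden) GNN-refined item embeddings
        # mask:      (batch, seq_len), 1 for real items, 0 for padding
        last_idx = mask.sum(dim=1).long() - 1                        # position of last interaction
        last = item_vecs[torch.arange(item_vecs.size(0)), last_idx]
        weights = self.score(torch.sigmoid(self.w_last(last).unsqueeze(1) + self.w_item(item_vecs)))
        weights = weights.masked_fill(mask.unsqueeze(-1) == 0, 0.0)  # ignore padded positions
        return (weights * item_vecs).sum(dim=1)                      # weighted session vector


# Example usage: 2 sessions, up to 5 items each, 64-dimensional embeddings
readout = AdaptiveSessionReadout(hidden_size=64)
vecs = torch.randn(2, 5, 64)
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
session_repr = readout(vecs, mask)  # shape: (2, 64)
```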
By implementing the weighting mechanism, the significance of each interaction during the session is taken into account, allowing for a more in-depth analysis of user behavior. Each action a user performs during the session becomes a crucial source of contextual information for the next recommendation, and these contextual details are used to increase the accuracy of the model. By focusing on users' last actions and evaluating the relationships between similar actions, the weighting mechanism strengthens the recommendation system.

Experimental evaluations on the Dressipi dataset demonstrate the effectiveness of the proposed approach in enhancing user experience compared to traditional models. The ability to provide accurate and relevant recommendations in real time is key to improving user satisfaction and increasing sales on e-commerce platforms. The results indicate that the adaptive weighting strategy significantly outperforms the SR-GNN model. Moreover, the strategy is particularly effective in addressing the cold start problem, providing more accurate recommendations for new users and newly added products.

Future research aims to explore the scalability of session-based recommendation systems to larger datasets and more complex recommendation scenarios. As data volumes continue to grow, developing models that can efficiently manage this influx while maintaining high performance becomes increasingly important. Enhancing the scalability and robustness of these models will be critical for their widespread adoption and effectiveness in diverse e-commerce environments. Ongoing advances in AI and ML techniques are also expected to yield further improvements in recommendation algorithms, making them even more precise and responsive to users' needs.

In conclusion, session-based recommendation systems represent a significant advance in e-commerce, offering a sophisticated and adaptive approach to product recommendations. By leveraging AI and ML, these systems can analyze user behavior in real time and provide personalized suggestions that enhance the overall shopping experience. As research continues, the potential for further improvements in these systems is vast, promising even greater benefits for both users and e-commerce platforms. The integration of session context and advanced algorithms will play a pivotal role in shaping the future of recommendation systems, driving user engagement, and increasing sales in the competitive world of e-commerce.
Item: Advanced techniques and comprehensive analysis in speech emotion recognition using deep neural networks (Graduate School, 2024-07-01) Yetkin, Ahmet Kemal ; Köse, Hatice ; 504201506 ; Computer Engineering

The rapid advancement of artificial intelligence technologies has resulted in significant progress in human-computer interaction (HCI) and related fields. In HCI, the ability of machines to perceive and understand users' emotional states in real time is crucial for enhancing the user experience, as accurate recognition of emotions enables machines to provide more personalized and effective services. Over the past fifty years, research on speech recognition and speech emotion recognition (SER) has made considerable strides, continuously expanding the knowledge base in this area.

Speech is one of the fundamental elements of human communication and offers rich information about the speaker's emotional state. Changes in tone, speed, emphasis, and pitch play significant roles in reflecting the speaker's emotions, so analyzing speech can provide deeper insight into the speaker's feelings, thoughts, and intentions. It is widely accepted that the human voice is the primary instrument for emotional expression and that tone of voice is the oldest and most universal form of communication. In this context, the ability of machines to interpret these tones can greatly enhance the performance of HCI systems.

Recognizing emotion from speech is a significant research area in affective computing. The task is challenging because emotions are highly personal and even humans can find them difficult to interpret accurately. Speech emotion recognition has numerous practical applications, including emotion-aware HCI systems, traffic problem-solving, robotics, and mental health diagnosis and therapy. For instance, in customer service systems or mobile communication, a customer's emotional state can be inferred from their tone of voice and used to provide better service. In educational support systems, it can help improve children's socio-emotional skills and academic abilities. Recognizing emotions from speech can also provide early warnings for drivers who are excessively nervous or angry, reducing the likelihood of traffic accidents. Moreover, such systems hold great potential for individuals who struggle to express their emotions, such as children with autism spectrum disorder (ASD).

This study aims to develop a method for detecting emotions from speech and to use it to improve the performance of existing SER systems. Various feature extraction methods were evaluated to identify the most distinctive voice characteristics for recognizing emotions, including Mel Frequency Cepstral Coefficients (MFCC), Mel spectrogram, Zero-Crossing Rate (ZCR), and Root Mean Square Energy (RMSE). The extracted features were used with deep learning models: first, the features were converted into two-dimensional images and used to fine-tune pre-trained networks; they were then trained on a one-dimensional convolutional neural network (CNN) architecture; finally, a combined CNN and Long Short-Term Memory (LSTM) model was used. Throughout this research, critical questions were addressed, such as whether speech features can accurately detect human emotional states and which feature extraction method performs best in the literature.
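As a rough illustration of the feature extraction step described above, the snippet below uses librosa to compute MFCC, Mel spectrogram, ZCR, and RMSE features for one audio file. The file path, sampling rate, and time-averaging strategy are assumptions for illustration, not the thesis pipeline.

```python
# Illustrative sketch only: extracting the speech features named above.
import librosa
import numpy as np

# Hypothetical input file and sampling rate
y, sr = librosa.load("speech_sample.wav", sr=16000)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)            # (40, frames)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)  # (128, frames)
mel_db = librosa.power_to_db(mel, ref=np.max)                 # log-Mel "image" for 2D CNNs
zcr = librosa.feature.zero_crossing_rate(y)                   # (1, frames)
rmse = librosa.feature.rms(y=y)                               # (1, frames)

# One common option for 1D models: average over time for a fixed-length vector
features = np.concatenate([mfcc.mean(axis=1), zcr.mean(axis=1), rmse.mean(axis=1)])
print(features.shape)  # (42,)
```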
The study specifically examined the impact of various feature extraction methods, including MFCC, Mel spectrogram, Chroma, Root Mean Square Energy (RMSE), and Zero-Crossing Rate (ZCR). It also explored how different image representations of the MFCC and Mel-spectrogram features affect accuracy and overall model performance. Additionally, the study aimed to determine which pre-trained model among VGG16, VGG11_bn, ResNet-18, ResNet-101, AlexNet, and DenseNet performs best when fine-tuned. The impact of audio data augmentation methods on test results was evaluated, analyzing how increasing and diversifying the dataset affects the overall accuracy and robustness of the models. By addressing these questions, this research aims to contribute to the development of more accurate and robust speech emotion recognition systems.
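To make the fine-tuning step concrete, the following is a small torchvision sketch in which a pre-trained ResNet-18 backbone is frozen and its final layer replaced with a new emotion-classification head. The number of emotion classes, hyperparameters, and dummy batch are assumptions; the thesis additionally compares VGG16, VGG11_bn, ResNet-101, AlexNet, and DenseNet, which are not shown here.

```python
# Illustrative sketch only: fine-tuning a pre-trained CNN on spectrogram images.
import torch
import torch.nn as nn
from torchvision import models

NUM_EMOTIONS = 7  # hypothetical number of emotion classes

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                                # freeze the pre-trained backbone
model.fc = nn.Linear(model.fc.in_features, NUM_EMOTIONS)       # new classification head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch of spectrogram "images" (3 x 224 x 224)
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_EMOTIONS, (8,))

optimizer.zero_grad()
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```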