Scene change detection with triplet loss network using self-supervised learning
dc.contributor.advisor | Akgül, Tankut | |
dc.contributor.author | Nayır, Burak | |
dc.contributor.authorID | 704211004 | |
dc.contributor.department | Computer Science | |
dc.date.accessioned | 2025-05-23T08:48:01Z | |
dc.date.available | 2025-05-23T08:48:01Z | |
dc.date.issued | 2024-07-17 | |
dc.description | Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2024 | |
dc.description.abstract | Scene transition detection, one of the most critical topics in image processing, has attracted considerable attention in recent research. Detecting scene transitions is essential in various fields, including video editing, search algorithms, and analytical applications. Demand for automatic scene change detection has grown, especially with the rapid increase in social media content. Methodologies for scene transition detection include neural networks, classical audio processing techniques, and image processing algorithms. In this study, we created a CNN model called FraSim and a new dataset to train it, and combined the model with the classical image processing method Structural Similarity (SSIM). The dataset was created by enriching scene transitions with frames taken from movie scenes collected over the internet. The same dataset is available in both grayscale and RGB format and also includes audio. A dedicated algorithm was designed to extract frames and their associated audio during dataset creation, ensuring that only the most notable frames are retained. The frames in the dataset were categorized per scene and per movie. The model was trained using a self-supervised approach, for which we used triplet loss and a Siamese network architecture. Triplet loss, in particular, played a crucial role in improving the model's effectiveness by optimizing distance measurements between similar and dissimilar samples. This research contributes to the field of automatic video analysis by introducing a new approach to scene transition detection that encompasses both the structure of the training dataset and the architecture of the deep learning model. 
An accuracy of up to 97.84% was achieved using FraSim with the RGB dataset. Integrating classical image processing techniques with FraSim further strengthens the effectiveness of scene transition detection. | |
dc.description.degree | M.Sc. | |
dc.identifier.uri | http://hdl.handle.net/11527/27157 | |
dc.language.iso | en_US | |
dc.publisher | Graduate School | |
dc.sdg.type | none | |
dc.subject | Image processing | |
dc.subject | Görüntü işleme | |
dc.title | Scene change detection with triplet loss network using self-supervised learning | |
dc.title.alternative | Üçlü kayıp ağı ile kendi kendine denetimli öğrenme metodu kullanarak sahne geçişlerinin tespiti | |
dc.type | Master Thesis |
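
The abstract describes triplet loss as optimizing distances between similar and dissimilar samples. A minimal sketch of the standard triplet loss formulation follows, for illustration only. It is not FraSim's implementation: the Euclidean distance, the `margin` value, the `threshold` value, and the helper `is_scene_change` are all assumptions that the record does not specify, and the embeddings are plain lists standing in for the output of a CNN.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(d(a, p) - d(a, n) + margin, 0).

    anchor and positive would be embeddings of frames from the same scene,
    negative an embedding of a frame from a different scene.
    The margin value here is an assumed placeholder.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


def is_scene_change(emb_prev, emb_next, threshold=0.5):
    """Hypothetical decision rule: flag a transition when consecutive
    frame embeddings are farther apart than an assumed threshold."""
    return euclidean(emb_prev, emb_next) > threshold


if __name__ == "__main__":
    a, p, n = [0.0, 0.0], [0.1, 0.0], [3.0, 4.0]
    print(triplet_loss(a, p, n))   # loss is zero once the negative is far enough away
    print(is_scene_change(a, n))
```

The loss reaches zero once the negative sample is at least `margin` farther from the anchor than the positive sample, which is what pushes same-scene frames together and different-scene frames apart in the embedding space.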