Scene change detection with triplet loss network using self-supervised learning

dc.contributor.advisor Akgül, Tankut
dc.contributor.author Nayır, Burak
dc.contributor.authorID 704211004
dc.contributor.department Computer Science
dc.date.accessioned 2025-05-23T08:48:01Z
dc.date.available 2025-05-23T08:48:01Z
dc.date.issued 2024-07-17
dc.description Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2024
dc.description.abstract Scene transition detection is an important topic in image processing and has attracted considerable attention in recent research. Detecting scene transitions is essential in fields such as video editing, search algorithms, and analytical applications, and demand for automatic scene change detection has grown with the rapid increase in social media content. Existing methodologies for scene transition detection include neural networks, classical audio processing techniques, and image processing algorithms. In this study, we created a CNN model called FraSim, built a new dataset to train it, and combined the model with the classical image processing method Structural Similarity (SSIM). The dataset was created from frames of movie scenes collected over the internet and enriched with scene transitions. It is available in both grayscale and RGB formats and also includes audio. A dedicated algorithm was designed to extract frames and their associated audio during dataset creation, so that only the most notable frames are retained. The frames in the dataset are categorized per scene and per movie. The model was trained with a self-supervised approach using triplet loss and a Siamese network architecture. Triplet loss played a crucial role in improving the model's effectiveness by optimizing the distances between similar and dissimilar samples. This research contributes to automatic video analysis by introducing an approach to scene transition detection that covers both the structure of the training dataset and the architecture of the deep learning model.
FraSim achieved an accuracy of up to 97.84% with the RGB dataset. Combining FraSim with classical image processing techniques in a single system further strengthens the effectiveness of scene transition detection.
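As an illustration of how triplet loss optimizes the distances between similar and dissimilar samples, the following is a minimal sketch of the standard formulation, not the thesis's implementation. The abstract does not state the distance metric, the margin, or the embedding size, so the Euclidean distance, `margin=1.0`, the two-dimensional vectors, and the function names `euclidean` and `triplet_loss` are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: the anchor and positive stand for frames
# from the same scene, the negative for a frame from a different scene.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss is clamped to 0.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapped roles (a violated triplet): 5 - 1 + 1 = 5, so the loss is positive.
loss_hard = triplet_loss(anchor, negative, positive)

print(loss_easy, loss_hard)
```

In a Siamese setup, the three embeddings would come from the same network with shared weights, and minimizing this loss pulls same-scene frames together while pushing frames from different scenes apart.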
dc.description.degree M.Sc.
dc.identifier.uri http://hdl.handle.net/11527/27157
dc.language.iso en_US
dc.publisher Graduate School
dc.sdg.type none
dc.subject Image processing
dc.subject Görüntü işleme
dc.title Scene change detection with triplet loss network using self-supervised learning
dc.title.alternative Üçlü kayıp ağı ile kendi kendine denetimli öğrenme metodu kullanarak sahne geçişlerinin tespiti
dc.type Master Thesis
Files
Original bundle
Name: 704211004.pdf
Size: 12.31 MB
Format: Adobe Portable Document Format