CNN-based server state monitoring and fault diagnosis using infrared thermography

dc.contributor.advisor Erden, Hamza Salih
dc.contributor.advisor Töreyin, Behçet Uğur
dc.contributor.author Wiysobunri, Beltus Nkwawir
dc.contributor.authorID 714446
dc.contributor.department Computer Sciences Programme
dc.date.accessioned 2025-01-24T13:17:35Z
dc.date.available 2025-01-24T13:17:35Z
dc.date.issued 2022
dc.description Thesis (M.Sc.) -- İstanbul Technical University, Graduate School, 2022
dc.description.abstract Over the last few decades, data centers (DCs) have rapidly evolved to become the backbone of some of the world's most critical and prominent sectors, such as banking, health, and information and communication technology (ICT). This rapid growth is driven by the dramatic increase in the number of internet users and the high demand for diverse cloud-based applications such as Big Data, artificial intelligence (AI), and the internet of things (IoT). As a consequence, both the number of DCs and their electricity consumption have risen. This increase has introduced new, complex challenges for DC facilities, such as thermal management, sustaining system reliability, and minimizing server failures. To tackle these challenges, computational fluid dynamics (CFD) models have been proposed in the literature. CFD models can accurately describe DC thermal dynamics and temperature distributions, but they are computationally expensive. The availability of large volumes of data and computational power has made data-driven machine learning (ML) a promising alternative. Data-driven techniques can find complex patterns and relationships between system parameters in data without explicit knowledge of the physical behaviour of the system. However, their performance is limited by several factors, including the type of data, the feature extraction methods, and the choice of algorithm. A hybrid approach that integrates CFD models with data-driven models offers an attractive alternative, but it inherits the drawbacks of both CFD and data-driven ML methods.
In this study, we evaluate, for the first time in the literature, seven state-of-the-art deep pretrained convolutional neural network (CNN)-based architectures and two shallow CNN-based architectures applied to server surface infrared thermography (IRT) images for the automatic diagnosis of five server operating conditions. These conditions are partial processor (CPU) load, maximum CPU load, main fan failure, CPU fan failure, and server entrance blockage. Our approach is a supervised learning approach based on transfer learning and involves two main stages. First, a CNN classifier pretrained on the large ImageNet dataset is used to extract lower-level features. Second, the IRT images are used to fine-tune the higher levels of the CNN classifier. Stratified five-fold cross-validation is used to evaluate the effectiveness and generalization of the shallow and deep model architectures for five data sample split ratios. The results suggest that the CNN architectures achieve high prediction accuracy, with the majority exceeding 98% test accuracy across multiple split ratios. These results are significantly higher than those obtained using a traditional support vector machine classifier trained on handcrafted features. The effectiveness and robustness of the CNN-based algorithms can provide DC operators with an alternative approach to improving the thermal management, energy efficiency, and system reliability of servers in DCs.
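The stratified five-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the class names, the 20-images-per-class dataset size, the seed, and the `stratified_kfold` helper are all assumptions made for illustration, and the five split ratios from the study are not reproduced here. The sketch only shows how each fold can be built so that it keeps the same proportion of every server condition.

```python
import random
from collections import defaultdict

# Hypothetical label names for the five server conditions in the abstract.
CLASSES = [
    "partial_cpu_load",
    "max_cpu_load",
    "main_fan_failure",
    "cpu_fan_failure",
    "entrance_blockage",
]


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle each class, then deal its samples round-robin into k folds.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for f, fold in enumerate(folds) if f != i for idx in fold
        )
        yield train_idx, test_idx


# Assumed toy dataset: 20 thermal images per class (labels only).
labels = [c for c in CLASSES for _ in range(20)]
splits = list(stratified_kfold(labels, k=5, seed=0))
```

In practice a library routine such as scikit-learn's `StratifiedKFold` would typically be used instead; the hand-written version only makes the per-class dealing step explicit.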
dc.description.degree M.Sc.
dc.identifier.uri http://hdl.handle.net/11527/26283
dc.language.iso en
dc.publisher Graduate School
dc.sdg.type Goal 9: Industry, Innovation and Infrastructure
dc.subject Computer Engineering
dc.subject Computer Science and Control
dc.subject data centers
dc.subject thermal dynamics
dc.title CNN-based server state monitoring and fault diagnosis using infrared thermography
dc.title.alternative Kızılötesi termografi kullanarak CNN tabanlı sunucu durumu izleme ve arıza teşhisi
dc.type Master Thesis