Customer lifetime value prediction and segmentation analysis for commercial customers in the banking industry

dc.contributor.advisor Tuna, Süha
dc.contributor.author Bakır Tartar, Feyza
dc.contributor.authorID 528221082
dc.contributor.department Big Data and Business Analytics
dc.date.accessioned 2025-06-17T08:02:09Z
dc.date.available 2025-06-17T08:02:09Z
dc.date.issued 2024-08-12
dc.description Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2024
dc.description.abstract In Türkiye, the banking sector plays a pivotal role in the growth of the national economy and the maintenance of financial stability. Understanding and evaluating the behavior of corporate customers in this sector requires accurate and comprehensive analysis. Customer Lifetime Value is a critical metric for understanding customer attitudes, so calculating it accurately is crucial. In this thesis, data on the corporate customers of a participation bank operating in Türkiye were used to create Customer Lifetime Value scores using various analytical techniques. In the first stage of the study, two years of data for each customer were collected from the relevant databases and organized on a quarterly basis to calculate Customer Lifetime Value scores. The dataset was checked for missing values, inconsistencies, and errors. During data cleaning, outliers were identified using the z-score method: data points with an absolute z-score greater than 3 were considered outliers. This threshold is a commonly used rule for data assumed to follow a normal distribution and helps minimize the impact of extreme values. Because such points can distort the overall structure of the dataset and degrade model performance, they were removed from the dataset. Before the modeling phase, standard scaling was applied to the data. Scaling is a preprocessing step that eliminates problems arising from features with different units of measurement and puts all features on the same scale. After scaling, Customer Lifetime Value was predicted using five machine learning algorithms: Random Forest, Light Gradient-Boosting Machine, Extreme Gradient Boosting, Elastic-Net, and Linear Regression.
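The preprocessing described above (z-score outlier removal followed by standard scaling) can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `remove_outliers_and_scale`, the choice to drop a row when any single feature exceeds the threshold, and the toy data are all assumptions made for the example.

```python
import numpy as np


def remove_outliers_and_scale(X, threshold=3.0):
    """Drop rows with any |z-score| > threshold, then standardize what remains.

    Hypothetical helper for illustration; the row-wise "any feature" rule
    is an assumption, not something stated in the abstract.
    """
    X = np.asarray(X, dtype=float)

    # z-scores computed per feature (column) on the full dataset
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    z = (X - mean) / std

    # keep only rows whose every feature lies within the threshold
    keep = (np.abs(z) <= threshold).all(axis=1)
    X_clean = X[keep]

    # standard scaling: zero mean and unit variance per feature
    mu = X_clean.mean(axis=0)
    sigma = X_clean.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)
    return (X_clean - mu) / sigma, keep


# Toy data (made up): two features, with one extreme value in the first row.
demo = np.column_stack([np.arange(100.0), np.arange(100.0)[::-1]])
demo[0, 0] = 10_000.0
scaled, kept_mask = remove_outliers_and_scale(demo)
```

In this sketch the scaling statistics are recomputed after the outliers are removed, so the retained rows end up with zero mean and unit variance. In a predictive setting the scaling parameters would normally be estimated on the training split only and then reused on the test split.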
To assess the model results, Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R2, and Adjusted R2 were used. These metrics measure how well the model predictions align with the target variable. In the initial results, the highest R2 score was 0.55, obtained with the Random Forest algorithm, and Light Gradient-Boosting Machine was the second-best algorithm. Parameter optimization was then performed using Grid Search Cross Validation to improve model performance. Grid Search Cross Validation is a technique that evaluates every combination within a specified parameter set to identify the hyperparameters that deliver the best performance. After parameter optimization, the highest R2 value was 0.68, obtained with the Random Forest algorithm, and the second-highest was 0.56, obtained with the Extreme Gradient Boosting algorithm. The results of the Random Forest algorithm, which provided the highest prediction accuracy, were adopted as the basis for the clustering analysis. The K-means clustering algorithm was used to partition the data into meaningful clusters. Graphical analysis using the elbow method, together with a detailed examination of cluster properties, led to the conclusion that five clusters best represented the customer segments. The aim is to increase satisfaction and loyalty among high-value customers by offering special deals, and to activate low-value customers through incentives and promotions.
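The five evaluation metrics named above follow standard definitions and can be computed as in the sketch below. The function name `regression_metrics`, the parameter `n_features` (the number of predictors, needed for Adjusted R2), and the sample values are assumptions for illustration; none of them come from the thesis.

```python
import numpy as np


def regression_metrics(y_true, y_pred, n_features):
    """Return MAE, MSE, RMSE, R2 and Adjusted R2 for a set of predictions.

    Hypothetical helper; n_features is the number of predictors used by
    the model and must satisfy n_samples > n_features + 1.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    residuals = y_true - y_pred

    mae = np.mean(np.abs(residuals))
    mse = np.mean(residuals ** 2)
    rmse = np.sqrt(mse)

    # R2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # Adjusted R2 penalizes R2 for the number of predictors
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2, "adj_r2": adj_r2}


# Made-up values, purely to show the call.
scores = regression_metrics([1, 2, 3, 4, 5], [1, 2, 3, 4, 7], n_features=1)
```

Adjusted R2 is always less than or equal to R2 for the same predictions, and the gap widens as predictors are added without a matching gain in fit, which makes it the more conservative figure when comparing models with different numbers of features.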
dc.description.degree M.Sc.
dc.identifier.uri http://hdl.handle.net/11527/27325
dc.language.iso en_US
dc.publisher Graduate School
dc.sdg.type Goal 9: Industry, Innovation and Infrastructure
dc.subject banking
dc.subject bankacılık
dc.subject big data
dc.subject büyük veri
dc.subject customer lifetime value
dc.subject müşteri yaşam boyu değeri
dc.subject data analytics
dc.subject veri analitiği
dc.title Customer lifetime value prediction and segmentation analysis for commercial customers in the banking industry
dc.title.alternative Bankacılık sektöründeki tüzel müşteriler için müşteri yaşam boyu değeri tahmini ve segmentasyon analizi
dc.type Master Thesis
Files
Original bundle
Name: 528221082.pdf
Size: 2.56 MB
Format: Adobe Portable Document Format