Publication:
Comparative study of federated learning for credit risk assessment and fairness evaluation

Publisher

Graduate School

Abstract

In this study, we propose a horizontal federated learning based risk model for credit card default prediction that addresses data privacy requirements and fairness concerns. In fields such as finance, healthcare and IoT, where collecting data on central servers is becoming increasingly difficult, federated learning lets participants share knowledge by exchanging model parameters while sensitive user data remains on the local device. Sharing only model parameters instead of the full data also reduces the communication load. We used three datasets: the Default of Credit Card Clients (DCCC) dataset, with 30,000 samples, 23 features and a 22.1% default rate; the Home Credit Default Risk (HCDR) dataset, with 356,255 samples, 122 features and an 8.1% default rate; and the Home Equity (HMEQ) dataset, with 5,960 samples, 12 features and a 19.9% default rate. Preprocessing consisted of scaling with StandardScaler and MinMaxScaler, categorical encoding with One-Hot Encoding, SMOTE and RandomUnderSampler for class imbalance, and PCA and SelectKBest for feature reduction. Five machine learning algorithms were compared: Logistic Regression, Multilayer Perceptron, SVM, XGBoost and Random Forest. We built the federated learning infrastructure with the Flower framework and, in the simulation environment, used DirichletPartitioner to create five clients with a non-IID data distribution controlled by the α parameter. Each client's data was divided into 70% training, 10% validation and 20% test subsets, and preprocessing was performed locally. We designed two experiments: a performance comparison and a fairness analysis. In the first experiment, we tested different combinations of model, data sampling method and feature reduction method for each dataset.
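
The non-IID client split described above can be illustrated with a minimal sketch of label-based Dirichlet partitioning. This is not the thesis code and does not call Flower's DirichletPartitioner; the function name `dirichlet_partition`, the toy labels, the seed and the value `alpha=0.5` are assumptions chosen only for illustration. Smaller α values tend to give each client a more skewed class mix, while larger values approach an IID split.

```python
import numpy as np


def dirichlet_partition(labels, num_clients=5, alpha=0.5, seed=0):
    """Split sample indices across clients with label-skewed (non-IID) shares.

    Hypothetical helper for illustration; not the Flower API.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class before cutting them up.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Share of this class that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn the shares into cut points within the shuffled indices.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return [np.array(sorted(ix), dtype=int) for ix in client_indices]


# Toy labels with roughly the 22.1% default rate reported for DCCC (assumption).
toy_rng = np.random.default_rng(42)
toy_labels = (toy_rng.random(1000) < 0.221).astype(int)
parts = dirichlet_partition(toy_labels, num_clients=5, alpha=0.5)
for cid, ix in enumerate(parts):
    rate = toy_labels[ix].mean() if len(ix) else float("nan")
    print(f"client {cid}: n={len(ix)}, default rate={rate:.3f}")
```

Every sample is assigned to exactly one client, so the five parts together cover the whole dataset; only the per-client class mix changes with α.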
The performance comparison comprised a total of 405 different experiments across the three datasets; each experiment was repeated 10 times and the results were averaged to make them more consistent. Here we tested the F1 Score Weighted Aggregation method (FedF1), which we developed as an alternative to FedAvg. According to our results, the federated aggregation methods generally achieved good results, comparable to those of centralized training, while preserving data privacy. FedF1 was competitive with FedAvg and in some cases performed better, producing more consistent results with lower variance. For the fairness analysis, we created a gender-dependent "Interest" feature on DCCC and distributed the data as non-IID among five clients with different gender ratios. To make the results easier to interpret, we compared three scenarios: Local, Central and FairFed. The Equal Opportunity Difference (EOD) values for the Local, Central and FairFed scenarios were 0.1444, 0.1332 and 0.0283, respectively; the reduction from 0.1444 (Local) to 0.0283 (FairFed) corresponds to a fairness improvement of over 80%.
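
The idea behind an F1-weighted alternative to FedAvg can be sketched as follows. The abstract does not give the exact FedF1 rule, so this sketch assumes that each client's contribution is proportional to an F1 score it reports (for example, on its validation split); the function name `f1_weighted_average`, the uniform fallback and the toy parameters are hypothetical. FedAvg, by contrast, weights clients by their number of local training samples.

```python
import numpy as np


def f1_weighted_average(client_params, client_f1):
    """Average per-layer parameters, weighting each client by its F1 score.

    Assumed rule for illustration; the thesis may define FedF1 differently.
    """
    f1 = np.asarray(client_f1, dtype=float)
    if f1.sum() <= 0:
        # Fall back to a uniform average if every client reports F1 = 0.
        weights = np.full(len(f1), 1.0 / len(f1))
    else:
        weights = f1 / f1.sum()
    num_layers = len(client_params[0])
    return [
        sum(w * params[layer] for w, params in zip(weights, client_params))
        for layer in range(num_layers)
    ]


# Two hypothetical clients, each with one weight matrix and one bias vector.
client_a = [np.ones((2, 2)), np.zeros(2)]
client_b = [3 * np.ones((2, 2)), np.ones(2)]
global_params = f1_weighted_average([client_a, client_b], client_f1=[0.25, 0.75])
print(global_params[0])  # every entry is 0.25 * 1 + 0.75 * 3 = 2.5
print(global_params[1])  # every entry is 0.25 * 0 + 0.75 * 1 = 0.75
```

Under this assumed rule, a client whose local model scores poorly on the minority (default) class pulls the global model less, which is one plausible reason such a scheme could reduce variance across rounds.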

Description

Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2025

Subject

federated learning, federe öğrenme, credit card, kredi kartı, data, veri
