An empirical investigation on improving fairness testing for machine learning models
Date
2024-02-06
Authors
Karakaş, Umutcan
Publisher
Graduate School
Abstract
Machine learning has become a common part of our lives, and its effects can be seen across sectors such as healthcare, finance, entertainment, and commerce. ML models have thus taken on increasingly critical roles, influencing decisions and shaping experiences. The power of machine learning, however, does not come without challenges, particularly around fairness. Bias in machine learning systems can skew results, leading to inaccuracies or injustices. For instance, a recruitment system might, due to biases in its historical training data, favor one demographic over another, inadvertently perpetuating gender or ethnic disparities. Similarly, a healthcare diagnostic tool might produce unreliable results for certain racial groups if the data it is trained on does not account for diversity. Such examples of unfair model behavior show the crucial need for fairness in these systems.

Previous approaches to improving the fairness of ML models have focused on detecting and correcting data points scattered across the entire feature space, which generally yields unrealistic or extreme cases. This has a flaw: concentrating on extreme data points can cause more common fairness issues to be missed, making the approach less effective. RSFair is a new approach that shifts the focus from unrealistic or extreme cases to more representative and realistic data instances. It aims to detect more common unfair behaviors, on the premise that understanding and removing bias in common scenarios will resolve the majority of fairness problems.

RSFair's methodology employs two primary techniques: Orthogonal Matching Pursuit (OMP) and K-Singular Value Decomposition (K-SVD). These methods sample a representative set of data points from a large dataset while preserving its essential characteristics. OMP reconstructs the dataset by iteratively selecting, from a dictionary, the atom most correlated with the target signal. This dictionary does not include every element of the original dataset; instead, it is a strategic compilation of atoms that, when combined, represent the full scope of the original data. The process can be thought of as recreating the original dataset with minimum error; K-SVD then reduces and optimizes this error by updating the dictionary atoms. K-SVD refines the dictionary through an iterative process, updating it after each cycle so that it becomes an ever more accurate and reliable small-scale mirror of the larger dataset. In RSFair, OMP and K-SVD are not standalone processes but collaborative and complementary: the initial dictionary created with OMP establishes a solid foundation, while the continuous optimization through K-SVD keeps that foundation robust and reflective of the original dataset (a minimal code sketch of this loop follows the research questions below).

In this study, we focus on two main research questions: RQ1. How effective is RSFair in finding discriminatory inputs? RQ2. How useful are the generated test inputs for improving the fairness of the model?
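The following is a minimal, illustrative sketch of the OMP + K-SVD dictionary-learning loop described above, not RSFair's actual implementation: the function name, atom count, and sparsity level are assumptions chosen for readability. It alternates OMP sparse coding with rank-1 SVD atom updates, and the learned atoms act as a compact, representative summary of the dataset.

```python
# Minimal OMP + K-SVD sketch (illustrative; all parameters are assumptions).
import numpy as np
from sklearn.linear_model import orthogonal_mp

def ksvd(X, n_atoms=50, sparsity=5, n_iter=10, seed=0):
    """Learn a dictionary D so that X ~ D @ Gamma with sparse Gamma.

    X has shape (n_features, n_samples); columns are data points.
    """
    rng = np.random.default_rng(seed)
    # Initialize the dictionary with randomly chosen, normalized samples.
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # Sparse coding: OMP selects, per sample, the atoms most
        # correlated with the residual of the target signal.
        Gamma = orthogonal_mp(D, X, n_nonzero_coefs=sparsity)
        # Dictionary update: refine each atom (and its coefficients)
        # via a rank-1 SVD of the residual restricted to its users.
        for k in range(n_atoms):
            users = np.flatnonzero(Gamma[k])
            if users.size == 0:
                continue
            E = (X[:, users] - D @ Gamma[:, users]
                 + np.outer(D[:, k], Gamma[k, users]))
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            Gamma[k, users] = s[0] * Vt[0]
    return D, Gamma
```

The atoms of D (or the original points they best match) can then serve as the representative sample on which the discriminatory-input search is run.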
To address the first question, we used OMP and K-SVD to draw a representative sample for discriminatory-input detection, which enabled a comprehensive comparison of RSFair's performance against the AEQUITAS and random sampling methodologies. For the second question, we used the discriminatory points uncovered during the search phase to improve the fairness of the initial model, repeating the procedure for AEQUITAS, random sampling, and RSFair to compare the outcomes. The introduction of RSFair represents a meaningful advance in efforts to enhance fairness in machine learning: by turning attention away from extreme cases and toward common problems, it becomes possible to better understand how bias influences these systems.
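As a hedged illustration of the evaluation just described, the sketch below shows the standard individual-discrimination check (flipping only the protected attribute and comparing predictions) and the retraining step that folds the discovered discriminatory points back into the training data. All names (is_discriminatory, protected_idx, and so on) are hypothetical, and the model is assumed to follow the scikit-learn fit/predict convention.

```python
# Illustrative sketch (hypothetical names; scikit-learn-style model assumed).
import numpy as np

def is_discriminatory(model, x, protected_idx, protected_values):
    # x witnesses individual discrimination if changing only its
    # protected attribute changes the model's prediction.
    variants = np.tile(x, (len(protected_values), 1))
    variants[:, protected_idx] = protected_values
    return np.unique(model.predict(variants)).size > 1

def retrain(model, X_train, y_train, disc_points, disc_labels):
    # RQ2 step: fold the discriminatory inputs (with consistent labels)
    # back into the training set and refit the initial model.
    X_aug = np.vstack([X_train, disc_points])
    y_aug = np.concatenate([y_train, disc_labels])
    model.fit(X_aug, y_aug)
    return model
```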
Description
Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2024
Keywords
Machine learning,
Software testing