Comparative evaluation of unsupervised fraud detection algorithms with feature extraction and scaling in purchasing domain

dc.contributor.advisor: Ergün, Mehmet Ali
dc.contributor.author: Taşoğlu, Yiğit Can
dc.contributor.authorID: 528211079
dc.contributor.department: Big Data and Business Analytics
dc.date.accessioned: 2024-12-20T08:45:19Z
dc.date.available: 2024-12-20T08:45:19Z
dc.date.issued: 2024-08-21
dc.description: Thesis (M.Sc.) -- İstanbul Technical University, Graduate School, 2024
dc.description.abstract: This research evaluates and compares unsupervised outlier detection methods that do not require labeled data, which makes them suitable for real-world purchasing data where labels are often unavailable. The thesis highlights the challenges of fraud detection in large datasets, particularly in domains such as finance and purchasing, where fraudulent activity can cause significant financial losses if it is not identified early. The research is motivated by the limitations of traditional rule-based detection methods, which often fail to capture complex fraud patterns. Unsupervised algorithms flag anomalies by how far they deviate from the general behavior of the dataset, so they offer a proactive approach that can identify previously unseen fraud patterns. The study applies distance-based, machine learning-based, and feature-based models, and focuses on enhancing them through feature extraction and scaling. Several algorithms, including Local Outlier Factor (LOF), DBSCAN, and Isolation Forest, are evaluated using accuracy, precision, recall, and F1 score. LOF was identified as the most effective model, achieving the highest accuracy and demonstrating a robust ability to detect irregular patterns in the purchasing data. The effectiveness of all algorithms, however, was significantly enhanced by data transformations, particularly scaling. Scaling ensures that features with differing magnitudes, such as quantities and prices, do not distort the results, which allows more accurate anomaly detection. Feature extraction is also important because it helps reveal intricate patterns between data points. Extracted features, such as the frequency of purchase orders, vendor categories, and purchase amounts, provide deeper insight into potential fraud indicators.
The study also recognizes that integrating multiple models can reduce the limitations inherent in individual algorithms and so create a more comprehensive fraud detection framework. By combining different unsupervised methods and leveraging feature extraction, the research offers a more adaptive and reliable approach to identifying fraudulent activity. In conclusion, the study shows that combining unsupervised outlier detection methods with appropriate data preprocessing significantly improves fraud detection in purchasing systems. These methods not only improve accuracy but also help businesses reduce financial risk and improve operational efficiency, supporting a more secure and effective fraud prevention strategy.
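The abstract's point about scaling can be illustrated with a minimal sketch. This is not the thesis's pipeline: the records below are invented, and the score is a simple mean distance to the k nearest neighbours, used here as a stand-in for the distance-based methods named above (it is not LOF, DBSCAN, or Isolation Forest). The function names `zscore_columns`, `knn_scores`, and `most_anomalous` are hypothetical helpers introduced only for this example.

```python
import math
import statistics

# Hypothetical purchase records: (quantity, unit_price). Invented for illustration.
records = [
    (10, 1000.0),
    (12, 1500.0),
    (11, 2000.0),
    (9, 2500.0),
    (10, 3000.0),
    (13, 3500.0),
    (11, 4000.0),
    (60, 2500.0),  # index 7: unusually large quantity at an ordinary price
]


def zscore_columns(rows):
    """Standardize each column to zero mean and unit population std deviation.

    Assumes no column is constant (a zero deviation would divide by zero).
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]


def knn_scores(rows, k=2):
    """Score each row by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, row in enumerate(rows):
        distances = sorted(
            math.dist(row, other) for j, other in enumerate(rows) if j != i
        )
        scores.append(sum(distances[:k]) / k)
    return scores


def most_anomalous(scores):
    """Return the index of the row with the highest outlier score."""
    return max(range(len(scores)), key=scores.__getitem__)


raw_scores = knn_scores(records)
scaled_scores = knn_scores(zscore_columns(records))

print("top outlier without scaling:", most_anomalous(raw_scores))
print("top outlier with scaling:", most_anomalous(scaled_scores))
```

On this toy data, price differences dominate the unscaled distances, so the large-quantity order is not ranked first. After standardization both features contribute comparably and that order receives the highest score.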
dc.description.degree: M.Sc.
dc.identifier.uri: http://hdl.handle.net/11527/25898
dc.language.iso: en_US
dc.publisher: Graduate School
dc.sdg.type: Goal 9: Industry, Innovation and Infrastructure
dc.subject: data analysis
dc.subject: veri analizi
dc.subject: machine learning
dc.subject: makine öğrenmesi
dc.subject: big data
dc.subject: büyük veri
dc.title: Comparative evaluation of unsupervised fraud detection algorithms with feature extraction and scaling in purchasing domain
dc.title.alternative: Satın alma alanında özellik çıkarma ve ölçekleme ile denetimsiz sahtekarlık tespit algoritmalarının karşılaştırmalı değerlendirmesi
dc.type: Master Thesis

Files

Original bundle

Name: 528211079.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format