Publication:
Identification of key gene pathways in Alzheimer's disease using machine learning methods on gene expression data

Publisher

ITU Graduate School

Abstract

Alzheimer's disease (AD) is a progressive neurodegenerative disorder whose complex molecular mechanisms complicate early diagnosis and the development of effective treatment strategies. This study evaluates machine learning-based classification and pathway-level biological interpretation within an integrated analytical framework, using transcriptomic gene expression data associated with AD. The main objective is not only to achieve high classification performance but also to present disease-related biological processes in an interpretable manner. To this end, Random Forest (RF), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP) models were applied to gene expression data obtained from different tissue and sample types. To improve the models' generalizability and reduce the risk of overfitting, a nested cross-validation structure with multiple iterations was used, and hyperparameters were optimized by Bayesian optimization with the Optuna framework. Model performance was evaluated using multiple metrics, including accuracy, AUC, precision, sensitivity, and F1-score. In the biological interpretation phase, gene expression data were analyzed at the pathway level using Hallmark gene sets, and published studies and results based on GSEA and GSVA were reviewed. Key biological processes associated with AD were investigated using gene set overlap analyses and visualization methods. The findings indicate that inflammatory responses, regulation of blood-brain barrier crossing, insulin metabolism, and cellular stress responses play a prominent role in the pathophysiology of Alzheimer's disease. Furthermore, interpretability was strengthened by evaluating the model-based results for biological consistency and agreement with the literature.
In conclusion, this study demonstrates that machine learning (ML) methods can provide not only predictive power but also biologically plausible and reliable pathway-level inferences from AD transcriptomic data. The proposed integrated analysis approach offers a methodological reference for future studies on biomarker discovery and the elucidation of disease mechanisms.
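
As an illustration of the gene set overlap analysis mentioned above, the following is a minimal sketch using only the Python standard library. It is not the thesis's implementation: the abstract does not say which overlap statistic was used, so the Jaccard index and the one-sided hypergeometric test are assumptions, and the gene identifiers (`GENE01` ... `GENE20`) are placeholders rather than real Hallmark gene set members.

```python
from math import comb


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hypergeom_overlap_pvalue(n_universe: int, n_set: int,
                             n_selected: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_set belong to the gene set."""
    upper = min(n_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy data (placeholders, not real genes or results).
universe = {f"GENE{i:02d}" for i in range(1, 21)}             # 20 measured genes
gene_set = {f"GENE{i:02d}" for i in range(1, 6)}              # hypothetical pathway, 5 genes
selected = {"GENE01", "GENE02", "GENE03", "GENE10"}           # hypothetical model-selected genes

overlap = gene_set & selected
jaccard_index = jaccard(gene_set, selected)
p_value = hypergeom_overlap_pvalue(
    len(universe), len(gene_set), len(selected), len(overlap)
)

print(f"overlap={sorted(overlap)} jaccard={jaccard_index:.2f} p={p_value:.4f}")
```

In practice, such a test would be repeated for every gene set in the collection, and the resulting p-values would need a multiple-testing correction before interpretation.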

Description

Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2026

Subject

industrial engineering, Alzheimer's disease, machine learning, gene expression
