Novel centrality, topology and hierarchical-aware link prediction in dynamic networks

Date
2023-09-05
Authors
Sserwadda, Abubakhari
Publisher
Graduate School
Abstract
The increasing availability of social network data has spurred research on problems arising in social network applications. However, the scale and complexity of the relationships among social network entities make predicting links between them a challenging task. Previous research often focuses primarily on local node connectivity while ignoring other important network-characterizing properties. The properties most often overlooked include network topology, the structural centrality roles of nodes, and network hierarchical information. Furthermore, although many real-world graphs change over time, several works assume static networks. To overcome these challenges, we first compute several topological similarity-based convolution feature matrices using topological similarity metrics such as Common Neighbours, the Jaccard Index, Adamic-Adar, the Salton Index, Resource Allocation, and the Sørensen Index. We then use the resulting topological feature matrices to capture the topological information in the input graphs efficiently. Second, we leverage strength centrality, a stronger variant of node degree, to preserve node centrality and structural connectivity information in the network. In addition, we systematically aggregate these diverse features to yield high-quality, higher-level feature representations. Lastly, we use an LSTM layer to capture the temporal information in the graph sequences. To learn low-dimensional node representations, we first deployed a fully connected variational autoencoder that efficiently explores variations in the input graphs to learn high-quality node embeddings. Furthermore, we imposed centrality and topological constraints on the learning model to further enforce the preservation of the centrality and topological information of the input graphs in the learned embeddings.
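As an illustration of the six similarity indices named above, the following minimal sketch computes them for one node pair using their standard textbook definitions. It assumes an undirected, unweighted graph without self-loops; the function names, the adjacency-set representation, and the toy graph are hypothetical and are not taken from the thesis, which may define or normalise these matrices differently.

```python
import math

def similarity_scores(adj, u, v):
    """Return six neighbourhood-based similarity scores for the pair (u, v).

    adj maps each node to the set of its neighbours
    (assumed undirected, unweighted, no self-loops).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    du, dv = len(nu), len(nv)
    return {
        "common_neighbours": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbour of two distinct nodes has degree >= 2, so log > 0.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "salton": len(common) / math.sqrt(du * dv) if du and dv else 0.0,
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
        "sorensen": 2 * len(common) / (du + dv) if du + dv else 0.0,
    }

def similarity_matrix(adj, nodes, metric):
    """Build a dense |V| x |V| matrix (list of lists) for one metric.

    The diagonal is set to 0.0 here; that is a choice made for this sketch.
    """
    return [
        [similarity_scores(adj, u, v)[metric] if u != v else 0.0 for v in nodes]
        for u in nodes
    ]

# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
scores = similarity_scores(adj, "a", "d")
jaccard = similarity_matrix(adj, ["a", "b", "c", "d"], "jaccard")
```

In this toy graph, nodes a and d are not linked but share both of their neighbours, so every index scores the pair highly, which is the signal a similarity-based link predictor relies on.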
However, variational autoencoders have large computational time and memory requirements due to the large number of parameters in their fully connected encoders and decoders, especially when applied to large networks. To extend our implementations to large datasets while minimizing computational time and memory requirements, we adopted a Graph Convolutional Network (GCN)-based implementation. The proposed structural and topological geometric deep learning approach was evaluated on five real-world temporal social networks. Based on the experimental results, on average, the proposed models yield a 4% improvement in link prediction AUC, a small increase in per-epoch training time (0.2 s, or 10%), and a 56% reduction in centrality prediction MSE compared to the best benchmarks. The proposed end-to-end centrality- and topology-guided link prediction framework for dynamic networks not only preserves the centrality roles of nodes and the topological information in the learned embeddings but also captures the temporal information in the dynamic networks. The models use node centrality and topological features to capture and conserve the network topology and the structural roles of nodes during embedding learning, thus obtaining high-quality embeddings that improve link prediction and centrality prediction accuracy. For all proposed methods, we assess the impact of each module by comparing the models with variants that lack it, and we present and explain the results accordingly. In related work, we introduce a Hierarchical and Centrality-aware Polypharmacy Side Effect Prediction (HC-POSE) model. We model side effect prediction as a link prediction task and leverage core decomposition to explore the hierarchical information in the heterogeneous protein-protein, protein-drug, and drug-drug interaction datasets.
Following k-core decomposition, a node strength matrix is computed for each k-core subgraph to store its centrality information. We then systematically aggregate the resulting centrality information with the k-core adjacency matrix to obtain diverse higher-level feature representations. We deployed a GCN-based autoencoder to learn low-dimensional representations for the homogeneous subgraphs and an RCGN-based autoencoder for the heterogeneous subgraphs. Based on the experimental results, HC-POSE achieved a 3% accuracy improvement in polypharmacy side effect prediction compared to the best baseline.
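The k-core and node strength steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is undirected and weighted, core numbers are obtained by the standard minimum-degree peeling procedure, and node strength is taken as the sum of incident edge weights. The function names and the toy graph are hypothetical, and the way the thesis aggregates strength with the k-core adjacency matrix is not specified in the abstract, so it is not shown.

```python
def core_numbers(adj):
    """Core number of every node via iterative minimum-degree peeling.

    adj maps node -> {neighbour: weight}; weights are ignored for this step.
    """
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree.
        u = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core

def k_core_subgraph(adj, core, k):
    """Subgraph induced by the nodes whose core number is at least k."""
    keep = {u for u in adj if core[u] >= k}
    return {u: {v: w for v, w in adj[u].items() if v in keep} for u in keep}

def node_strength(adj):
    """Strength of each node: the sum of the weights of its incident edges."""
    return {u: sum(nbrs.values()) for u, nbrs in adj.items()}

# Hypothetical toy graph: a weighted triangle a-b-c plus a pendant node d on c.
adj = {
    "a": {"b": 2.0, "c": 1.0},
    "b": {"a": 2.0, "c": 3.0},
    "c": {"a": 1.0, "b": 3.0, "d": 0.5},
    "d": {"c": 0.5},
}
core = core_numbers(adj)
two_core = k_core_subgraph(adj, core, 2)
strength_two_core = node_strength(two_core)
```

Here the pendant node d has core number 1 and drops out of the 2-core, so the strength of c inside the 2-core no longer counts the c-d edge. Computing strength per core in this way gives each hierarchy level its own centrality profile.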
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2023
Keywords
Networks