Publication:
Efficient online comparison and visualization of high throughput genomic variant lists

Publisher

ITU Graduate School

Abstract

In recent years, the proliferation of high-throughput sequencing has produced large amounts of genetic data, of which variant data is among the most significant. Comparing and visualizing variant data are common operations; however, no existing tools address this need, and each operation must currently be performed with specialized scripts. A graphical interface that facilitates these operations is therefore highly valuable, especially for users who are not comfortable working with code and the command line. This thesis presents a user-friendly web application for comparing and visualizing genetic variants. The application provides functionality absent from the literature and allows users to gain insight into their data. Because obtaining variant data is a complex process, it is valuable to compare results produced by different methods of raw data generation and processing. The presented tool supports comparing numerous files with one another individually as well as collectively. Benchmarking against user-provided ground truth files is also supported. Because merging files of differing origin can be beneficial, files can be grouped based on user-defined metadata. Analyses often need to be limited to regions of interest in a genome, so filtering by genomic regions and chromosomes is provided. An efficient genomic interval-based filtering algorithm is presented and described. The application was developed in Python 3 using the Plotly Dash library for web development, which combines Flask and React to produce efficient data analysis web applications. It is deployed on a server provided by Istanbul Technical University and is freely accessible at https://bioinformatics.itu.edu.tr/vcf-observer.
Case studies examining results from quality control and reproducibility studies are provided in detail, along with relevant visualizations produced using the application. Various filtering and grouping parameters are investigated, and the performance of different data production methodologies is described using results obtained from the application. During the first 4 months of 2025, over 90 unique users from over 20 different countries uploaded data to the application. It provides novel functionality through a user-friendly interface, making variant data exploration accessible to researchers and clinicians.
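
The abstract mentions interval-based filtering of variants by genomic region but does not describe the algorithm itself. As an illustration only, the following minimal Python sketch shows one common way such filtering can be done: regions are merged per chromosome and each variant position is located by binary search. It is not the thesis's algorithm. The function names (`build_index`, `in_regions`, `filter_variants`), the tuple layouts, and the choice of 1-based closed intervals are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Build a per-chromosome lookup from (chrom, start, end) regions.

    Assumption: intervals are 1-based and closed on both ends.
    Overlapping intervals are merged so each chromosome maps to sorted,
    non-overlapping (starts, ends) lists.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlaps the previous interval: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_regions(index, chrom, pos):
    """Return True if pos lies inside any indexed interval on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Rightmost interval whose start is <= pos, found by binary search.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and pos <= ends[i]


def filter_variants(variants, regions):
    """Keep variants whose (chrom, pos) falls in one of the regions.

    Assumption: each variant is a tuple beginning with (chrom, pos, ...).
    """
    index = build_index(regions)
    return [v for v in variants if in_regions(index, v[0], v[1])]
```

With the index built once, each lookup costs a binary search over the merged intervals of a single chromosome, so filtering n variants against m regions takes roughly O((n + m) log m) time rather than the O(n · m) of a pairwise scan.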

Description

Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2025

Subject

bioinformatics, genetic data, genetic engineering, genes, data visualization
