Publication:
The sbv IMPROVER Systems Toxicology computational challenge: Identification of human and species-independent blood response markers as predictors of smoking exposure and cessation status

Abstract

Cigarette smoking entails chronic exposure to a mixture of harmful chemicals that trigger molecular changes over time, and is known to increase the risk of developing diseases. Risk assessment in the context of 21st century toxicology relies on the elucidation of mechanisms of toxicity and the identification of exposure response markers, usually from high-throughput data, using advanced computational methodologies. The sbv IMPROVER Systems Toxicology computational challenge (Fall 2015 to Spring 2016) aimed to evaluate whether robust and sparse (≤40 genes) human (sub-challenge 1, SC1) and species-independent (sub-challenge 2, SC2) exposure response markers (so-called gene signatures) could be extracted from human and mouse blood transcriptomics data of current (S), former (FS) and never (NS) smoke-exposed subjects as predictors of smoking and cessation status. Best-performing computational methods were identified by scoring anonymized participants' predictions. Worldwide participation resulted in 12 (SC1) and 6 (SC2) final submissions that qualified for scoring. The results showed that blood gene expression data were informative for predicting smoking exposure status (i.e., discriminating smokers from never or former smokers) in humans and across species with a high level of accuracy. By contrast, the prediction of cessation status (i.e., distinguishing FS from NS) remained challenging, as reflected by lower classification performance. Participants successfully developed inductive predictive models and extracted human and species-independent gene signatures, including genes with high consensus across teams. Post-challenge analyses highlighted "feature selection" as a key step in the process of building a classifier and confirmed the importance of testing a gene signature in independent cohorts to ensure the generalized applicability of a predictive model at a population-based level.
In conclusion, the Systems Toxicology challenge demonstrated the feasibility of extracting a consistent blood-based smoke exposure response gene signature and further stressed the importance of independent and unbiased data and method evaluations to provide confidence in systems toxicology-based scientific conclusions.
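
The workflow summarized in the abstract (select a sparse gene signature, build a classifier on it, then test it on an independent cohort) can be illustrated with a minimal sketch. This is not the method of any challenge participant: the data are synthetic, the ranking score and nearest-centroid classifier are simple stand-ins chosen for illustration, and every function and variable name below is hypothetical. Only the 40-gene sparsity limit comes from the abstract.

```python
import random
from statistics import mean, pstdev

MAX_SIGNATURE_SIZE = 40  # sparsity limit stated in the abstract (<= 40 genes)


def make_cohort(n_per_class, n_genes, informative, shift, rng):
    """Simulate a toy expression matrix (rows: subjects, columns: genes).

    Label 1 = smoker, 0 = non-smoker. Only the `informative` gene indices
    are shifted in smokers. All values are synthetic (assumption).
    """
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in informative:
                    row[g] += shift
            samples.append(row)
            labels.append(label)
    return samples, labels


def select_signature(samples, labels, k):
    """Feature selection: rank genes by a standardized mean difference, keep top k."""
    if k > MAX_SIGNATURE_SIZE:
        raise ValueError("signature exceeds the 40-gene sparsity limit")
    scores = []
    for g in range(len(samples[0])):
        pos = [s[g] for s, y in zip(samples, labels) if y == 1]
        neg = [s[g] for s, y in zip(samples, labels) if y == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-9  # avoid division by zero
        scores.append((abs(mean(pos) - mean(neg)) / spread, g))
    return sorted(g for _, g in sorted(scores, reverse=True)[:k])


def fit_centroids(samples, labels, signature):
    """Compute one mean expression profile per class, restricted to the signature."""
    centroids = {}
    for label in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == label]
        centroids[label] = [mean(r[g] for r in rows) for g in signature]
    return centroids


def predict(sample, signature, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((sample[g] - c) ** 2 for g, c in zip(signature, centroids[label]))
    return min(centroids, key=dist)


def accuracy(samples, labels, signature, centroids):
    hits = sum(predict(s, signature, centroids) == y for s, y in zip(samples, labels))
    return hits / len(labels)


rng = random.Random(0)
INFORMATIVE = list(range(10))  # hypothetical "true" marker genes in the simulation
train_x, train_y = make_cohort(30, 200, INFORMATIVE, 2.0, rng)
test_x, test_y = make_cohort(30, 200, INFORMATIVE, 2.0, rng)  # independent cohort

# The signature and centroids are derived from the training cohort only.
signature = select_signature(train_x, train_y, k=10)
centroids = fit_centroids(train_x, train_y, signature)
test_accuracy = accuracy(test_x, test_y, signature, centroids)
print(len(signature), test_accuracy)
```

The point of the design is that the independent cohort plays no part in choosing the genes or fitting the classifier, so its accuracy is an unbiased check of how well the signature generalizes.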

Subject

Smoking biomarker, biological marker, Article, blood gene signature, blood sampling, CDKN1C gene, CLEC10A gene, computational fluid dynamics, computer model, consensus, DNA methylation, DSC2 gene, feasibility study, FSTL1 gene, gene expression, gene expression assay, gene mapping, genetic marker, GPR63 gene, GSE1 gene, GUCY1A3 gene, human, human versus animal comparison, immunity, nonhuman, priority journal, RNA hybridization, RNA isolation, scoring system, SEMA6B gene, sequence homology, smoking, smoking cessation, smoking exposure, support vector machine, training, transcriptomics, unindexed sequence, Blood biomarkers, Computational challenge, Gene signature, Systems toxicology
