Identification of 13 blood-based gene expression signatures to accurately distinguish tuberculosis from other pulmonary diseases and healthy controls
Issue title: Frontiers in Biomedical Engineering and Biotechnology – Proceedings of the 4th International Conference on Biomedical Engineering and Biotechnology, 18–21 August 2015, Shanghai, China
Tuberculosis (TB), caused by infection with Mycobacterium tuberculosis, remains a major threat to human health worldwide. Current diagnostic methods have limitations, such as difficulties in sample collection and unsatisfactory sensitivity and specificity. Moreover, it is hard to distinguish TB from some other lung diseases without an invasive biopsy. In this paper, logistic models with three representative regularization approaches, namely Lasso (the most popular regularization method), L1/2 (which tends to yield sparser solutions than Lasso), and Elastic Net (which encourages a grouping effect among the selected genes), are used together to select common gene signatures from microarray data of peripheral blood cells. As a result, 13 common gene signatures were selected, and a classifier based on them was then constructed with a support vector machine (SVM); it accurately distinguishes tuberculosis from other pulmonary diseases and healthy controls. On the test and validation datasets of blood gene expression profiles, the classification model achieved, on average, 91.86% sensitivity and 93.48% specificity. Compared with recent research results, its sensitivity is 6% higher while it uses only 26% of the number of gene signatures. The 13 gene signatures selected by our method can serve as the basis of a blood-based test for distinguishing TB from other pulmonary diseases and healthy controls.
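As an illustration only, the consensus step and the reported metrics can be sketched in a few lines of Python. The sketch assumes that "common" means genes selected by all three regularized models (a set intersection), and that sensitivity and specificity follow their standard definitions with TB as the positive class. The gene identifiers, labels, and function names below are hypothetical placeholders, not data or code from the study, and the model fitting itself is not shown.

```python
from typing import Iterable, Sequence, Set, Tuple


def common_signatures(selections: Iterable[Set[str]]) -> Set[str]:
    """Return the genes selected by every method (set intersection)."""
    selections = list(selections)
    if not selections:
        return set()
    return set.intersection(*selections)


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Label 1 marks the positive class (assumed here to be TB), label 0 the rest.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical placeholder gene IDs, standing in for the genes with
# non-zero coefficients under each penalty.
lasso_genes = {"G1", "G2", "G3", "G5"}
l_half_genes = {"G1", "G3"}
elastic_net_genes = {"G1", "G2", "G3", "G4"}

common = common_signatures([lasso_genes, l_half_genes, elastic_net_genes])
print(sorted(common))

# Hypothetical true and predicted labels for nine samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

Intersecting the three selections is a conservative choice: a gene is kept only if every penalty retains it, which is one plausible reading of how the 13 common signatures were obtained.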