Vector-G: Multi-Modular SVM-Based Heterotrimeric G Protein Prediction
Article type: Research Article
Authors: Jain, Preti | Wadhwa, Puneet | Aygun, Ramazan | Podila, Gopi
Affiliations: Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL 35899, USA | Computer Science Department, University of Alabama in Huntsville, Huntsville, AL 35899, USA
Note: Corresponding author: Ramazan Aygun, Department of Computer Sciences, Technology Hall N360, University of Alabama in Huntsville, Huntsville, AL 35899, USA. E-mail: raygun@cs.uah.edu
Abstract: Heterotrimeric G proteins interact with G protein-coupled receptors in response to stimulation by hormones, neurotransmitters, chemokines, and sensory signals, and relay these signals to intracellular signaling cascades. Recent studies indicate that G protein subunits play a significant role in a range of eukaryotic diseases and processes, including inflammation, neurological diseases, cardiovascular diseases, and endocrine disorders, as well as plant pathogen response and the infectious hyphal growth, differentiation, and virulence of pathogenic fungi. A study of their functions, signaling pathways, and protein interactions may therefore lead to the development of various preventive approaches. The diversity of the α, β, and γ subunits of G proteins necessitates a prediction algorithm that helps identify new proteins, such as Gβ subunits in which the WD-40 repeats are not well characterized. The currently available techniques for finding G proteins are homology-based search analyses and wet lab experiments, which are not very effective in finding new classes of proteins. We present a robust computational method for finding new G proteins and their homologs using an SVM-based pattern recognition algorithm. Several physicochemical and compositional properties, including dipeptide, tripeptide, and hydrophobicity composition, are used to generate the SVM classifiers. The method achieves sensitivities of 96.17%, 95.38%, and 97.6% and specificities of 99.45%, 100%, and 100% on test sets for the G protein α, β, and γ subunits, respectively. The algorithm correctly predicts the known α, β, and γ subunits reported in the literature. An important contribution of this algorithm is that it improves genome annotation by identifying several proteins as G proteins, and it serves as a useful tool for comparative genomic analysis of G proteins. Using this method, novel G protein subunits are predicted in 31 genomes spanning the plant, fungal, and animal kingdoms.
The software is available at http://biomine.cs.uah.edu/bioinformatics/svm_prog/scripts/GProteins/vectorg.html. Supplementary files are available at http://www.bioinfo.de/isb/2008/08/0013/supplementary_material/.
Keywords: Heterotrimeric G proteins, SVM, compositional properties, signal transduction
Journal: In Silico Biology, vol. 8, no. 2, pp. 141-155, 2008
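
Illustrative sketch (not the authors' code): the abstract names dipeptide composition as one of the features used to build the SVM classifiers and reports results as sensitivity and specificity. The Python below shows one common way to compute both. The function names, the 20-letter amino acid alphabet, the normalization by the number of valid dipeptides, and the skipping of non-standard residues are all assumptions made for illustration; the abstract does not specify these details. Sensitivity and specificity use the standard definitions TP/(TP+FN) and TN/(TN+FP).

```python
from itertools import product

# Assumption: the 20 standard amino acids; the abstract does not state the alphabet used.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 features


def dipeptide_composition(sequence):
    """Return a 400-element vector of dipeptide frequencies for a protein sequence.

    Hypothetical helper: overlapping pairs containing a non-standard residue
    are skipped, and counts are divided by the number of valid pairs.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) from binary labels, where 1 = positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

A vector produced this way could be passed, together with labels, to any SVM implementation; a tripeptide variant would follow the same pattern with `repeat=3`, giving 8000 features.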