Affiliations: Department of Chemical and Biomolecular Engineering,
National University of Singapore, Singapore 117576, Singapore | Synthetic Biology Team, RIKEN Genomic Sciences Center,
Tsurumi-ku, Yokohama 2300045, Japan | Bioinformatics and Systems Engineering Team, RIKEN,
Tsurumi-ku, Yokohama 2300045, Japan
Abstract: The computational prediction of protein-protein interactions (PPI)
is an essential complement to direct experimental evidence. Traditional
approaches rely on surface properties that are either sparsely available or
computationally predicted, show database-specific performance, and are
computationally expensive for large-scale datasets. Several sensitivity and
specificity issues also remain. Here, we report a novel method based on
'Amino-acid Residue Associations' (ARA) among interacting proteins, which uses
the accurate and easily available primary sequence. Large-scale PPI datasets for six model
species (from E. coli to human) were studied. The ARA method shows up to
73% sensitivity and 78% specificity. Furthermore, the method performs
remarkably well in terms of stability and generalizability. Benchmarked against
existing prediction techniques, the ARA method shows a performance improvement
of up to 25%. The ability of the ARA method to predict PPI across species
and across databases is also demonstrated. Overall, the ARA method provides a
significant improvement over existing methods in correctly identifying large-scale
protein-protein interactions, irrespective of the data resource, network size
or organism.