This article is part of the supplement: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2010
A binary matrix factorization algorithm for protein complex prediction
* Corresponding author: Lei Xu lxu@cse.cuhk.edu.hk
- Equal contributors
1 Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
2 Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101
Proteome Science 2011, 9(Suppl 1):S18 doi:10.1186/1477-5956-9-S1-S18
Published: 14 October 2011
Abstract
Background
Identifying biologically relevant protein complexes from a large protein-protein interaction (PPI) network is essential to understanding the organization of biological systems. However, high-throughput experimental techniques that produce large numbers of PPIs are known to yield non-negligible rates of false positives and false negatives, making protein complexes difficult to identify.
Results
We propose a binary matrix factorization (BMF) algorithm under Bayesian Ying-Yang (BYY) harmony learning to detect protein complexes by clustering proteins that share similar interactions, through factorizing the binary adjacency matrix of a PPI network. The proposed BYY-BMF algorithm determines the cluster number automatically, whereas most existing BMF algorithms require this number to be given in advance. Also, BYY-BMF’s clustering results do not depend on any parameters or thresholds, unlike the Markov Cluster Algorithm (MCL), which relies on a so-called inflation parameter. On synthetic PPI networks, predictions evaluated against the known annotated complexes indicate that BYY-BMF is more robust than MCL in most cases. On real PPI networks from the MIPS and DIP databases, BYY-BMF obtains better-balanced prediction accuracies than MCL and a spectral analysis method, while MCL has its own advantages, e.g., good separation values.
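To make the idea of factorizing a binary adjacency matrix concrete, the following minimal Python sketch builds the adjacency matrix of a toy PPI network and measures how well a given binary cluster-membership matrix reproduces it. This is not the BYY-BMF algorithm: it does not learn the memberships or the cluster number. The Boolean product form Z Z^T, the toy proteins and interactions, and the function names `adjacency_matrix`, `reconstruct`, and `mismatches` are all assumptions introduced only for illustration.

```python
def adjacency_matrix(proteins, interactions):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    A = [[0] * n for _ in range(n)]
    for a, b in interactions:
        i, j = index[a], index[b]
        A[i][j] = A[j][i] = 1
    return A


def reconstruct(Z):
    """Boolean product Z Z^T with a zero diagonal (an assumed model form).

    Entry (i, j) is 1 when two distinct proteins share at least one cluster.
    """
    n, k = len(Z), len(Z[0])
    return [
        [int(i != j and any(Z[i][c] and Z[j][c] for c in range(k)))
         for j in range(n)]
        for i in range(n)
    ]


def mismatches(A, B):
    """Count unordered protein pairs on which two binary matrices disagree."""
    n = len(A)
    return sum(A[i][j] != B[i][j] for i in range(n) for j in range(i + 1, n))


# Hypothetical toy network: a triangle (P1, P2, P3), a pair (P4, P5),
# and one interaction (P3, P4) linking the two groups.
proteins = ["P1", "P2", "P3", "P4", "P5"]
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
                ("P4", "P5"), ("P3", "P4")]
A = adjacency_matrix(proteins, interactions)

# Hypothetical binary membership matrix: rows are proteins, columns are clusters.
Z = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]]

# Only the cross-group interaction (P3, P4) is left unexplained.
error = mismatches(A, reconstruct(Z))
```

With these toy memberships, the only disagreement is the single interaction between the two groups. A factorization method would search for the binary matrix that minimizes such a discrepancy (or a probabilistic counterpart of it) rather than take it as given.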