Proteome Science

This article is part of the supplement: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2010

Open Access Proceedings

A binary matrix factorization algorithm for protein complex prediction

Shikui Tu1, Runsheng Chen2 and Lei Xu1*

Author Affiliations

1 Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong

2 Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101

Proteome Science 2011, 9(Suppl 1):S18 doi:10.1186/1477-5956-9-S1-S18

Published: 14 October 2011

Abstract

Background

Identifying biologically relevant protein complexes from a large protein-protein interaction (PPI) network is essential to understanding the organization of biological systems. However, high-throughput experimental techniques that produce large numbers of PPIs are known to yield non-negligible rates of false positives and false negatives, which makes protein complexes difficult to identify.

Results

We propose a binary matrix factorization (BMF) algorithm under Bayesian Ying-Yang (BYY) harmony learning to detect protein complexes. It clusters proteins that share similar interactions by factorizing the binary adjacency matrix of a PPI network. The proposed BYY-BMF algorithm determines the number of clusters automatically, whereas most existing BMF algorithms require this number to be given in advance. In addition, BYY-BMF’s clustering results do not depend on any parameters or thresholds, unlike the Markov Cluster Algorithm (MCL), which relies on a so-called inflation parameter. On synthetic PPI networks, predictions evaluated against the known annotated complexes indicate that BYY-BMF is more robust than MCL in most cases. On real PPI networks from the MIPS and DIP databases, BYY-BMF achieves better-balanced prediction accuracies than MCL and a spectral analysis method, while MCL has its own advantages, e.g., good separation values.
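
To make the idea of factorizing a binary adjacency matrix concrete, the following is a minimal sketch in Python. It is not the BYY-BMF algorithm: the cluster membership matrix `A` is supplied by hand rather than learned, and the reconstruction rule (two proteins are predicted to interact if they share at least one cluster) is an assumption chosen for illustration. The protein names, the toy edge list, and the helper functions `adjacency_matrix`, `reconstruct`, and `mismatches` are all hypothetical.

```python
def adjacency_matrix(proteins, interactions):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    X = [[0] * n for _ in range(n)]
    for a, b in interactions:
        i, j = index[a], index[b]
        X[i][j] = X[j][i] = 1
    return X


def reconstruct(A):
    """Boolean product of A with its transpose, with the diagonal set to zero.

    Assumption for this sketch: proteins i and j are predicted to interact
    when they are assigned to at least one common cluster.
    """
    n, k = len(A), len(A[0])
    return [
        [int(i != j and any(A[i][c] and A[j][c] for c in range(k)))
         for j in range(n)]
        for i in range(n)
    ]


def mismatches(X, Y):
    """Count undirected protein pairs on which two symmetric matrices disagree."""
    total = sum(x != y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))
    return total // 2  # each undirected pair appears twice


# Toy network (hypothetical): two fully connected complexes plus one
# spurious interaction P3-P4, standing in for a false positive.
proteins = ["P1", "P2", "P3", "P4", "P5", "P6"]
interactions = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
    ("P4", "P5"), ("P4", "P6"), ("P5", "P6"),
    ("P3", "P4"),
]
X = adjacency_matrix(proteins, interactions)

# Hand-picked binary membership matrix: rows are proteins, columns are clusters.
A = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]

X_hat = reconstruct(A)
print(mismatches(X, X_hat))  # prints 1: only the spurious P3-P4 edge is unexplained
```

In this toy case the two-cluster membership matrix explains every interaction except the spurious one, which illustrates why a good binary factorization can group proteins into complexes while tolerating noisy edges. Choosing `A` and the number of its columns automatically is the part that a learning algorithm such as BYY-BMF addresses.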