Complexes Detection in Biological Networks via Diversified Dense Subgraphs Mining

Ma, Xiuli; Zhou, Guangyu; Wang, Jingjing; Peng, Jian; Han, Jiawei

Abstract: Protein-protein interaction (PPI) networks, providing a comprehensive landscape of protein interaction patterns, enable us to explore biological processes and cellular components at multiple resolutions. For a biological process, a number of proteins need to work together to perform the job. Proteins densely interact with each other, forming large molecular machines or cellular building blocks. Identification of such densely interconnected clusters or protein complexes from PPI networks enables us to obtain a better understanding of the hierarchy and organization of biological processes and cellular components. Most existing methods apply efficient graph clustering algorithms to PPI networks and often fail to detect possible densely connected subgraphs and overlapping subgraphs. Besides clustering-based methods, dense subgraph enumeration methods, which aim to find all densely connected protein sets, have also been used. However, such methods are not practically tractable even on a small yeast PPI network, due to their high computational complexity. In this paper, we introduce a novel approximate algorithm to efficiently enumerate putative protein complexes from biological networks. The key insight of our algorithm is that we do not need to enumerate all dense subgraphs. Instead, we only need to find a small subset of subgraphs that cover as many proteins as possible. We formulate the problem as finding a diverse set of dense subgraphs, for which we develop highly effective pruning techniques to guarantee efficiency. To handle large networks, we take a divide-and-conquer approach to speed up the algorithm in a distributed manner. Comparing against existing clustering and dense subgraph-based algorithms on several human and yeast PPI networks, we demonstrate that our method detects more putative protein complexes and achieves better prediction accuracy.
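
The idea of selecting a small, diverse set of dense subgraphs that covers many proteins can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a list of candidate protein sets is already available, uses plain edge density with a hypothetical threshold `min_density`, and substitutes a standard greedy maximum-coverage heuristic for the paper's pruning and divide-and-conquer techniques, which the abstract does not specify. The names `density`, `select_diverse`, `adj` and `candidates` are illustrative only.

```python
from itertools import combinations


def density(nodes, adj):
    """Edge density of the subgraph induced by `nodes` (1.0 for a clique)."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj.get(u, set()))
    return edges / (n * (n - 1) / 2)


def select_diverse(candidates, adj, k, min_density=0.7):
    """Greedily pick up to k dense candidates, each adding the most new proteins.

    Assumption: `min_density` is an illustrative threshold, not a value
    taken from the paper.
    """
    dense = [frozenset(c) for c in candidates if density(c, adj) >= min_density]
    covered, chosen = set(), []
    while len(chosen) < k:
        # Candidate that covers the largest number of not-yet-covered proteins.
        best = max(dense, key=lambda c: len(c - covered), default=None)
        if best is None or not (best - covered):
            break  # nothing left that adds new coverage
        chosen.append(best)
        covered |= best
    return chosen, covered


# Toy undirected network: two triangles sharing protein "C".
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D", "E"},
    "D": {"C", "E"},
    "E": {"C", "D"},
}
candidates = [{"A", "B", "C"}, {"A", "B"}, {"C", "D", "E"}, {"A", "B", "C", "D"}]
chosen, covered = select_diverse(candidates, adj, k=2)
```

On this toy network the candidate `{"A", "B", "C", "D"}` falls below the density threshold and `{"A", "B"}` adds no new proteins once the first triangle is chosen, so the two overlapping triangles are selected. Overlap through the shared protein is allowed, which clustering methods that partition the network would not permit.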

Subjects:	Molecular Networks (q-bio.MN); Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:1604.03244 [q-bio.MN]
	(or arXiv:1604.03244v1 [q-bio.MN] for this version)
