
Title: Bayesian Bi-clustering Methods with Applications in Computational Biology

Abstract: Bi-clustering is a useful approach to analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach to tackling bi-clustering problems in moderate to high dimensions, and propose three Bayesian bi-clustering models for categorical data, which increase in complexity in how they model the distributions of features across bi-clusters. Our proposed methods apply to a wide range of scenarios: from situations where clusters are distinguishable only on a small subset of features and are masked by a large amount of noise, to situations where different groups of data are identified by different sets of features or the data exhibit hierarchical structure. Through simulation studies, we show that our methods outperform existing (bi-)clustering methods both in identifying clusters and in recovering feature distributional patterns across bi-clusters. We apply our methods to two genetic datasets, though the methods are applicable more broadly.
Subjects: Applications (stat.AP)
Cite as: arXiv:2007.06136 [stat.AP]
  (or arXiv:2007.06136v2 [stat.AP] for this version)

Submission history

From: Han Yan
[v1] Mon, 13 Jul 2020 00:11:45 GMT (2529kb,D)
[v2] Tue, 9 Feb 2021 23:56:24 GMT (3420kb,D)