Title: MM-PCA: Integrative Analysis of Multi-group and Multi-view Data

Abstract: Data integration is the problem of combining multiple data groups (studies, cohorts) and/or multiple data views (variables, features). This task is becoming increasingly important in many disciplines due to the prevalence of large and heterogeneous data sets. Data integration commonly aims to identify structure that is consistent across multiple cohorts and feature sets. While such joint analyses can strengthen the information available from single data sets, a globally restrictive integration of heterogeneous data may also obscure signals of interest.
Here, we therefore propose a data-adaptive integration method that allows structure in the data to be shared across an a priori unknown subset of cohorts and views. The method, Multi-group Multi-view Principal Component Analysis (MM-PCA), identifies partially shared, sparse low-rank components. This also results in an integrative bi-clustering across cohorts and views. The strengths of MM-PCA are illustrated on simulated data and on 'omics data from The Cancer Genome Atlas. MM-PCA is available as an R package.
Key words: Data integration, Multi-view, Multi-group, Bi-clustering
Comments: Manuscript+Supplement
Subjects: Methodology (stat.ME); Genomics (q-bio.GN); Applications (stat.AP)
Cite as: arXiv:1911.04927 [stat.ME]
  (or arXiv:1911.04927v1 [stat.ME] for this version)

Submission history

From: Rebecka Jörnsten
[v1] Tue, 12 Nov 2019 15:18:35 GMT
