We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

stat.ML

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Statistics > Machine Learning

Title: Robust Principal Component Analysis: A Median of Means Approach

Abstract: Principal Component Analysis (PCA) is a fundamental tool for data visualization, denoising, and dimensionality reduction. It is widely popular in Statistics, Machine Learning, Computer Vision, and related fields. However, PCA is well known to fall prey to the presence of outliers and often fails to detect the true underlying low-dimensional structure within the dataset. Recent supervised learning methods, following the Median of Means (MoM) philosophy, have shown great success in dealing with outlying observations without much compromise to their large sample theoretical properties. In this paper, we propose a PCA procedure based on the MoM principle. Called the Median of Means Principal Component Analysis (MoMPCA), the proposed method is not only computationally appealing but also achieves optimal convergence rates under minimal assumptions. In particular, we explore the non-asymptotic error bounds of the obtained solution via the aid of Vapnik-Chervonenkis theory and Rademacher complexity, while granting absolutely no assumption on the outlying observations. The efficacy of the proposal is also thoroughly showcased through simulations and real data applications.
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as: arXiv:2102.03403 [stat.ML]
  (or arXiv:2102.03403v1 [stat.ML] for this version)

Submission history

From: Saptarshi Chakraborty [view email]
[v1] Fri, 5 Feb 2021 19:59:05 GMT (91kb,D)

Link back to: arXiv, form interface, contact.