Title: HePPCAT: Probabilistic PCA for Data with Heteroscedastic Noise

Abstract: Principal component analysis (PCA) is a classical and ubiquitous method for reducing data dimensionality, but it is suboptimal for heterogeneous data, which are increasingly common in modern applications. PCA treats all samples uniformly, so its performance degrades when the noise is heteroscedastic across samples, as occurs, e.g., when samples come from sources of heterogeneous quality. This paper develops a probabilistic PCA variant that estimates and accounts for this heterogeneity by incorporating it in the statistical model. Unlike in the homoscedastic setting, the resulting nonconvex optimization problem does not appear to be solvable by singular value decomposition. The proposed heteroscedastic probabilistic PCA technique (HePPCAT) uses efficient alternating maximization algorithms to jointly estimate the underlying factors and the unknown noise variances. Simulation experiments illustrate the comparative speed of the algorithms, the benefit of accounting for heteroscedasticity, and the seemingly favorable optimization landscape of this problem. Real data experiments on environmental air quality data show that HePPCAT can give a better PCA estimate than techniques that do not account for heteroscedasticity.
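
For orientation, one natural way to write down a model of the kind the abstract describes is sketched here. The notation is an assumption made for illustration and is not taken from this listing: $y_i \in \mathbb{R}^d$ are the samples, $F \in \mathbb{R}^{d \times k}$ is the factor matrix, $z_i$ are latent coefficients, and $v_i$ is the noise variance of sample $i$. The paper's own parameterization may differ, for example by sharing one variance across a group of samples.

```latex
% Assumed illustrative notation; not quoted from the paper.
\begin{align*}
  y_i &= F z_i + \varepsilon_i,
  \qquad z_i \sim \mathcal{N}(0, I_k),
  \qquad \varepsilon_i \sim \mathcal{N}(0, v_i I_d),
  \qquad i = 1, \dots, n, \\
  % Marginalizing over z_i gives y_i ~ N(0, F F^T + v_i I_d), hence:
  \mathcal{L}(F, v) &= -\frac{1}{2} \sum_{i=1}^{n}
    \Bigl[ \ln\det\bigl(F F^\top + v_i I_d\bigr)
         + y_i^\top \bigl(F F^\top + v_i I_d\bigr)^{-1} y_i \Bigr]
    + \text{const}.
\end{align*}
```

When every $v_i$ equals a common value, this reduces to standard probabilistic PCA, whose maximum-likelihood factors are obtained from a singular value decomposition of the data. With unequal $v_i$ the sample covariances no longer share a common noise level, which is the setting where an alternating scheme (improve $F$ with $v$ fixed, then $v$ with $F$ fixed) becomes a natural choice.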
Comments: This article has been accepted for publication in the IEEE Transactions on Signal Processing. (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See this https URL for more information. 26 pages, 14 figures
Subjects: Statistics Theory (math.ST); Signal Processing (eess.SP)
Journal reference: IEEE Transactions on Signal Processing, Vol. 69, pp. 4819-4834, 2021
DOI: 10.1109/TSP.2021.3104979
Cite as: arXiv:2101.03468 [math.ST]
  (or arXiv:2101.03468v3 [math.ST] for this version)

Submission history

From: David Hong
[v1] Sun, 10 Jan 2021 03:52:56 GMT (7228kb,D)
[v2] Thu, 3 Jun 2021 03:12:26 GMT (7756kb,D)
[v3] Wed, 1 Dec 2021 07:37:44 GMT (7756kb,D)