Title: Automatic sparse PCA for high-dimensional data

Abstract: Sparse principal component analysis (SPCA) methods have proven efficient for analyzing high-dimensional data. Among them, threshold-based SPCA (TSPCA) is computationally cheaper than regularized SPCA based on L1 penalties. Here, we investigate the efficacy of TSPCA in high-dimensional settings and show that, for a suitable threshold value, TSPCA achieves satisfactory performance. The performance of TSPCA therefore depends heavily on the selected threshold value. To address this, we propose a novel thresholding estimator that obtains the principal component (PC) directions using a customized noise-reduction methodology. The proposed technique is consistent under mild conditions and unaffected by threshold values, and therefore yields more accurate results at a lower computational cost. Furthermore, we explore the shrinkage PC directions and their application to clustering high-dimensional data. Finally, we evaluate the performance of the estimated shrinkage PC directions in actual data analyses.
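
For readers unfamiliar with threshold-based SPCA, the following is a minimal sketch of the generic idea only: compute the leading PC direction, zero out loadings whose magnitude falls below a cutoff, and renormalize. It is not the estimator proposed in the paper, and it does not include the noise-reduction step. The function name `thresholded_pc`, the cutoff value 0.15, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np


def thresholded_pc(X, threshold):
    """Leading PC direction with small loadings hard-thresholded to zero.

    X is an (n_samples, n_features) array; `threshold` is a hypothetical,
    user-chosen cutoff on the absolute loadings.
    """
    Xc = X - X.mean(axis=0)  # center each feature
    # The first right singular vector of the centered data is the first PC direction.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    v = vt[0]
    # Hard thresholding: keep only loadings at or above the cutoff.
    v_sparse = np.where(np.abs(v) >= threshold, v, 0.0)
    norm = np.linalg.norm(v_sparse)
    # If every loading was removed, return the zero vector unchanged.
    return v_sparse if norm == 0.0 else v_sparse / norm


# Simulated high-dimensional data (assumed setup): n = 20 samples, d = 500
# features, one strong component supported on the first 10 features.
rng = np.random.default_rng(0)
n, d, k = 20, 500, 10
true_dir = np.zeros(d)
true_dir[:k] = 1.0 / np.sqrt(k)
scores = rng.normal(scale=10.0, size=(n, 1))
X = scores * true_dir + rng.normal(size=(n, d))

v_hat = thresholded_pc(X, threshold=0.15)
print("nonzero loadings:", np.count_nonzero(v_hat))
```

Rerunning the last two lines with different `threshold` values changes which loadings survive, which is the sensitivity to the threshold that the abstract describes.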
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
Cite as: arXiv:2209.14891 [stat.ME]
  (or arXiv:2209.14891v1 [stat.ME] for this version)

Submission history

From: Kazuyoshi Yata
[v1] Thu, 29 Sep 2022 15:58:28 GMT (354kb)