Permutation methods for factor analysis and PCA

Dobriban, Edgar

Mathematics > Statistics Theory

Title: Permutation methods for factor analysis and PCA

Authors: Edgar Dobriban

(Submitted on 2 Oct 2017 (v1), last revised 13 Sep 2019 (this version, v3))

Abstract: Researchers often have datasets measuring features $x_{ij}$ of samples, such as test scores of students. In factor analysis and PCA, these features are thought to be influenced by unobserved factors, such as skills. Can we determine how many components affect the data? This is an important problem, because it has a large impact on all downstream data analysis. Consequently, many approaches have been developed to address it. Parallel Analysis is a popular permutation method. It works by randomly scrambling each feature of the data. It selects components if their singular values are larger than those of the permuted data. Despite widespread use in leading textbooks and scientific publications, as well as empirical evidence for its accuracy, it currently has no theoretical justification.
In this paper, we show that the parallel analysis permutation method consistently selects the large components in certain high-dimensional factor models. However, it does not select the smaller components. The intuition is that permutations keep the noise invariant, while "destroying" the low-rank signal. This provides justification for permutation methods in PCA and factor models under some conditions. Our work uncovers drawbacks of permutation methods, and paves the way to improvements.
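
The permutation procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's exact algorithm: the function name `parallel_analysis`, the parameters `n_perm` and `quantile`, the use of an upper quantile of the permuted singular values as the threshold, and the sequential stopping rule are all assumptions made for the example. The data are used as given, without centering or scaling.

```python
import numpy as np


def parallel_analysis(X, n_perm=20, quantile=0.95, seed=0):
    """Hypothetical sketch of permutation-based parallel analysis.

    X is an (n_samples, n_features) array. Returns the number of leading
    components whose singular values exceed the chosen quantile of the
    corresponding singular values of column-permuted copies of X.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]

    # Singular values of the observed data, in decreasing order.
    observed = np.linalg.svd(X, compute_uv=False)

    permuted = np.empty((n_perm, observed.size))
    for b in range(n_perm):
        # Scramble each feature (column) independently of the others.
        Xp = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(n_features)]
        )
        permuted[b] = np.linalg.svd(Xp, compute_uv=False)

    # Assumed threshold: an upper quantile of each permuted singular value.
    thresholds = np.quantile(permuted, quantile, axis=0)

    # Assumed rule: keep leading components until one fails the comparison.
    k = 0
    while k < observed.size and observed[k] > thresholds[k]:
        k += 1
    return k


if __name__ == "__main__":
    # Synthetic example: two strong factors plus unit-variance noise.
    rng = np.random.default_rng(1)
    factors = rng.standard_normal((200, 2))
    loadings = 2.0 * rng.standard_normal((2, 50))
    X = factors @ loadings + rng.standard_normal((200, 50))
    print(parallel_analysis(X))
```

Permuting each column separately leaves every feature's marginal distribution unchanged while breaking the dependence across features. This matches the stated intuition that permutations keep the noise invariant while "destroying" the low-rank signal.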

Comments:	To appear in the Annals of Statistics
Subjects:	Statistics Theory (math.ST); Methodology (stat.ME)
Cite as:	arXiv:1710.00479 [math.ST]
	(or arXiv:1710.00479v3 [math.ST] for this version)

Submission history

From: Edgar Dobriban [view email]
[v1] Mon, 2 Oct 2017 04:29:22 GMT (369kb,D)
[v2] Sat, 6 Oct 2018 15:44:39 GMT (723kb,D)
[v3] Fri, 13 Sep 2019 13:25:11 GMT (723kb,D)
