Title: Factor selection by permutation

Abstract: Researchers often have data measuring features $x_{ij}$ of samples, such as test scores of students. In factor analysis and PCA, these features are thought to be influenced by unobserved factors, such as skills. Can we determine how many factors affect the data? Many approaches have been developed for this factor selection problem. The popular Parallel Analysis method randomly permutes each feature of the data. It selects factors if their singular values are larger than those of the permuted data. It is used by leading applied statisticians, including T Hastie, M Stephens, J Storey, R Tibshirani and WH Wong. Despite empirical evidence for its accuracy, there is currently no theoretical justification for the method. Without one, we cannot know in advance when it will work.
In this paper, we show that parallel analysis consistently selects the significant factors in certain high-dimensional factor models. The intuition is that permutations keep the noise invariant, while "destroying" the low-rank signal. This provides justification for permutation methods in PCA and factor models under some conditions. A key requirement is that the factors must load on several variables. Our work points to improvements of permutation methods.
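The permutation procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `parallel_analysis`, the column centering, the 95% quantile threshold, the number of permutations, and the rule of stopping at the first singular value that fails the comparison are all assumptions made for illustration. The abstract only states that each feature is permuted and that factors are selected when their singular values exceed those of the permuted data.

```python
import numpy as np


def parallel_analysis(X, n_perm=20, quantile=0.95, seed=0):
    """Estimate the number of factors by comparing singular values of X
    with those of column-wise permuted copies of X.

    Hypothetical sketch: centering, the quantile threshold, and the
    sequential stopping rule are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center each feature (assumption)

    # Singular values of the observed data, in decreasing order.
    sv = np.linalg.svd(X, compute_uv=False)

    # Singular values of permuted data: each column is shuffled
    # independently, which keeps the marginal distribution of every
    # feature but breaks the dependence between features.
    perm_sv = np.empty((n_perm, sv.size))
    for b in range(n_perm):
        Xp = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(X.shape[1])]
        )
        perm_sv[b] = np.linalg.svd(Xp, compute_uv=False)

    # Per-index threshold taken over the permutations.
    thresholds = np.quantile(perm_sv, quantile, axis=0)

    # Select factors sequentially; stop at the first singular value
    # that does not exceed its permutation threshold.
    k = 0
    while k < sv.size and sv[k] > thresholds[k]:
        k += 1
    return k


def simulate_factor_data(n=200, p=50, n_factors=2, seed=1):
    """Simulated data X = F L^T + E with dense loadings (illustrative only)."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((n, n_factors))   # unobserved factors
    L = rng.standard_normal((p, n_factors))   # loadings on all p features
    E = rng.standard_normal((n, p))           # noise
    return F @ L.T + E


X = simulate_factor_data()
print(parallel_analysis(X))
```

In this simulation every factor loads on all features, in line with the abstract's requirement that factors load on several variables. A factor that affects only one feature would survive the permutation largely unchanged, so this kind of comparison would not be expected to detect it.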
Comments: Feedback is welcome
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)
Cite as: arXiv:1710.00479 [math.ST]
  (or arXiv:1710.00479v1 [math.ST] for this version)

Submission history

From: Edgar Dobriban
[v1] Mon, 2 Oct 2017 04:29:22 GMT (369kb,D)
[v2] Sat, 6 Oct 2018 15:44:39 GMT (723kb,D)
[v3] Fri, 13 Sep 2019 13:25:11 GMT (723kb,D)