Mathematics > Statistics Theory
Title: New asymptotic results in principal component analysis
(Submitted on 7 Jan 2016)
Abstract: Let $X$ be a mean zero Gaussian random vector in a separable Hilbert space ${\mathbb H}$ with covariance operator $\Sigma:={\mathbb E}(X\otimes X).$ Let $\Sigma=\sum_{r\geq 1}\mu_r P_r$ be the spectral decomposition of $\Sigma$ with distinct eigenvalues $\mu_1>\mu_2> \dots$ and the corresponding spectral projectors $P_1, P_2, \dots.$ Given a sample $X_1,\dots, X_n$ of $n$ i.i.d. copies of $X,$ the sample covariance operator is defined as $\hat \Sigma_n := n^{-1}\sum_{j=1}^n X_j\otimes X_j.$ The main goal of principal component analysis is to estimate the spectral projectors $P_1, P_2, \dots$ by their empirical counterparts $\hat P_1, \hat P_2, \dots,$ properly defined in terms of the spectral decomposition of the sample covariance operator $\hat \Sigma_n.$ The aim of this paper is to study the asymptotic distributions of important statistics related to this problem, in particular, of the statistic $\|\hat P_r-P_r\|_2^2,$ where $\|\cdot\|_2^2$ is the squared Hilbert--Schmidt norm. This is done in a "high-complexity" asymptotic framework in which the so-called effective rank ${\bf r}(\Sigma):=\frac{{\rm tr}(\Sigma)}{\|\Sigma\|_{\infty}}$ (${\rm tr}(\cdot)$ being the trace and $\|\cdot\|_{\infty}$ being the operator norm) of the true covariance $\Sigma$ becomes large simultaneously with the sample size $n,$ but ${\bf r}(\Sigma)=o(n)$ as $n\to\infty.$ In this setting, we prove that, in the case of a one-dimensional spectral projector $P_r,$ the properly centered and normalized statistic $\|\hat P_r-P_r\|_2^2$ with {\it data-dependent} centering and normalization converges in distribution to a Cauchy-type limit. The proofs of this and other related results rely on perturbation analysis and Gaussian concentration.
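The quantities defined in the abstract can be illustrated in a small finite-dimensional setting. The sketch below is not taken from the paper: it assumes ${\mathbb H}={\mathbb R}^5$, a diagonal covariance with hypothetical eigenvalues, and $r=1$, and it uses plain power iteration as a stand-in eigen-solver. All function and variable names are illustrative. It computes the effective rank ${\bf r}(\Sigma)$, the sample covariance $\hat\Sigma_n$, and the squared Hilbert--Schmidt error $\|\hat P_1-P_1\|_2^2$, using the identity $\|\hat P-P\|_2^2 = 2(1-\langle \hat u,u\rangle^2)$ for one-dimensional projectors onto unit vectors $\hat u$ and $u$. It does not reproduce the data-dependent centering and normalization or the limit theorem, and with a fixed small effective rank it does not represent the high-complexity regime.

```python
import random

def effective_rank(eigenvalues):
    """r(Sigma) = tr(Sigma) / ||Sigma||_inf, given the eigenvalues of Sigma."""
    return sum(eigenvalues) / max(eigenvalues)

def sample_covariance(samples):
    """hat Sigma_n = n^{-1} sum_j X_j (x) X_j (no centering: X has mean zero)."""
    n, d = len(samples), len(samples[0])
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        for a in range(d):
            for b in range(d):
                cov[a][b] += x[a] * x[b] / n
    return cov

def top_eigenvector(matrix, iterations=500):
    """Unit top eigenvector via power iteration (assumes a simple top eigenvalue)."""
    d = len(matrix)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

def projector_error(u_hat, u):
    """||P_hat - P||_2^2 = 2 (1 - <u_hat, u>^2) for rank-one projectors."""
    inner = sum(a * b for a, b in zip(u_hat, u))
    return 2.0 * (1.0 - inner ** 2)

# Hypothetical setup: diagonal Sigma with distinct eigenvalues mu_1 > mu_2 > ...
mu = [4.0, 2.0, 1.5, 1.0, 0.5]
n = 2000
rng = random.Random(0)

# Mean-zero Gaussian samples with covariance diag(mu).
samples = [[m ** 0.5 * rng.gauss(0.0, 1.0) for m in mu] for _ in range(n)]

u1 = [1.0, 0.0, 0.0, 0.0, 0.0]  # top eigenvector of diag(mu), so P_1 = u1 (x) u1
u1_hat = top_eigenvector(sample_covariance(samples))
err = projector_error(u1_hat, u1)

print("effective rank:", effective_rank(mu))
print("||P_hat_1 - P_1||_2^2:", err)
```

Repeating the simulation over many seeds gives an empirical picture of how $\|\hat P_1-P_1\|_2^2$ fluctuates at a fixed $n$; this is the raw statistic only, before any centering or normalization of the kind the abstract describes.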