# Mathematics > Statistics Theory

# Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models

(Submitted on 6 Jun 2022 (v1), last revised 12 Oct 2022 (this version, v3))

Abstract: We consider a high-dimensional mean estimation problem over a binary hidden Markov model, which illuminates the interplay between memory in data, sample size, dimension, and signal strength in statistical inference. In this model, an estimator observes $n$ samples of a $d$-dimensional parameter vector $\theta_{*}\in\mathbb{R}^{d}$, each multiplied by a random sign $S_i$ ($1\le i\le n$) and corrupted by isotropic standard Gaussian noise. The sequence of signs $\{S_{i}\}_{i\in[n]}\in\{-1,1\}^{n}$ is drawn from a stationary homogeneous Markov chain with flip probability $\delta\in[0,1/2]$. As $\delta$ varies, this model smoothly interpolates between two well-studied models: the Gaussian Location Model, for which $\delta=0$, and the Gaussian Mixture Model, for which $\delta=1/2$. Assuming that the estimator knows $\delta$, we establish a nearly minimax optimal (up to logarithmic factors) estimation error rate as a function of $\|\theta_{*}\|,\delta,d,n$. We then provide an upper bound for the case of estimating $\delta$, assuming (possibly inaccurate) knowledge of $\theta_{*}$. The bound is proved to be tight when $\theta_{*}$ is an accurately known constant. These results are then combined into an algorithm that estimates $\theta_{*}$ with $\delta$ unknown a priori, and theoretical guarantees on its error are stated.
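To make the observation model concrete, the following is a minimal sketch of a sampler for $X_i = S_i\theta_{*} + Z_i$ with $Z_i\sim\mathcal{N}(0, I_d)$, as described in the abstract. It is an illustration only, not code from the paper: the function names `sample_signs` and `sample_observations`, the `seed` parameter, and the choice of Python's standard `random` module are all assumptions made here.

```python
import random


def sample_signs(n, delta, rng):
    """Draw n signs from the stationary two-state chain (assumes n >= 1).

    S_1 is uniform on {-1, +1}; each subsequent sign is the previous
    one flipped with probability delta.
    """
    signs = [rng.choice((-1, 1))]
    for _ in range(n - 1):
        flip = rng.random() < delta
        signs.append(-signs[-1] if flip else signs[-1])
    return signs


def sample_observations(theta, n, delta, seed=0):
    """Return (signs, xs) with xs[i] = signs[i] * theta + standard Gaussian noise.

    theta is a list of d floats; names and signature are hypothetical.
    """
    rng = random.Random(seed)
    signs = sample_signs(n, delta, rng)
    xs = [[s * t + rng.gauss(0.0, 1.0) for t in theta] for s in signs]
    return signs, xs


if __name__ == "__main__":
    # delta = 0: the sign never flips (Gaussian Location Model, up to a global sign).
    signs, xs = sample_observations([1.0, -2.0, 0.5], n=5, delta=0.0)
    print(signs)
    # delta = 1/2: signs are i.i.d. uniform (symmetric two-component Gaussian mixture).
    signs, xs = sample_observations([1.0, -2.0, 0.5], n=5, delta=0.5)
    print(signs)
```

Setting `delta` between these extremes produces runs of equal signs whose typical length grows as `delta` shrinks, which is the "memory in data" the abstract refers to.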

## Submission history

From: Yihan Zhang

**[v1]** Mon, 6 Jun 2022 09:34:04 GMT (172kb,D)

**[v2]** Tue, 7 Jun 2022 10:56:10 GMT (171kb,D)

**[v3]** Wed, 12 Oct 2022 09:22:16 GMT (175kb,D)
