
# Mathematics > Statistics Theory

# Title: Rho-estimators revisited: General theory and applications

(Submitted on 17 May 2016 (v1), last revised 29 Nov 2017 (this version, v5))

Abstract: Following Baraud, Birgé and Sart (2017), we pursue our attempt to design a robust universal estimator of the joint distribution of $n$ independent (but not necessarily i.i.d.) observations for a Hellinger-type loss. Given such observations with an unknown joint distribution $\mathbf{P}$ and a dominated model $\mathscr{Q}$ for $\mathbf{P}$, we build an estimator $\widehat{\mathbf{P}}$ based on $\mathscr{Q}$ and measure its risk by a Hellinger-type distance. When $\mathbf{P}$ does belong to the model, this risk is bounded by a quantity which relies on the local complexity of the model in a vicinity of $\mathbf{P}$. In most situations this bound corresponds to the minimax risk over the model (up to a possible logarithmic factor). When $\mathbf{P}$ does not belong to the model, the risk involves an additional bias term proportional to the distance between $\mathbf{P}$ and $\mathscr{Q}$, whatever the true distribution $\mathbf{P}$. From this point of view, this new version of $\rho$-estimators improves upon the previous one described in Baraud, Birgé and Sart (2017), which required that $\mathbf{P}$ be absolutely continuous with respect to some known reference measure. Further improvements have been made over the former construction. In particular, the new version provides a very general treatment of the regression framework with random design as well as a computationally tractable procedure for aggregating estimators. We also give some conditions for the Maximum Likelihood Estimator to be a $\rho$-estimator. Finally, we consider the situation where the statistician has many different models at their disposal, and we build a penalized version of the $\rho$-estimator for model selection and adaptation purposes. In the regression setting, this penalized estimator allows one to estimate not only the regression function but also the distribution of the errors.

## Submission history

From: Lucien Birgé

**[v1]** Tue, 17 May 2016 08:16:48 GMT (49kb)

**[v2]** Fri, 24 Jun 2016 15:36:17 GMT (51kb)

**[v3]** Mon, 27 Jun 2016 09:34:15 GMT (51kb)

**[v4]** Sun, 2 Jul 2017 10:46:08 GMT (56kb)

**[v5]** Wed, 29 Nov 2017 17:56:14 GMT (78kb)
