Title: Rho-estimators revisited: General theory and applications

Authors: Yannick Baraud (1), Lucien Birgé (2) ((1) JAD, (2) LPMA)
Abstract: Following Baraud, Birgé and Sart (2017), we pursue our attempt to design a robust universal estimator of the joint distribution of $n$ independent (but not necessarily i.i.d.) observations for a Hellinger-type loss. Given such observations with an unknown joint distribution $\mathbf{P}$ and a dominated model $\mathscr{Q}$ for $\mathbf{P}$, we build an estimator $\widehat{\mathbf{P}}$ based on $\mathscr{Q}$ and measure its risk by a Hellinger-type distance. When $\mathbf{P}$ does belong to the model, this risk is bounded by a quantity that depends on the local complexity of the model in a vicinity of $\mathbf{P}$. In most situations this bound corresponds to the minimax risk over the model (up to a possible logarithmic factor). When $\mathbf{P}$ does not belong to the model, the risk involves an additional bias term proportional to the distance between $\mathbf{P}$ and $\mathscr{Q}$, whatever the true distribution $\mathbf{P}$. From this point of view, this new version of $\rho$-estimators improves upon the previous one described in Baraud, Birgé and Sart (2017), which required that $\mathbf{P}$ be absolutely continuous with respect to some known reference measure. Further improvements have been made over the former construction. In particular, the new construction provides a very general treatment of the regression framework with random design as well as a computationally tractable procedure for aggregating estimators. We also give some conditions for the Maximum Likelihood Estimator to be a $\rho$-estimator. Finally, we consider the situation where the statistician has many different models at their disposal, and we build a penalized version of the $\rho$-estimator for model selection and adaptation purposes. In the regression setting, this penalized estimator allows one to estimate not only the regression function but also the distribution of the errors.
Comments: 73 pages
Subjects: Statistics Theory (math.ST)
MSC classes: 62G05 (Primary), 62G35, 62G07, 62G08, 62C20, 62F99 (Secondary)
Cite as: arXiv:1605.05051 [math.ST]
  (or arXiv:1605.05051v5 [math.ST] for this version)

Submission history

From: Lucien Birgé
[v1] Tue, 17 May 2016 08:16:48 GMT (49kb)
[v2] Fri, 24 Jun 2016 15:36:17 GMT (51kb)
[v3] Mon, 27 Jun 2016 09:34:15 GMT (51kb)
[v4] Sun, 2 Jul 2017 10:46:08 GMT (56kb)
[v5] Wed, 29 Nov 2017 17:56:14 GMT (78kb)