Statistics Theory

New submissions

New submissions for Tue, 24 May 22

[1]  arXiv:2205.10524 [pdf, ps, other]
Title: Robust density estimation with the $\mathbb{L}_{1}$-loss. Applications to the estimation of a density on the line satisfying a shape constraint
Subjects: Statistics Theory (math.ST)

We solve the problem of estimating the distribution of presumed i.i.d. observations for the total variation loss. Our approach is based on density models and is versatile enough to cope with many different ones, including some density models for which the Maximum Likelihood Estimator (MLE for short) does not exist. We mainly illustrate the properties of our estimator on models of densities on the line that satisfy a shape constraint. We show that it possesses some similar optimality properties, with regard to some global rates of convergence, as the MLE does when it exists. It also enjoys some adaptation properties with respect to some specific target densities in the model for which our estimator is proven to converge at parametric rate. More important is the fact that our estimator is robust, not only with respect to model misspecification, but also to contamination, the presence of outliers in the dataset, and departures from the equidistribution assumption. This means that the estimator performs almost as well as if the data were i.i.d. with density $p$ in a situation where these data are only independent and most of their marginals are close enough in total variation to a distribution with density $p$. Our main result on the risk of the estimator takes the form of an exponential deviation inequality which is non-asymptotic and involves explicit numerical constants. We deduce from it several global rates of convergence, including some bounds for the minimax $\mathbb{L}_{1}$-risks over the sets of concave and log-concave densities. These bounds derive from some specific results on the approximation of densities which are monotone, convex, concave and log-concave. Such results may be of independent interest.

[2]  arXiv:2205.10799 [pdf, ps, other]
Title: On point estimators for Gamma and Beta distributions
Authors: Nickos Papadatos
Comments: Dedicated to Professor Stavros Kourouklis (18 pages, including one Table)
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

Let $X_1,\ldots,X_n$ be a random sample from the Gamma distribution with density $f(x)=\lambda^{\alpha}x^{\alpha-1}e^{-\lambda x}/\Gamma(\alpha)$, $x>0$, where both $\alpha>0$ (the shape parameter) and $\lambda>0$ (the reciprocal scale parameter) are unknown. The main result shows that the uniformly minimum variance unbiased estimator (UMVUE) of the shape parameter, $\alpha$, exists if and only if $n\geq 4$; moreover, it has finite variance if and only if $n\geq 6$. More precisely, the form of the UMVUE is given for all parametric functions $\alpha$, $\lambda$, $1/\alpha$ and $1/\lambda$. Furthermore, a highly efficient estimating procedure for the two-parameter Beta distribution is also given. This is based on a Stein-type covariance identity for the Beta distribution, followed by an application of the theory of $U$-statistics and the delta-method.
MSC: Primary 62F10; 62F12; Secondary 62E15.
Key words and phrases: unbiased estimation; Gamma distribution; Beta distribution; Ye-Chen-type closed-form estimators; asymptotic efficiency; $U$-statistics; Stein-type covariance identity; delta-method.
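
The keywords above mention Ye-Chen-type closed-form estimators. As a rough illustration only, the sketch below computes the closed-form Gamma estimators usually attributed to Ye and Chen, which rest on the identity $\mathrm{Cov}(X,\log X)=1/\lambda$ for the Gamma density given in the abstract. It is not the UMVUE derived in the paper; the function name `ye_chen_gamma` and the simulated sample are assumptions made for this example.

```python
import math
import random


def ye_chen_gamma(xs):
    """Closed-form (non-iterative) estimates of (alpha, lambda) for a Gamma sample.

    Uses D = n * sum(x log x) - sum(log x) * sum(x), which equals n^2 times the
    (biased) sample covariance of x and log x. The estimators are consistent
    but not unbiased.
    """
    n = len(xs)
    s_x = sum(xs)
    s_log = sum(math.log(x) for x in xs)
    s_xlog = sum(x * math.log(x) for x in xs)
    d = n * s_xlog - s_log * s_x
    if d <= 0:
        raise ValueError("sample is degenerate (all observations equal)")
    alpha_hat = n * s_x / d    # shape estimate
    scale_hat = d / n ** 2     # estimate of 1 / lambda
    return alpha_hat, 1.0 / scale_hat


if __name__ == "__main__":
    random.seed(0)
    # random.gammavariate takes (shape, scale), so scale = 1 / lambda.
    sample = [random.gammavariate(3.0, 1.0 / 2.0) for _ in range(20000)]
    print(ye_chen_gamma(sample))
```

Because no likelihood equation has to be solved numerically, estimators of this type are cheap to compute, which is presumably why they are a natural comparison point for exact unbiased estimators.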

[3]  arXiv:2205.10886 [pdf, other]
Title: Adaptive estimation for the nonparametric bivariate additive model in random design with long-memory dependent errors
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

We investigate nonparametric bivariate additive regression estimation in random design with long-memory errors and construct adaptive thresholding estimators based on wavelet series. The proposed approach achieves asymptotically near-optimal convergence rates when the unknown function and its univariate additive components belong to a Besov space. We consider the problem under two noise structures: (1) homoskedastic Gaussian long-memory errors and (2) heteroskedastic Gaussian long-memory errors. In the homoskedastic long-memory error case, the estimator is completely adaptive with respect to the long-memory parameter. In the heteroskedastic long-memory case, the estimator may not be adaptive with respect to the long-memory parameter unless the heteroskedasticity is of polynomial form. In either case, the convergence rates depend on the long-memory parameter only when the long memory is strong enough; otherwise, the rates are identical to those under i.i.d. errors. The proposed approach is extended to the general $r$-dimensional additive case, with $r>2$, and the corresponding convergence rates are free from the curse of dimensionality.

[4]  arXiv:2205.11092 [pdf, ps, other]
Title: Estimation of the Hurst parameter from continuous noisy data
Subjects: Statistics Theory (math.ST)

This paper addresses the problem of estimating the Hurst exponent of fractional Brownian motion from a continuous-time noisy sample. Consistent estimation in the setup under consideration is possible only if either the length of the observation interval increases to infinity or the intensity of the noise decreases to zero. The main result is a proof of the Local Asymptotic Normality (LAN) of the model in these two regimes, which reveals the optimal minimax rates.

[5]  arXiv:2205.11302 [pdf, other]
Title: Exchangeable FGM copulas
Subjects: Statistics Theory (math.ST)

Copulas are a powerful tool to model dependence between the components of a random vector. One well-known class of copulas when working in two dimensions is the Farlie-Gumbel-Morgenstern (FGM) family, whose simple analytic shape enables closed-form solutions to many problems in applied probability. However, the classical definition of high-dimensional FGM copulas does not enable a straightforward understanding of the effect of the copula parameters on the dependence, nor a geometric understanding of their admissible range. We circumvent this issue by studying the FGM copula from a probabilistic approach based on multivariate Bernoulli distributions. This paper studies high-dimensional exchangeable FGM copulas, a subclass of FGM copulas. We show that dependence parameters of exchangeable FGM copulas can be expressed as convex hulls of a finite number of extreme points and establish partial orders for different exchangeable FGM copulas (including maximal and minimal dependence). We also leverage the probabilistic interpretation to develop efficient sampling and estimating procedures and provide a simulation study. Throughout, we discover geometric interpretations of the copula parameters that assist one in decoding the dependence of high-dimensional exchangeable FGM copulas.
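
For orientation, the classical two-dimensional FGM copula referred to above is $C(u,v)=uv\,[1+\theta(1-u)(1-v)]$ with $\theta\in[-1,1]$. The sketch below evaluates it and draws pairs by inverting the conditional distribution of $V$ given $U=u$. This is the textbook bivariate construction, not the multivariate-Bernoulli representation or the sampling procedure developed in the paper; the function names `fgm_cdf` and `fgm_sample` are assumptions made for this example.

```python
import math
import random


def fgm_cdf(u, v, theta):
    """Bivariate FGM copula C(u, v) = u v [1 + theta (1 - u)(1 - v)]."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def fgm_sample(theta, rng=random):
    """Draw one pair (u, v) from the bivariate FGM copula by conditional inversion."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    u = rng.random()
    w = rng.random()
    if w == 0.0:
        return u, 0.0
    a = theta * (1.0 - 2.0 * u)
    # Conditional CDF of V given U = u is w = v * (1 + a * (1 - v)).
    # Solve the quadratic for the root in [0, 1], written in a form that
    # stays well defined when a is close to 0.
    root = math.sqrt((1.0 + a) ** 2 - 4.0 * a * w)
    v = 2.0 * w / (1.0 + a + root)
    return u, min(v, 1.0)


if __name__ == "__main__":
    rng = random.Random(1)
    print([fgm_sample(0.8, rng) for _ in range(3)])
```

In this bivariate family the correlation between $U$ and $V$ equals $\theta/3$, so dependence is limited to the range $[-1/3, 1/3]$. That narrow range is one reason the admissible parameter set becomes the central question in higher dimensions.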

Cross-lists for Tue, 24 May 22

[6]  arXiv:2205.10697 (cross-list from stat.ML) [pdf, ps, other]
Title: The Selectively Adaptive Lasso
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)

Machine learning regression methods allow estimation of functions without unrealistic parametric assumptions. Although they can perform exceptionally in prediction error, most lack theoretical convergence rates necessary for semi-parametric efficient estimation (e.g. TMLE, AIPW) of parameters like average treatment effects. The Highly Adaptive Lasso (HAL) is the only regression method proven to converge quickly enough for a meaningfully large class of functions, independent of the dimensionality of the predictors. Unfortunately, HAL is not computationally scalable. In this paper we build upon the theory of HAL to construct the Selectively Adaptive Lasso (SAL), a new algorithm which retains HAL's dimension-free, nonparametric convergence rate but which also scales computationally to massive datasets. To accomplish this, we prove some general theoretical results pertaining to empirical loss minimization in nested Donsker classes. Our resulting algorithm is a form of gradient tree boosting with an adaptive learning rate, which makes it fast and trivial to implement with off-the-shelf software. Finally, we show that our algorithm retains the performance of standard gradient boosting on a diverse group of real-world datasets. SAL makes semi-parametric efficient estimators practically possible and theoretically justifiable in many big data settings.

[7]  arXiv:2205.10798 (cross-list from cs.LG) [pdf, other]
Title: PAC-Wrap: Semi-Supervised PAC Anomaly Detection
Comments: Accepted by SIGKDD 2022
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

Anomaly detection is essential for preventing hazardous outcomes for safety-critical applications like autonomous driving. Given their safety-criticality, these applications benefit from provable bounds on various errors in anomaly detection. To achieve this goal in the semi-supervised setting, we propose to provide Probably Approximately Correct (PAC) guarantees on the false negative and false positive detection rates for anomaly detection algorithms. Our method (PAC-Wrap) can wrap around virtually any existing semi-supervised and unsupervised anomaly detection method, endowing it with rigorous guarantees. Our experiments with various anomaly detectors and datasets indicate that PAC-Wrap is broadly effective.

[8]  arXiv:2205.10810 (cross-list from math.PR) [pdf, ps, other]
Title: On the inversion of the Laplace transform (In Memory of Dimitris Gatzouras)
Authors: Nickos Papadatos
Comments: 14 pages
Subjects: Probability (math.PR); Statistics Theory (math.ST); Methodology (stat.ME)

The Laplace transform is a useful and powerful analytic tool with applications to several areas of applied mathematics, including differential equations, probability and statistics. Similarly to the inversion of the Fourier transform, inversion formulae for the Laplace transform are of central importance; such formulae are old and well-known (Fourier-Mellin or Bromwich integral, Post-Widder inversion). The present work is motivated by an elementary statistical problem, namely, the unbiased estimation of a parametric function of the scale in the basic model of a random sample from the exponential distribution. The form of the uniformly minimum variance unbiased estimator of a parametric function $h(\lambda)$, as well as its variance, are obtained as series in Laguerre polynomials and the corresponding Fourier coefficients, and a particular application of this result yields a novel inversion formula for the Laplace transform.
MSC: Primary 44A10, 62F10.
Key words and phrases: Exponential Distribution; Unbiased Estimation; Fourier-Laguerre Series; Inverse Laplace Transform; Laguerre Polynomials.

[9]  arXiv:2205.10895 (cross-list from cs.LG) [pdf, ps, other]
Title: Contextual Information-Directed Sampling
Comments: Accepted at ICML 2022
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

Information-directed sampling (IDS) has recently demonstrated its potential as a data-efficient reinforcement learning algorithm. However, it is still unclear what is the right form of information ratio to optimize when contextual information is available. We investigate the IDS design through two contextual bandit problems: contextual bandits with graph feedback and sparse linear contextual bandits. We provably demonstrate the advantage of contextual IDS over conditional IDS and emphasize the importance of considering the context distribution. The main message is that an intelligent agent should invest more in the actions that are beneficial for future unseen contexts, while the conditional IDS can be myopic. We further propose a computationally-efficient version of contextual IDS based on Actor-Critic and evaluate it empirically on a neural network contextual bandit.

[10]  arXiv:2205.11078 (cross-list from stat.ML) [pdf, other]
Title: Beyond EM Algorithm on Over-specified Two-Component Location-Scale Gaussian Mixtures
Comments: 38 pages, 4 figures. Tongzheng Ren and Fuheng Cui contributed equally to this work
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)

The Expectation-Maximization (EM) algorithm has been predominantly used to approximate the maximum likelihood estimation of the location-scale Gaussian mixtures. However, when the models are over-specified, namely, the chosen number of components to fit the data is larger than the unknown true number of components, EM needs a polynomial number of iterations in terms of the sample size to reach the final statistical radius; this is computationally expensive in practice. The slow convergence of EM is due to the lack of local strong convexity, with respect to the location parameter, of the negative population log-likelihood function, i.e., the limit of the negative sample log-likelihood function when the sample size goes to infinity. To efficiently explore the curvature of the negative log-likelihood functions, by specifically considering two-component location-scale Gaussian mixtures, we develop the Exponential Location Update (ELU) algorithm. The idea of the ELU algorithm is that we first obtain the exact optimal solution for the scale parameter and then perform an exponential step-size gradient descent for the location parameter. We demonstrate theoretically and empirically that the ELU iterates converge to the final statistical radius of the models after a logarithmic number of iterations. To the best of our knowledge, this resolves the long-standing open question in the literature about developing an optimization algorithm that has optimal statistical and computational complexities for solving parameter estimation even under some specific settings of the over-specified Gaussian mixture models.

[11]  arXiv:2205.11121 (cross-list from cs.CR) [pdf, ps, other]
Title: A normal approximation for joint frequency estimation under Local Differential Privacy
Authors: Thomas Carette
Comments: Preliminary development, draft
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB); Statistics Theory (math.ST)

In recent years, Local Differential Privacy (LDP) has been one of the cornerstones of privacy-preserving data analysis. However, many challenges still oppose its widespread application. One of these problems is the scalability of LDP to high-dimensional data, in particular for estimating joint distributions. In this paper, we develop an approximate estimator for category frequency joint distributions under so-called pure LDP protocols.
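
To make the setting concrete, the sketch below shows frequency estimation for a single categorical attribute under $k$-ary (generalized) randomized response, a protocol commonly cited as an example of a pure LDP protocol. It is only a one-attribute baseline under that assumption, not the joint-distribution estimator or the normal approximation proposed in the paper; the function names `grr_perturb` and `grr_estimate` and the integer encoding of categories as `0..k-1` are assumptions made for this example.

```python
import math
import random
from collections import Counter


def grr_perturb(value, k, epsilon, rng=random):
    """Report the true category with probability p, else another category uniformly."""
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)          # pick among the k - 1 other categories
    return other if other < value else other + 1


def grr_estimate(reports, k, epsilon):
    """Unbiased frequency estimates from perturbed reports (requires epsilon > 0)."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)                   # probability of reporting the truth
    q = 1.0 / (e + k - 1)                 # probability of reporting a given other value
    counts = Counter(reports)
    # E[count_v / n] = f_v * p + (1 - f_v) * q, so invert that affine map.
    return [(counts.get(v, 0) / n - q) / (p - q) for v in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [0] * 15000 + [1] * 9000 + [2] * 6000
    reports = [grr_perturb(x, 3, 1.0, rng) for x in truth]
    print(grr_estimate(reports, 3, 1.0))
```

The estimates always sum to one but can be negative for rare categories. Applying the same idea to a joint distribution means treating each combination of attribute values as one category, so $k$ grows multiplicatively with the number of attributes; this is the scalability problem the abstract refers to.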

Replacements for Tue, 24 May 22

[12]  arXiv:2003.13208 (replaced) [pdf, other]
Title: Minimax optimality of permutation tests
Comments: Appendix D is added (Monte Carlo-based permutation tests) / Several typos are fixed
Subjects: Statistics Theory (math.ST)
[13]  arXiv:2109.13190 (replaced) [pdf, ps, other]
Title: Estimating the characteristics of stochastic damping Hamiltonian systems from continuous observations
Subjects: Statistics Theory (math.ST); Probability (math.PR)
[14]  arXiv:2111.03705 (replaced) [pdf, ps, other]
Title: Strong Recovery In Group Synchronization
Authors: Bradley Stich
Subjects: Statistics Theory (math.ST); Probability (math.PR)
[15]  arXiv:2204.11038 (replaced) [pdf, ps, other]
Title: Dimension free non-asymptotic bounds on the accuracy of high dimensional Laplace approximation
Subjects: Statistics Theory (math.ST); Numerical Analysis (math.NA)
[16]  arXiv:1910.09219 (replaced) [pdf, other]
Title: A Transformation Perspective on Marginal and Conditional Models
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
[17]  arXiv:2004.04254 (replaced) [pdf, other]
Title: Posterior computation with the Gibbs zig-zag sampler
Subjects: Computation (stat.CO); Statistics Theory (math.ST)
[18]  arXiv:2102.03607 (replaced) [pdf, other]
Title: Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Comments: Accepted at ICML 2021
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
[19]  arXiv:2102.12225 (replaced) [pdf, other]
Title: Valid Instrumental Variables Selection Methods using Auxiliary Variable and Constructing Efficient Estimator
Comments: Keywords: Causal inference, Exclusion restriction, Instrumental variable, Mendelian randomization, Negative control outcome, Semiparametric efficiency, Variable selection, Unmeasured covariates
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Other Statistics (stat.OT)
[20]  arXiv:2202.12431 (replaced) [pdf, other]
Title: Thompson Sampling with Unrestricted Delays
Authors: Han Wu, Stefan Wager
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Methodology (stat.ME)