Statistics Theory
New submissions
New submissions for Thu, 26 May 22
 [1] arXiv:2205.12489 [pdf, other]

Title: Bayesian Multiscale Analysis of the Cox Model
Comments: 82 pages, 6 figures, 2 tables
Subjects: Statistics Theory (math.ST)
Piecewise constant priors are routinely used in the Bayesian Cox proportional hazards model for survival analysis. Despite its popularity, large sample properties of this Bayesian method are not yet well understood. This work provides a unified theory for posterior distributions in this setting, not requiring the priors to be conjugate. We first derive contraction rate results for wide classes of histogram priors on the unknown hazard function and prove asymptotic normality of linear functionals of the posterior hazard in the form of Bernstein-von Mises theorems. Second, using recently developed multiscale techniques, we derive functional limiting results for the cumulative hazard and survival function. Frequentist coverage properties of Bayesian credible sets are investigated: we prove that certain easily computable credible bands for the survival function are optimal frequentist confidence bands. We conduct simulation studies that confirm these predictions, with an excellent behavior particularly in finite samples, showing that even the simplest possible Bayesian credible bands for the survival function can outperform state-of-the-art frequentist bands in terms of coverage.
 [2] arXiv:2205.12744 [pdf, ps, other]

Title: High dimensional Bernoulli distributions: algebraic representation and applications
Subjects: Statistics Theory (math.ST)
The main contribution of this paper is to find a representation of the class $\mathcal{F}_d(p)$ of multivariate Bernoulli distributions with the same mean $p$ that allows us to find its generators analytically in any dimension. We map $\mathcal{F}_d(p)$ to an ideal of points and we prove that the class $\mathcal{F}_d(p)$ can be generated from a finite set of simple polynomials. We present two applications. Firstly, we show that polynomial generators help to find extremal points of the convex polytope $\mathcal{F}_d(p)$ in high dimensions. Secondly, we solve the problem of determining the lower bounds in the convex order for sums of multivariate Bernoulli distributions with given margins, but with an unspecified dependence structure.
 [3] arXiv:2205.12924 [pdf, ps, other]

Title: Clustering consistency with Dirichlet process mixtures
Subjects: Statistics Theory (math.ST); Methodology (stat.ME); Machine Learning (stat.ML)
Dirichlet process mixtures are flexible nonparametric models, particularly suited to density estimation and probabilistic clustering. In this work we study the posterior distribution induced by Dirichlet process mixtures as the sample size increases, and more specifically focus on consistency for the unknown number of clusters when the observed data are generated from a finite mixture. Crucially, we consider the situation where a prior is placed on the concentration parameter of the underlying Dirichlet process. Previous findings in the literature suggest that Dirichlet process mixtures are typically not consistent for the number of clusters if the concentration parameter is held fixed and data come from a finite mixture. Here we show that consistency for the number of clusters can be achieved if the concentration parameter is adapted in a fully Bayesian way, as commonly done in practice. Our results are derived for data coming from a class of finite mixtures, with mild assumptions on the prior for the concentration parameter and for a variety of choices of likelihood kernels for the mixture.
 [4] arXiv:2205.12937 [pdf, other]

Title: Mitigating multiple descents: A model-agnostic framework for risk monotonization
Comments: 110 pages, 15 figures
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG); Machine Learning (stat.ML)
Recent empirical and theoretical analyses of several commonly used prediction procedures reveal a peculiar risk behavior in high dimensions, referred to as double/multiple descent, in which the asymptotic risk is a non-monotonic function of the limiting aspect ratio of the number of features or parameters to the sample size. To mitigate this undesirable behavior, we develop a general framework for risk monotonization based on cross-validation that takes as input a generic prediction procedure and returns a modified procedure whose out-of-sample prediction risk is, asymptotically, monotonic in the limiting aspect ratio. As part of our framework, we propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting, respectively, and show that, under very mild assumptions, they provably achieve monotonic asymptotic risk behavior. Our results are applicable to a broad variety of prediction procedures and loss functions, and do not require a well-specified (parametric) model. We exemplify our framework with concrete analyses of the minimum $\ell_2$, $\ell_1$-norm least squares prediction procedures. As one of the ingredients in our analysis, we also derive novel additive and multiplicative forms of oracle risk inequalities for split cross-validation that are of independent interest.
Cross-lists for Thu, 26 May 22
 [5] arXiv:2205.12431 (cross-list from stat.ME) [pdf, other]

Title: Detecting Abrupt Changes in Sequential Pairwise Comparison Data
Comments: 31 pages, 2 figures, 2 tables
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
The Bradley-Terry-Luce (BTL) model is a classic and very popular statistical approach for eliciting a global ranking among a collection of items using pairwise comparison data. In applications in which the comparison outcomes are observed as a time series, it is often the case that data are non-stationary, in the sense that the true underlying ranking changes over time. In this paper we are concerned with localizing the change points in a high-dimensional BTL model with piecewise constant parameters. We propose novel and practicable algorithms based on dynamic programming that can consistently estimate the unknown locations of the change points. We provide consistency rates for our methodology that depend explicitly on the model parameters, the temporal spacing between two consecutive change points and the magnitude of the change. We corroborate our findings with extensive numerical experiments and a real-life example.
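For readers unfamiliar with the setting, the sketch below illustrates the standard BTL win probability and what "piecewise constant parameters" means over time. It is not the paper's algorithm: the function names, the log-strength parameterization `theta`, and the example change points are assumptions made for illustration only.

```python
import math


def btl_win_probability(theta_i: float, theta_j: float) -> float:
    """Standard BTL probability that item i beats item j.

    theta_i and theta_j are log-strength parameters (an assumed
    parameterization); the probability depends only on their difference.
    """
    return 1.0 / (1.0 + math.exp(-(theta_i - theta_j)))


def piecewise_constant_theta(t, change_points, thetas):
    """Return the parameter vector in force at time t.

    change_points is a sorted list of times at which the parameters jump;
    thetas[k] is the parameter vector on the k-th segment, so
    len(thetas) == len(change_points) + 1. Both are hypothetical inputs.
    """
    segment = sum(1 for cp in change_points if t >= cp)
    return thetas[segment]


# Hypothetical example: two items, ranking flips after time 10.
change_points = [10]
thetas = [[1.0, 0.0], [0.0, 1.0]]
theta_early = piecewise_constant_theta(3, change_points, thetas)
theta_late = piecewise_constant_theta(12, change_points, thetas)
p_early = btl_win_probability(theta_early[0], theta_early[1])
p_late = btl_win_probability(theta_late[0], theta_late[1])
```

In this toy example item 0 is favored before the change point and item 1 afterwards; change point localization, as described in the abstract, is the problem of recovering such jump times from the observed comparisons.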
 [6] arXiv:2205.12695 (cross-list from stat.ML) [pdf, other]

Title: Surprises in adversarially-trained linear regression
Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Signal Processing (eess.SP); Statistics Theory (math.ST)
State-of-the-art machine learning models can be vulnerable to very small input perturbations that are adversarially constructed. Adversarial training is one of the most effective approaches to defend against such examples. We show that for linear regression problems, adversarial training can be formulated as a convex problem. This fact is then used to show that $\ell_\infty$-adversarial training produces sparse solutions and has many similarities to the lasso method. Similarly, $\ell_2$-adversarial training has similarities with ridge regression. We use a robust regression framework to analyze and understand these similarities and also point to some differences. Finally, we show how adversarial training behaves differently from other regularization methods when estimating overparameterized models (i.e., models with more parameters than datapoints). It minimizes a sum of three terms which regularizes the solution, but unlike lasso and ridge regression, it can sharply transition into an interpolation mode. We show that for sufficiently many features or sufficiently small regularization parameters, the learned model perfectly interpolates the training data while still exhibiting good out-of-sample performance.
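One way to see the link to the lasso is the standard dual-norm identity for the worst-case squared error of a linear predictor under an $\ell_\infty$-bounded input perturbation: it equals $(|y - x^\top\beta| + \varepsilon\|\beta\|_1)^2$. The sketch below checks this identity numerically against a brute-force search over the corners of the perturbation box. It is an illustration under these stated assumptions, not the paper's own formulation; the function names and the example numbers are hypothetical.

```python
from itertools import product


def adversarial_squared_loss_linf(x, y, beta, eps):
    """Closed-form worst-case squared error for ||delta||_inf <= eps.

    Uses the dual-norm identity: the residual can be pushed away from
    zero by at most eps * ||beta||_1.
    """
    residual = y - sum(xi * bi for xi, bi in zip(x, beta))
    l1_norm = sum(abs(bi) for bi in beta)
    return (abs(residual) + eps * l1_norm) ** 2


def brute_force_linf(x, y, beta, eps):
    """Worst-case squared error found by enumerating box corners.

    The squared error is convex in the perturbation, so its maximum over
    the box is attained at a corner (each coordinate at +eps or -eps).
    """
    worst = 0.0
    for signs in product((-1.0, 1.0), repeat=len(x)):
        perturbed = [xi + eps * s for xi, s in zip(x, signs)]
        residual = y - sum(pi * bi for pi, bi in zip(perturbed, beta))
        worst = max(worst, residual * residual)
    return worst


# Hypothetical example values.
x, y, beta, eps = [1.0, -2.0, 0.5], 0.3, [0.4, -1.0, 2.0], 0.1
closed_form = adversarial_squared_loss_linf(x, y, beta, eps)
enumerated = brute_force_linf(x, y, beta, eps)
```

The $\ell_1$ penalty appearing inside the loss is what suggests sparsity-inducing behavior similar to the lasso; replacing the $\ell_\infty$ ball with an $\ell_2$ ball would give an $\ell_2$-norm term instead, by the same dual-norm argument.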
Replacements for Thu, 26 May 22
 [7] arXiv:2003.03886 (replaced) [pdf, other]

Title: Divided Differences, Falling Factorials, and Discrete Splines: Another Look at Trend Filtering and Related Problems
Authors: Ryan J. Tibshirani
Comments: 75 pages, 9 figures; 1 table
Subjects: Statistics Theory (math.ST); Numerical Analysis (math.NA); Methodology (stat.ME)
 [8] arXiv:2003.13208 (replaced) [pdf, other]

Title: Minimax optimality of permutation tests
Comments: Typo in Eq.(38) is fixed
Subjects: Statistics Theory (math.ST)
 [9] arXiv:2008.09787 (replaced) [pdf, ps, other]

Title: Approximation of probability density functions via location-scale finite mixtures in Lebesgue spaces
Comments: To appear in Communications in Statistics - Theory and Methods
Subjects: Statistics Theory (math.ST)
 [10] arXiv:2106.09387 (replaced) [pdf, other]

Title: Taming Nonconvexity in Kernel Feature Selection - Favorable Properties of the Laplace Kernel
Comments: 26 pages main text; 74 pages total; appendix rewritten (typo fixed; proof structure reorganized)
Subjects: Statistics Theory (math.ST); Methodology (stat.ME); Machine Learning (stat.ML)
 [11] arXiv:2107.01120 (replaced) [pdf, ps, other]

Title: Asymptotic Analysis of Statistical Estimators related to MultiGraphex Processes under Misspecification
Subjects: Statistics Theory (math.ST)
 [12] arXiv:2104.08279 (replaced) [pdf, other]

Title: Testing for Outliers with Conformal p-values
Comments: Revision May 24, 2022: added "asymptotic" and "Monte Carlo" conditional calibration methods; added power analyses; updated numerical experiments to include new methods
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Machine Learning (stat.ML)
 [13] arXiv:2111.15546 (replaced) [pdf, ps, other]

Title: Black box tests for algorithmic stability
Comments: 26 pages. Updates to Section 2.1.1 and Sections B.1 & B.2
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST)
 [14] arXiv:2201.11211 (replaced) [pdf, other]

Title: Learning Mixtures of Linear Dynamical Systems
Comments: Accepted to ICML 2022. arXiv v2 update: add references and experiments
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Systems and Control (eess.SY); Statistics Theory (math.ST)