Statistics Theory

New submissions

[ total of 31 entries: 1-31 ]

New submissions for Tue, 7 Feb 23

[1]  arXiv:2302.02247 [pdf, ps, other]
Title: Spectral Density Estimation of Function-Valued Spatial Processes
Comments: 84 pages, 0 figures
Subjects: Statistics Theory (math.ST)

The spectral density function describes the second-order properties of a stationary stochastic process on $\mathbb{R}^d$. This paper considers the nonparametric estimation of the spectral density of a continuous-time stochastic process taking values in a separable Hilbert space. Our estimator is based on kernel smoothing and can be applied to a wide variety of spatial sampling schemes including those in which data are observed at irregular spatial locations. Thus, it finds immediate applications in Spatial Statistics, where irregularly sampled data naturally arise. The rates for the bias and variance of the estimator are obtained under general conditions in a mixed-domain asymptotic setting. When the data are observed on a regular grid, the optimal rate of the estimator matches the minimax rate for the class of covariance functions that decay according to a power law. The asymptotic normality of the spectral density estimator is also established under general conditions for Gaussian Hilbert-space valued processes. Finally, with a view towards practical applications the asymptotic results are specialized to the case of discretely-sampled functional data in a reproducing kernel Hilbert space.

[2]  arXiv:2302.02415 [pdf, ps, other]
Title: On Kronecker Separability of Multiway Covariance
Comments: 15 pages
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

Multiway data analysis is aimed at inferring patterns from data represented as a multi-dimensional array. Estimating covariance from multiway data is a fundamental statistical task; however, the intrinsic high dimensionality poses significant statistical and computational challenges. Recently, several factorized covariance models, paired with estimation algorithms, have been proposed to circumvent these obstacles. Despite several promising results on the algorithmic front, it remains under-explored whether and when such a model is valid. To address this question, we define the notion of Kronecker-separable multiway covariance, which can be written as a sum of $r$ tensor products of mode-wise covariances. The question of whether a given covariance can be represented as a separable multiway covariance is then reduced to an equivalent question about separability of quantum states. Using this equivalence, it follows directly that a generic multiway covariance tends to be non-separable (even if $r \to \infty$), and moreover, finding its best separable approximation is NP-hard. These observations imply that factorized covariance models are restrictive and should be used only when there is a compelling rationale for such a model.

[3]  arXiv:2302.02482 [pdf, other]
Title: Continuously Indexed Graphical Models
Subjects: Statistics Theory (math.ST); Probability (math.PR); Methodology (stat.ME)

Let $X = \{X_{u}\}_{u \in U}$ be a real-valued Gaussian process indexed by a set $U$. It can be thought of as an undirected graphical model with every random variable $X_{u}$ serving as a vertex. We characterize this graph in terms of the covariance of $X$ through its reproducing kernel property. Unlike other characterizations in the literature, our characterization does not restrict the index set $U$ to be finite or countable, and hence can be used to model the intrinsic dependence structure of stochastic processes in continuous time/space. Consequently, the said characterization is not (and apparently cannot be) of the inverse-zero type. This poses novel challenges for the problem of recovery of the dependence structure from a sample of independent realizations of $X$, also known as structure estimation. We propose a methodology that circumvents these issues, by targeting the recovery of the underlying graph up to a finite resolution, which can be arbitrarily fine and is limited only by the available sample size. The recovery is shown to be consistent so long as the graph is sufficiently regular in an appropriate sense, and convergence rates are provided. Our methodology is illustrated by simulation and two data analyses.

[4]  arXiv:2302.02497 [pdf, other]
Title: High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Machine Learning (cs.LG); Probability (math.PR); Machine Learning (stat.ML)

In location estimation, we are given $n$ samples from a known distribution $f$ shifted by an unknown translation $\lambda$, and want to estimate $\lambda$ as precisely as possible. Asymptotically, the maximum likelihood estimate achieves the Cram\'er-Rao bound of error $\mathcal N(0, \frac{1}{n\mathcal I})$, where $\mathcal I$ is the Fisher information of $f$. However, the $n$ required for convergence depends on $f$, and may be arbitrarily large. We build on the theory using \emph{smoothed} estimators to bound the error for finite $n$ in terms of $\mathcal I_r$, the Fisher information of the $r$-smoothed distribution. As $n \to \infty$, $r \to 0$ at an explicit rate and this converges to the Cram\'er-Rao bound. We (1) improve the prior work for 1-dimensional $f$ to converge for constant failure probability in addition to high probability, and (2) extend the theory to high-dimensional distributions. In the process, we prove a new bound on the norm of a high-dimensional random variable whose 1-dimensional projections are subgamma, which may be of independent interest.

[5]  arXiv:2302.02544 [pdf, other]
Title: Sequential change detection via backward confidence sequences
Comments: 24 pages, 10 figures
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)

We present a simple reduction from sequential estimation to sequential changepoint detection (SCD). In short, suppose we are interested in detecting changepoints in some parameter or functional $\theta$ of the underlying distribution. We demonstrate that if we can construct a confidence sequence (CS) for $\theta$, then we can also successfully perform SCD for $\theta$. This is accomplished by checking if two CSs -- one forwards and the other backwards -- ever fail to intersect. Since the literature on CSs has been rapidly evolving recently, the reduction provided in this paper immediately solves several old and new change detection problems. Further, our "backward CS", constructed by reversing time, is new and potentially of independent interest. We provide strong nonasymptotic guarantees on the frequency of false alarms and detection delay, and demonstrate numerical effectiveness on several problems.
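
The forward/backward check described in this abstract can be illustrated with a minimal sketch. This is not the authors' construction: it assumes observations bounded in [0, 1], targets the mean, and uses a crude Hoeffding confidence sequence with a union bound over time. The function names (`hoeffding_radius`, `confidence_sequence`, `detect_change`) are hypothetical and chosen for illustration only.

```python
import math


def hoeffding_radius(n, alpha):
    # Hoeffding interval at sample size n with error budget alpha / (n (n + 1)).
    # These budgets sum to alpha over all n, so the intervals hold simultaneously.
    return math.sqrt(math.log(2.0 * n * (n + 1) / alpha) / (2.0 * n))


def confidence_sequence(xs, alpha):
    """Running intersection of Hoeffding intervals for the mean of [0, 1] data."""
    lo, hi, total = 0.0, 1.0, 0.0
    for n, x in enumerate(xs, start=1):
        total += x
        mean, r = total / n, hoeffding_radius(n, alpha)
        lo, hi = max(lo, mean - r), min(hi, mean + r)
    return lo, hi  # lo > hi means the running intersection is already empty


def detect_change(xs, alpha=0.05):
    """Return the first time t at which the forward and backward CSs are disjoint.

    Forward CS: built from x_1, ..., x_t in their original order.
    Backward CS: built from x_t, x_{t-1}, ..., x_1 (time reversed).
    Returns None if no alarm is raised. Cost is O(T^2); fine for a sketch.
    """
    for t in range(1, len(xs) + 1):
        window = xs[:t]
        f_lo, f_hi = confidence_sequence(window, alpha)
        b_lo, b_hi = confidence_sequence(window[::-1], alpha)
        if max(f_lo, b_lo) > min(f_hi, b_hi):
            return t
    return None


if __name__ == "__main__":
    data = [0.1] * 100 + [0.9] * 100  # mean shifts after observation 100
    print("alarm at t =", detect_change(data))
```

With no change in the data, the forward and backward sequences cover the same mean and keep overlapping; after a shift, the backward sequence concentrates on the new mean while the forward one is still anchored to the old one, so the two eventually separate.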

[6]  arXiv:2302.02613 [pdf, ps, other]
Title: An asymptotic behavior of a finite-section of the optimal causal filter
Authors: Junho Yang
Subjects: Statistics Theory (math.ST)

We derive an $L_1$-bound between the coefficients of the optimal causal filter applied to the data-generating process and its approximation based on finite sample observations. Here, we assume that the data-generating process is second-order stationary with either short or long memory autocovariances. To obtain the $L_1$-bound, we first provide an exact expression of the causal filter coefficients and their approximation in terms of the absolutely convergent series of the multistep ahead infinite and finite predictor coefficients, respectively. Then, we prove a so-called uniform-type Baxter's inequality to obtain a bound for the difference between the two multistep ahead predictor coefficients (under both short and long memory time series). The $L_1$-approximation error bound of the causal filter coefficients can be used to evaluate the quality of the predictions of time series through the mean squared error criterion.

[7]  arXiv:2302.02954 [pdf, other]
Title: Maximum likelihood estimator for skew Brownian motion: the convergence rate
Subjects: Statistics Theory (math.ST); Probability (math.PR)

We give a thorough description of the asymptotic property of the maximum likelihood estimator (MLE) of the skewness parameter of a Skew Brownian Motion (SBM). Thanks to recent results on the Central Limit Theorem of the rate of convergence of estimators for the SBM, we prove a conjecture left open that the MLE has asymptotically a mixed normal distribution involving the local time with a rate of convergence of order $1/4$. We also give a series expansion of the MLE and study the asymptotic behavior of the score and its derivatives, as well as their variation with the skewness parameter. In particular, we exhibit a specific behavior when the SBM is actually a Brownian motion, and quantify the explosion of the coefficients of the expansion when the skewness parameter is close to $-1$ or $1$.

Cross-lists for Tue, 7 Feb 23

[8]  arXiv:2302.02200 (cross-list from math.CO) [pdf, other]
Title: Rank-based linkage I: triplet comparisons and oriented simplicial complexes
Comments: 37 pages, 12 figures
Subjects: Combinatorics (math.CO); Statistics Theory (math.ST)

Rank-based linkage is a new tool for summarizing a collection $S$ of objects according to their relationships. These objects are not mapped to vectors, and ``similarity'' between objects need be neither numerical nor symmetrical. All an object needs to do is rank nearby objects by similarity to itself, using a Comparator which is transitive, but need not be consistent with any metric on the whole set. Call this a ranking system on $S$. Rank-based linkage is applied to the $K$-nearest neighbor digraph derived from a ranking system. Computations occur on a 2-dimensional abstract oriented simplicial complex whose faces are among the points, edges, and triangles of the line graph of the undirected $K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an edge-weighted linkage graph $(S, \mathcal{L}, \sigma)$ where $\sigma(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take $\mathcal{L}_t$ to be the links whose in-sway is at least $t$, and partition $S$ into components of the graph $(S, \mathcal{L}_t)$, for varying $t$. Rank-based linkage is a functor from a category of out-ordered digraphs to a category of partitioned sets, with the practical consequence that augmenting the set of objects in a rank-respectful way gives a fresh clustering which does not ``rip apart'' the previous one. The same holds for single linkage clustering in the metric space context, but not for typical optimization-based methods. Open combinatorial problems are presented in the last section.

[9]  arXiv:2302.02254 (cross-list from stat.CO) [pdf, other]
Title: Getting to "rate-optimal" in ranking & selection
Journal-ref: Proceedings of the 2021 Winter Simulation Conference
Subjects: Computation (stat.CO); Statistics Theory (math.ST)

In their 2004 seminal paper, Glynn and Juneja formally and precisely established the rate-optimal, probability-of-incorrect-selection, replication allocation scheme for selecting the best of k simulated systems. In the case of independent, normally distributed outputs this allocation has a simple form that depends in an intuitively appealing way on the true means and variances. Of course the means and (typically) variances are unknown, but the rate-optimal allocation provides a target for implementable, dynamic, data-driven policies to achieve. In this paper we compare the empirical behavior of four related replication-allocation policies: mCEI from Chen and Ryzhov and our new gCEI policy that both converge to the Glynn and Juneja allocation; AOMAP from Peng and Fu that converges to the OCBA optimal allocation; and TTTS from Russo that targets the rate of convergence of the posterior probability of incorrect selection. We find that these policies have distinctly different behavior in some settings.

[10]  arXiv:2302.02486 (cross-list from stat.ME) [pdf, other]
Title: The Difference-of-Log-Normals Distribution: Properties, Estimation, and Growth
Authors: Robert Parham
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); General Finance (q-fin.GN)

This paper describes the Difference-of-Log-Normals (DLN) distribution. A companion paper makes the case that the DLN is a fundamental distribution in nature, and shows how a simple application of the CLT gives rise to the DLN in many disparate phenomena. Here, I characterize its PDF, CDF, moments, and parameter estimators; generalize it to N-dimensions using spherical distribution theory; describe methods to deal with its signature ``double-exponential'' nature; and use it to generalize growth measurement to possibly-negative variates distributing DLN. I also conduct Monte-Carlo experiments to establish some properties of the estimators and measures described.
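
As a rough illustration of the kind of Monte-Carlo experiment mentioned here, the sketch below draws from a difference of two log-normal variates and compares the sample mean with the closed-form mean. It assumes the two log-normal components are independent, which is a simplification made for this example and may not match the paper's parametrization; the function names `sample_dln` and `dln_mean` are hypothetical.

```python
import math
import random


def sample_dln(n, mu1, sigma1, mu2, sigma2, seed=0):
    """Draw n values of exp(N1) - exp(N2) with N1, N2 independent normals.

    Independence of the two components is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [
        rng.lognormvariate(mu1, sigma1) - rng.lognormvariate(mu2, sigma2)
        for _ in range(n)
    ]


def dln_mean(mu1, sigma1, mu2, sigma2):
    # The mean of a log-normal is exp(mu + sigma^2 / 2); the mean of the
    # difference follows by linearity of expectation.
    return math.exp(mu1 + sigma1 ** 2 / 2) - math.exp(mu2 + sigma2 ** 2 / 2)


if __name__ == "__main__":
    draws = sample_dln(100_000, 0.0, 0.5, -0.5, 0.5)
    print("sample mean:", sum(draws) / len(draws))
    print("exact mean: ", dln_mean(0.0, 0.5, -0.5, 0.5))
    print("share of negative draws:", sum(d < 0 for d in draws) / len(draws))
```

Unlike a single log-normal, the difference takes both positive and negative values, which is why growth measures defined only for positive variates need the generalization the abstract refers to.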

[11]  arXiv:2302.02774 (cross-list from stat.ML) [pdf, other]
Title: The SSL Interplay: Augmentations, Inductive Bias, and Generalization
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Statistics Theory (math.ST)

Self-supervised learning (SSL) has emerged as a powerful framework to learn representations from raw data without supervision. Yet in practice, engineers face issues such as instability in tuning optimizers and collapse of representations during training. Such challenges motivate the need for a theory to shed light on the complex interplay between the choice of data augmentation, network architecture, and training algorithm. We study such an interplay with a precise analysis of generalization performance on both pretraining and downstream tasks in a theory friendly setup, and highlight several insights for SSL practitioners that arise from our theory.

[12]  arXiv:2302.02988 (cross-list from cs.LG) [pdf, other]
Title: Asymptotically Minimax Optimal Fixed-Budget Best Arm Identification for Expected Simple Regret Minimization
Subjects: Machine Learning (cs.LG); Econometrics (econ.EM); Statistics Theory (math.ST); Methodology (stat.ME); Machine Learning (stat.ML)

We investigate fixed-budget best arm identification (BAI) for expected simple regret minimization. In each round of an adaptive experiment, a decision maker draws one of multiple treatment arms based on past observations and subsequently observes the outcomes of the chosen arm. After the experiment, the decision maker recommends a treatment arm with the highest projected outcome. We evaluate this decision in terms of the expected simple regret, a difference between the expected outcomes of the best and recommended treatment arms. Due to the inherent uncertainty, we evaluate the regret using the minimax criterion. For distributions with fixed variances (location-shift models), such as Gaussian distributions, we derive asymptotic lower bounds for the worst-case expected simple regret. Then, we show that the Random Sampling (RS)-Augmented Inverse Probability Weighting (AIPW) strategy proposed by Kato et al. (2022) is asymptotically minimax optimal in the sense that the leading factor of its worst-case expected simple regret asymptotically matches our derived worst-case lower bound. Our result indicates that, for location-shift models, the optimal RS-AIPW strategy draws treatment arms with varying probabilities based on their variances. This result contrasts with the results of Bubeck et al. (2011), which shows that drawing each treatment arm with an equal ratio is minimax optimal in a bounded outcome setting.

Replacements for Tue, 7 Feb 23

[13]  arXiv:2109.02959 (replaced) [pdf, other]
Title: Fast approximations of pseudo-observations in the context of right-censoring and interval-censoring
Authors: Olivier Bouaziz (MAP5 - UMR 8145)
Subjects: Statistics Theory (math.ST)
[14]  arXiv:2207.00357 (replaced) [pdf, other]
Title: Efficient parameter estimation for parabolic SPDEs based on a log-linear model for realized volatilities
Subjects: Statistics Theory (math.ST)
[15]  arXiv:2207.08038 (replaced) [pdf, other]
Title: A Singular Woodbury and Pseudo-Determinant Matrix Identities and Application to Gaussian Process Regression
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG); Numerical Analysis (math.NA); Computation (stat.CO)
[16]  arXiv:2209.05153 (replaced) [pdf, other]
Title: The test of exponentiality based on the mean residual life function revisited
Authors: Bruno Ebner
Comments: 16 pages, 1 figure, 5 tables
Subjects: Statistics Theory (math.ST)
[17]  arXiv:2209.07791 (replaced) [pdf, ps, other]
Title: Maximum likelihood estimation and prediction error for a Matérn model on the circle
Authors: Sébastien Petit (L2S, LNE )
Subjects: Statistics Theory (math.ST)
[18]  arXiv:2101.02094 (replaced) [pdf, ps, other]
Title: Bernstein-Type Bounds for Beta Distribution
Authors: Maciej Skorski
Comments: major revision - fixed a mistake in the proof
Subjects: Probability (math.PR); Statistics Theory (math.ST); Applications (stat.AP)
[19]  arXiv:2109.09367 (replaced) [pdf, other]
Title: Extending Bootstrap AMG for Clustering of Attributed Graphs
Comments: 32 pages, 12 figures, preprint
Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA); Statistics Theory (math.ST)
[20]  arXiv:2111.03289 (replaced) [pdf, ps, other]
Title: Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs
Comments: accepted to neurips'22
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
[21]  arXiv:2202.03835 (replaced) [pdf, other]
Title: A covariant, discrete time-frequency representation tailored for zero-based signal detection
Comments: Accepted for publication in IEEE Transactions on Signal Processing on May 26, 2022
Subjects: Signal Processing (eess.SP); Statistics Theory (math.ST); Methodology (stat.ME)
[22]  arXiv:2204.08964 (replaced) [pdf, other]
Title: Adaptive measurement filter: efficient strategy for optimal estimation of quantum Markov chains
Comments: 25 pages, 7 figures
Subjects: Quantum Physics (quant-ph); Mathematical Physics (math-ph); Statistics Theory (math.ST)
[23]  arXiv:2206.02659 (replaced) [pdf, other]
Title: Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
Comments: 36 pages, 5 figures, 8 tables (Fixed typos). ICML 2022
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Machine Learning (stat.ML)
[24]  arXiv:2206.14275 (replaced) [pdf, other]
Title: Dynamic CoVaR Modeling
Subjects: Econometrics (econ.EM); Statistics Theory (math.ST); Risk Management (q-fin.RM); Methodology (stat.ME)
[25]  arXiv:2207.02287 (replaced) [pdf, other]
Title: Branching Processes in Random Environments with Thresholds
Comments: 47 pages, 3 figures, 5 tables
Subjects: Probability (math.PR); Statistics Theory (math.ST)
[26]  arXiv:2207.14088 (replaced) [pdf, other]
Title: On the Sequential Probability Ratio Test in Hidden Markov Models
Comments: 28 pages, 10 figures, submitted to CONCUR 2022
Subjects: Probability (math.PR); Logic in Computer Science (cs.LO); Statistics Theory (math.ST)
[27]  arXiv:2208.00959 (replaced) [pdf, other]
Title: HUG model: an interaction point process for Bayesian detection of multiple sources in groundwaters from hydrochemical data
Authors: Christophe Reype (IECL, PASTA), Radu S. Stoica (IECL, PASTA), Antonin Richard, Madalina Deaconu (IECL, PASTA)
Subjects: Applications (stat.AP); Statistics Theory (math.ST); Methodology (stat.ME)
[28]  arXiv:2209.12651 (replaced) [pdf, other]
Title: Learning Variational Models with Unrolling and Bilevel Optimization
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
[29]  arXiv:2210.00895 (replaced) [pdf, other]
Title: On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits
Authors: Antoine Barrier (UMPA-ENSL, LMO, CELESTE), Aurélien Garivier (UMPA-ENSL, LIP), Gilles Stoltz (LMO, CELESTE)
Journal-ref: ALT 2023 - The 34th International Conference on Algorithmic Learning Theory, Feb 2023, Singapour, Singapore
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Statistics Theory (math.ST); Machine Learning (stat.ML)
[30]  arXiv:2211.14908 (replaced) [pdf, other]
Title: A Permutation-free Kernel Two-Sample Test
Comments: Published at the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), with an oral presentation
Subjects: Methodology (stat.ME); Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
[31]  arXiv:2212.09178 (replaced) [pdf, ps, other]
Title: Support Vector Regression: Risk Quadrangle Framework
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
