# Statistics Theory

## New submissions

[ total of 14 entries: 1-14 ]

### New submissions for Thu, 11 Aug 22

[1]
Title: Fisher transformation via Edgeworth expansion
Authors: Jan Vrbik
Subjects: Statistics Theory (math.ST)

We show how to calculate individual terms of the Edgeworth series to approximate the distribution of the Pearson correlation coefficient with the help of a simple Mathematica program. We also demonstrate how to eliminate the corresponding skewness, thus making the approximation substantially more accurate. This leads, in a rather natural way, to deriving a more accurate version of Fisher's z transformation. The code can be easily modified to deal with any sample statistic defined as a function of several sample means, based on an independent random sample from a multivariate distribution.
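For orientation, the classical Fisher z transformation that this abstract takes as its starting point can be sketched as follows. This is the textbook version ($z = \operatorname{artanh}(r)$ with approximate standard error $1/\sqrt{n-3}$), not the refined transformation derived in the paper; the function names and the default 95% critical value are illustrative assumptions.

```python
import math


def fisher_z(r):
    """Classical Fisher z transformation of a sample correlation r in (-1, 1)."""
    return math.atanh(r)


def correlation_confidence_interval(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation via the classical z.

    Assumes a bivariate normal sample of size n > 3 and uses the usual
    approximate standard error 1 / sqrt(n - 3) on the z scale.
    z_crit defaults to the two-sided 95% normal critical value.
    """
    if n <= 3:
        raise ValueError("n must exceed 3 for the standard error to be defined")
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)
    # Build the interval on the z scale, then map back with tanh.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    low, high = correlation_confidence_interval(0.5, 50)
    print(f"z = {fisher_z(0.5):.4f}, interval = ({low:.3f}, {high:.3f})")
```

Because the interval is built on the z scale and mapped back with `tanh`, its endpoints always stay inside $(-1, 1)$.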

[2]
Title: Dispersion Parameter Extension of Precise Generalized Linear Mixed Model Asymptotics
Subjects: Statistics Theory (math.ST)

We extend a recently established asymptotic normality theorem for generalized linear mixed models to include the dispersion parameter. The new results show that the maximum likelihood estimators of all model parameters have asymptotically normal distributions with asymptotic mutual independence between fixed effects, random effects covariance and dispersion parameters. The dispersion parameter maximum likelihood estimator has a particularly simple asymptotic distribution which enables straightforward valid likelihood-based inference.

[3]
Title: Trace Moments of the Sample Covariance Matrix with Graph-Coloring
Authors: Ben Deitmar
Subjects: Statistics Theory (math.ST)

Let $S_{p,n}$ denote the sample covariance matrix based on $n$ independent identically distributed $p$-dimensional random vectors in the null case. The main result of this paper is an expansion of trace moments and power-trace covariances of $S_{p,n}$ simultaneously for both high- and low-dimensional data. To this end we develop a graph-theoretic ansatz that describes trace moments as weighted sums over colored graphs. Specifically, explicit formulas for the highest order coefficients in the expansion are deduced by restricting attention to graphs with either no cycle or exactly one cycle. The novelty is a color-preserving decomposition of graphs into a tree-structure and their seed graphs, which allows for the identification of Euler circuits from graphs with the same tree-structure but different seed graphs. This approach may also be used to approximate the mean and covariance to even higher degrees of accuracy.
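To make the notion of a trace moment concrete, here is a small Monte Carlo sketch for $\mathbb{E}[\operatorname{tr}(S_{p,n}^2)]$. It rests on assumptions not stated in the abstract: Gaussian data with mean zero and identity covariance, and the uncentered normalization $S_{p,n} = \frac{1}{n}\sum_{k=1}^{n} x_k x_k^\top$. Under those assumptions the standard Wishart second-moment identity gives $\mathbb{E}[\operatorname{tr}(S_{p,n}^2)] = p + p(p+1)/n$. The paper's graph-coloring expansion is not implemented here, and all function names are hypothetical.

```python
import random


def trace_of_squared_sample_covariance(p, n, rng):
    """Draw n i.i.d. N(0, I_p) vectors and return tr(S^2) for S = (1/n) sum x x^T."""
    xs = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Entry (i, j) of the uncentered sample covariance matrix.
    s = [[sum(x[i] * x[j] for x in xs) / n for j in range(p)] for i in range(p)]
    # S is symmetric, so tr(S^2) equals the sum of squared entries.
    return sum(s[i][j] ** 2 for i in range(p) for j in range(p))


def monte_carlo_second_trace_moment(p, n, reps=20000, seed=0):
    """Monte Carlo estimate of E[tr(S^2)] under the assumptions above."""
    rng = random.Random(seed)
    total = sum(trace_of_squared_sample_covariance(p, n, rng) for _ in range(reps))
    return total / reps


def exact_second_trace_moment(p, n):
    """Wishart identity: E[tr(S^2)] = p + p(p + 1)/n for Gaussian null-case data."""
    return p + p * (p + 1) / n


if __name__ == "__main__":
    p, n = 3, 5
    print("simulated:", monte_carlo_second_trace_moment(p, n))
    print("exact:    ", exact_second_trace_moment(p, n))
```

The correction term $p(p+1)/n$ does not vanish when $p$ grows with $n$, which is one reason expansions valid in both high- and low-dimensional regimes are of interest.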

### Cross-lists for Thu, 11 Aug 22

[4]  arXiv:2208.05344 (cross-list from econ.EM)
Title: Testing for error invariance in separable instrumental variable models
Subjects: Econometrics (econ.EM); Statistics Theory (math.ST)

The hypothesis of error invariance is central to the instrumental variable literature. It means that the error term of the model is the same across all potential outcomes. In other words, this assumption signifies that treatment effects are constant across all subjects. It allows instrumental variable estimates to be interpreted as average treatment effects over the whole population of the study. When this assumption does not hold, the bias of instrumental variable estimators can be larger than that of naive estimators ignoring endogeneity. This paper develops two tests for the assumption of error invariance when the treatment is endogenous, an instrumental variable is available and the model is separable. The first test assumes that the potential outcomes are linear in the regressors and is computationally simple. The second test is nonparametric and relies on Tikhonov regularization. The treatment can be either discrete or continuous. We show that the tests have asymptotically correct level and asymptotic power equal to one against a range of alternatives. Simulations demonstrate that the proposed tests attain excellent finite-sample performance. The methodology is also applied to the evaluation of returns to schooling and the effect of price on demand in a fish market.

[5]  arXiv:2208.05406 (cross-list from cs.LG)
Title: Active Sampling of Multiple Sources for Sequential Estimation
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST)

Consider $K$ processes, each generating a sequence of independent and identically distributed random variables. The probability measures of these processes have random parameters that must be estimated. Specifically, they share a parameter $\theta$ common to all probability measures. Additionally, each process $i\in\{1, \dots, K\}$ has a private parameter $\alpha_i$. The objective is to design an active sampling algorithm for sequentially estimating these parameters in order to form reliable estimates for all shared and private parameters with the fewest samples. This sampling algorithm has three key components: (i) data-driven sampling decisions, which dynamically specify over time which of the $K$ processes should be selected for sampling; (ii) a stopping time for the process, which specifies when the accumulated data is sufficient to form reliable estimates and terminate the sampling process; and (iii) estimators for all shared and private parameters. Because sequential estimation is known to be analytically intractable, this paper adopts *conditional* estimation cost functions, leading to a sequential estimation approach that was recently shown to render tractable analysis. Asymptotically optimal decision rules (sampling, stopping, and estimation) are delineated, and numerical experiments are provided to compare the efficacy and quality of the proposed procedure with those of the relevant approaches.

### Replacements for Thu, 11 Aug 22

[6]  arXiv:2006.02397 (replaced)
Title: One Step to Efficient Synthetic Data
Subjects: Statistics Theory (math.ST); Cryptography and Security (cs.CR); Computation (stat.CO)

[7]  arXiv:2012.07167 (replaced)
Title: Pseudo-likelihood-based $M$-estimation of random graphs with dependent edges and parameter vectors of increasing dimension
Subjects: Statistics Theory (math.ST)

[8]  arXiv:2109.02690 (replaced)
Title: Estimating nuisance parameters often reduces the variance (with consistent variance estimation)
Authors: Judith J. Lok
Comments: 43 pages, and supplementary material
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

[9]  arXiv:2110.03874 (replaced)
Title: Uncertainty quantification in the Bradley-Terry-Luce model
Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)

[10]  arXiv:2205.08010 (replaced)
Title: The e-value and the Full Bayesian Significance Test: Logical Properties and Philosophical Consequences
Subjects: Statistics Theory (math.ST)

[11]  arXiv:2208.04184 (replaced)
Title: A Gaussian model for survival data subject to dependent censoring and confounding
Subjects: Statistics Theory (math.ST)

[12]  arXiv:2107.05824 (replaced)
Title: Covariance's Loss is Privacy's Gain: Computationally Efficient, Private and Accurate Synthetic Data
Subjects: Cryptography and Security (cs.CR); Probability (math.PR); Statistics Theory (math.ST)

[13]  arXiv:2203.00521 (replaced)
Title: A Transformational Characterization of Unconditionally Equivalent Bayesian Networks
Comments: 12 pages, 1 figure. Accepted for publication at the 11th International Conference on Probabilistic Graphical Models (PGM 2022)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Combinatorics (math.CO); Statistics Theory (math.ST)

[14]  arXiv:2207.10855 (replaced)
Title: Graph-Based Tests for Multivariate Covariate Balance Under Multi-Valued Treatments
Authors: Eric A. Dunipace
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)