Statistics Theory

New submissions

[ total of 5 entries: 1-5 ]

New submissions for Fri, 24 Jan 20

[1]  arXiv:2001.08336 [pdf, other]
Title: Geometric Conditions for the Discrepant Posterior Phenomenon and Connections to Simpson's Paradox
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

The discrepant posterior phenomenon (DPP) is a counterintuitive phenomenon that can occur in the Bayesian analysis of multivariate parameters. It arises when an estimate of a marginal parameter obtained from the posterior is more extreme than both the estimate obtained from the prior alone and the estimate obtained from the likelihood alone. Inferential claims that exhibit the DPP defy intuition, and the phenomenon can be surprisingly ubiquitous in well-behaved Bayesian models. Using point estimation as an example, we derive conditions under which the DPP occurs in Bayesian models with exponential quadratic likelihoods, including Gaussian models and those with the local asymptotic normality property, paired with conjugate multivariate Gaussian priors. We also examine the DPP for the Binomial model, in which the posterior mean is not a linear combination of the means of the prior and the likelihood. We provide an intuitive geometric interpretation of the phenomenon and show that there exists a non-trivial space of marginal directions along which the DPP occurs. We further relate the phenomenon to Simpson's paradox and uncover a deep-rooted connection between the two that is associated with marginalization. We also draw connections with Bayesian computational algorithms when difficult geometry exists. Theoretical results are complemented by numerical illustrations. The scenarios covered in this study have implications for parameterization, sensitivity analysis, and prior choice in Bayesian modeling.
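
As a rough illustration of the Gaussian case described in this abstract, the sketch below computes the posterior mean for a bivariate Gaussian likelihood with a conjugate Gaussian prior and checks whether the first coordinate lands outside the range spanned by the prior mean and the likelihood-only estimate. The numbers (prior correlation 0.9, identity likelihood covariance, the particular means) and the helper names are illustrative assumptions, not values or code taken from the paper.

```python
# Toy bivariate Gaussian example (all numbers are assumptions for illustration).

def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def posterior_mean(prior_mean, prior_cov, mle, mle_cov):
    """Conjugate Gaussian update: precision-weighted average of the two means."""
    p0, p1 = inv2(prior_cov), inv2(mle_cov)
    total_precision = [[p0[i][j] + p1[i][j] for j in range(2)] for i in range(2)]
    rhs = [u + v for u, v in zip(matvec(p0, prior_mean), matvec(p1, mle))]
    return matvec(inv2(total_precision), rhs)

prior_mean = [0.0, 0.0]
prior_cov = [[1.0, 0.9], [0.9, 1.0]]   # strongly correlated prior
mle = [0.1, 2.0]                        # likelihood-only estimate
mle_cov = [[1.0, 0.0], [0.0, 1.0]]

post = posterior_mean(prior_mean, prior_cov, mle, mle_cov)
lo, hi = sorted([prior_mean[0], mle[0]])
is_discrepant = post[0] < lo or post[0] > hi
print(post, is_discrepant)
```

In this configuration the prior and the likelihood both place the first coordinate near zero, yet the posterior mean of that coordinate is pulled well above both, because the likelihood favours a large second coordinate and the prior ties the two coordinates together.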

[2]  arXiv:2001.08512 [pdf, ps, other]
Title: A precise local limit theorem for the multinomial distribution
Comments: 7 pages, 0 figures
Subjects: Statistics Theory (math.ST); Probability (math.PR)

We develop a precise local limit theorem for the multinomial distribution in which the error terms are explicit up to an order smaller than previously known results by a factor of $N^{1/2}$. We show how it can be used to approximate multinomial probabilities on most subsets of $\mathbb{R}^d$, and we describe potential applications related to asymptotic properties of Bernstein estimators on the simplex, bounds for the deficiency distance with multivariate normal experiments, and finely tuned continuity corrections.
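
For orientation, the sketch below compares exact multinomial probabilities with the classical leading-order Gaussian approximation that a local limit theorem refines. It does not implement the explicit higher-order error terms the abstract refers to; the function names, the probability vector, and the sample sizes are assumptions chosen for illustration.

```python
import math

def multinomial_log_pmf(counts, probs):
    """Exact log-probability of a multinomial outcome, via log-gamma."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for k, p in zip(counts, probs):
        out += k * math.log(p) - math.lgamma(k + 1)
    return out

def normal_log_approx(counts, probs):
    """Leading-order local normal approximation with d = len(counts) - 1:
    (2*pi*N)^(-d/2) * prod(p_i)^(-1/2) * exp(-0.5 * sum((k_i - N p_i)^2 / (N p_i)))."""
    n = sum(counts)
    d = len(counts) - 1
    log_norm = -0.5 * d * math.log(2 * math.pi * n)
    log_norm -= 0.5 * sum(math.log(p) for p in probs)
    quad = sum((k - n * p) ** 2 / (n * p) for k, p in zip(counts, probs))
    return log_norm - 0.5 * quad

probs = (0.2, 0.3, 0.5)
for n in (30, 300, 3000):
    counts = [round(n * p) for p in probs]  # evaluate at the mean
    ratio = math.exp(multinomial_log_pmf(counts, probs) - normal_log_approx(counts, probs))
    print(n, ratio)
```

The ratio of exact to approximate probability should approach 1 as $N$ grows; the size of the remaining gap is the kind of quantity that explicit error terms in a local limit theorem describe.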

Cross-lists for Fri, 24 Jan 20

[3]  arXiv:2001.08431 (cross-list from stat.ME) [pdf, other]
Title: On the Hauck-Donner Effect in Wald Tests: Detection, Tipping Points, and Parameter Space Characterization
Comments: 6 figures
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Computation (stat.CO)

The Wald test remains ubiquitous in statistical practice despite shortcomings such as its inaccuracy in small samples and lack of invariance under reparameterization. This paper examines another, lesser-known shortcoming called the Hauck--Donner effect (HDE), whereby a Wald test statistic is not monotonically increasing as a function of increasing distance between the parameter estimate and the null value. Because it results in an upward-biased $p$-value and a loss of power, the aberration can have very damaging consequences, for example in variable selection. The HDE afflicts many types of regression models and corresponds to estimates near the boundary of the parameter space. This article presents several new results, and its main contributions are to (i) propose a very general test for detecting the HDE, regardless of its underlying cause; (ii) fundamentally characterize the HDE by pairwise ratios of Wald, Rao score, and likelihood ratio test statistics for 1-parameter distributions; (iii) show that the parameter space may be partitioned into an interior encased by 5 HDE severity measures (faint, weak, moderate, strong, extreme); (iv) prove that a necessary condition for the HDE in a 2 by 2 table is a log odds ratio of at least 2; (v) give some practical guidelines about HDE-free hypothesis testing. Overall, practical post-fit tests can now potentially be conducted on any model estimated by iteratively reweighted least squares, such as the generalized linear model (GLM) and Vector GLM (VGLM) classes, the latter of which encompasses many popular regression models.
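
The non-monotonicity described in this abstract can be seen in a very small setting. The sketch below is a toy one-sample binomial example on the logit scale, testing the null value 0 (that is, success probability 0.5); it is not the detection test proposed in the paper, and the function names and sample size are assumptions. It compares the Wald statistic with the likelihood ratio statistic as the observed count moves toward the boundary.

```python
import math

def wald_stat(y, n):
    """Wald statistic for H0: logit(p) = 0, using the estimated standard error.
    Requires 0 < y < n so that the logit estimate is finite."""
    p_hat = y / n
    theta_hat = math.log(p_hat / (1 - p_hat))
    var_hat = 1.0 / (n * p_hat * (1 - p_hat))  # inverse Fisher information at the estimate
    return theta_hat ** 2 / var_hat

def lrt_stat(y, n):
    """Likelihood ratio statistic for H0: p = 0.5 (same 0 < y < n restriction)."""
    expected = n * 0.5
    return 2 * (y * math.log(y / expected) + (n - y) * math.log((n - y) / expected))

n = 100
for y in (60, 75, 90, 95, 99):
    print(y, round(wald_stat(y, n), 2), round(lrt_stat(y, n), 2))
```

As the count approaches $n$, the estimate moves further from the null value, yet the Wald statistic eventually decreases because the estimated variance grows faster than the squared estimate; the likelihood ratio statistic keeps increasing over the same range.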

Replacements for Fri, 24 Jan 20

[4]  arXiv:1910.07341 (replaced) [pdf, other]
Title: Splinets -- efficient orthonormalization of the B-splines
Subjects: Statistics Theory (math.ST); Numerical Analysis (math.NA)
[5]  arXiv:1912.12150 (replaced) [pdf, other]
Title: The Chi-Square Test of Distance Correlation
Comments: 12 pages + 8 pages appendix, 3 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST); Methodology (stat.ME)