Statistical significance testing for mixed priors: a combined Bayesian and frequentist analysis

Robnik, Jakob; Seljak, Uroš

doi:10.3390/e24101328

(Submitted on 14 Jul 2022)

Abstract: In many hypothesis testing applications, we have mixed priors, with well-motivated informative priors for some parameters but not for others. The Bayesian methodology uses the Bayes factor and is helpful for the informative priors, as it incorporates Occam's razor via the multiplicity or trials factor in the Look Elsewhere Effect. However, if the prior is not known completely, the frequentist hypothesis test via the false positive rate is a better approach, as it is less sensitive to the prior choice. We argue that when only partial prior information is available, it is best to combine the two methodologies by using the Bayes factor as a test statistic in the frequentist analysis. We show that the standard frequentist likelihood-ratio test statistic corresponds to the Bayes factor with a non-informative Jeffreys prior. We also show that mixed priors increase the statistical power in frequentist analyses over the likelihood ratio test statistic. Using a statistical mechanics approach to hypothesis testing in Bayesian and frequentist statistics, we develop an analytic formalism that does not require expensive simulations. We introduce the counting of states in a continuous parameter space using the uncertainty volume as the quantum of the state. We show that both the p-value and Bayes factor can be expressed as energy versus entropy competition. We present analytic expressions that generalize Wilks' theorem beyond its usual regime of validity and work in a non-asymptotic regime. In specific limits, the formalism reproduces existing expressions, such as the p-value of linear models and periodograms. We apply the formalism to an example of exoplanet transits, where multiplicity can be more than $10^7$. We show that our analytic expressions reproduce the p-values derived from numerical simulations.
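
The role of the trials factor mentioned in the abstract can be illustrated with a minimal sketch. It is not the analytic formalism of the paper: it assumes the textbook independent-trials approximation, in which a local p-value is inflated by the number of effectively independent trials, and it takes the local p-value from Wilks' theorem for one extra free parameter. The function names `wilks_local_p` and `global_p_value` are hypothetical and chosen only for this illustration.

```python
import math


def wilks_local_p(q: float) -> float:
    """Local p-value for a likelihood-ratio statistic q.

    Assumes Wilks' theorem with one extra free parameter, so q follows a
    chi-squared distribution with 1 degree of freedom under the null.
    The tail probability of chi^2_1 is erfc(sqrt(q / 2)).
    """
    if q < 0:
        raise ValueError("q must be non-negative")
    return math.erfc(math.sqrt(q / 2.0))


def global_p_value(local_p: float, n_trials: float) -> float:
    """Probability that at least one of n_trials independent trials
    reaches a local p-value of local_p or smaller under the null.

    Evaluates 1 - (1 - local_p)**n_trials with log1p/expm1 so that the
    result stays accurate for tiny local_p and very large n_trials.
    """
    if not 0.0 <= local_p <= 1.0:
        raise ValueError("local_p must lie in [0, 1]")
    if local_p == 1.0:
        return 1.0
    return -math.expm1(n_trials * math.log1p(-local_p))


if __name__ == "__main__":
    # Assumed illustrative numbers: a statistic of q = 25 searched over
    # 10^7 independent trials (the multiplicity scale quoted in the abstract).
    p_local = wilks_local_p(25.0)
    p_global = global_p_value(p_local, 1e7)
    print(f"local p-value:  {p_local:.3e}")
    print(f"global p-value: {p_global:.3e}")
```

Under these assumptions, a detection that looks highly significant locally can become unremarkable once a multiplicity of order $10^7$ is accounted for, which is the effect the abstract attributes to the Look Elsewhere Effect.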

Comments:	15 pages, 7 figures
Subjects:	Data Analysis, Statistics and Probability (physics.data-an); Earth and Planetary Astrophysics (astro-ph.EP)
DOI:	10.3390/e24101328
Cite as:	arXiv:2207.06784 [physics.data-an]
	(or arXiv:2207.06784v1 [physics.data-an] for this version)

Submission history

From: Jakob Robnik
[v1] Thu, 14 Jul 2022 09:47:53 GMT (1097kb,D)
