Conditionally Calibrated Predictive Distributions by Probability-Probability Map: Application to Galaxy Redshift Estimation and Probabilistic Forecasting

Dey, Biprateep; Zhao, David; Newman, Jeffrey A.; Andrews, Brett H.; Izbicki, Rafael; Lee, Ann B.


(Submitted on 29 May 2022 (v1), last revised 17 Jul 2023 (this version, v4))

Abstract: Uncertainty quantification is crucial for assessing the predictive ability of AI algorithms. Much research has been devoted to describing the predictive distribution (PD) $F(y|\mathbf{x})$ of a target variable $y \in \mathbb{R}$ given complex input features $\mathbf{x} \in \mathcal{X}$. However, off-the-shelf PDs (from, e.g., normalizing flows and Bayesian neural networks) often lack conditional calibration, with the probability of occurrence of an event given input $\mathbf{x}$ being significantly different from the predicted probability. Current calibration methods do not fully assess and enforce conditionally calibrated PDs. Here we propose \texttt{Cal-PIT}, a method that addresses both PD diagnostics and recalibration by learning a single probability-probability map from calibration data. The key idea is to regress probability integral transform scores against $\mathbf{x}$. The estimated regression provides interpretable diagnostics of conditional coverage across the feature space. The same regression function morphs the misspecified PD to a re-calibrated PD for all $\mathbf{x}$. We benchmark our corrected prediction bands (a by-product of corrected PDs) against oracle bands and state-of-the-art predictive inference algorithms for synthetic data. We also provide results for two applications: (i) probabilistic nowcasting given sequences of satellite images, and (ii) conditional density estimation of galaxy distances given imaging data (so-called photometric redshift estimation). Our code is available as a Python package this https URL.
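
The idea of regressing probability integral transform (PIT) scores against $\mathbf{x}$ can be illustrated with a small toy sketch. This is not the \texttt{Cal-PIT} package or its API. The data-generating process, the deliberately misspecified initial PD, and every function name below are assumptions made for illustration, and a simple binned (piecewise-constant) regression stands in for whatever regression method the authors use. The sketch estimates $r(\alpha; x) = P(\mathrm{PIT} \le \alpha \mid x)$ from calibration data, then applies the same map to the initial CDF to obtain a recalibrated CDF.

```python
import bisect
import math
import random


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def initial_cdf(y, x):
    """Hypothetical misspecified PD: N(0, 1), ignoring x entirely."""
    return normal_cdf(y)


def simulate(n, rng):
    """Toy truth (an assumption): x ~ U(-1, 1), y | x ~ N(x, 0.5^2)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [rng.gauss(x, 0.5) for x in xs]
    return xs, ys


def bin_index(x, n_bins):
    """Map x in [-1, 1] to one of n_bins equal-width bins."""
    i = int((x + 1.0) / 2.0 * n_bins)
    return min(max(i, 0), n_bins - 1)


def fit_pp_map(xs, ys, n_bins=10):
    """Regress 1{PIT <= alpha} on x with a piecewise-constant regressor.

    Within each x-bin the fitted value is the empirical CDF of the PIT
    scores, which estimates r(alpha; x) = P(PIT <= alpha | x).
    """
    pits = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        pits[bin_index(x, n_bins)].append(initial_cdf(y, x))
    for vals in pits:
        vals.sort()

    def r_hat(alpha, x):
        vals = pits[bin_index(x, n_bins)]
        if not vals:  # empty bin: fall back to the identity map
            return alpha
        return bisect.bisect_right(vals, alpha) / len(vals)

    return r_hat


def recalibrated_cdf(y, x, r_hat):
    """Recalibrated CDF: the P-P map applied to the initial CDF value."""
    return r_hat(initial_cdf(y, x), x)


def coverage(cdf, xs, ys, alpha, x_min):
    """Fraction of points with x > x_min whose CDF value is <= alpha."""
    hits = [cdf(y, x) <= alpha for x, y in zip(xs, ys) if x > x_min]
    return sum(hits) / len(hits)


rng = random.Random(0)
x_cal, y_cal = simulate(20000, rng)    # calibration set
x_test, y_test = simulate(20000, rng)  # held-out set
r_hat = fit_pp_map(x_cal, y_cal)

# Diagnostic: for a conditionally calibrated PD both numbers would be near 0.5.
init_cov = coverage(initial_cdf, x_test, y_test, 0.5, 0.8)
recal_cov = coverage(
    lambda y, x: recalibrated_cdf(y, x, r_hat), x_test, y_test, 0.5, 0.8
)
print(f"local coverage at alpha=0.5, x>0.8: "
      f"initial={init_cov:.3f}, recalibrated={recal_cov:.3f}")
```

In this toy setting, plotting `r_hat(alpha, x)` against `alpha` for a fixed `x` gives a local P-P plot. Departure from the diagonal indicates miscalibration at that `x`, which is the diagnostic role the abstract describes for the estimated regression.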

Comments:	21 pages, 11 figures. Under review. Code available as a Python package this https URL
Subjects:	Machine Learning (stat.ML); Instrumentation and Methods for Astrophysics (astro-ph.IM); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:2205.14568 [stat.ML]
	(or arXiv:2205.14568v4 [stat.ML] for this version)

Submission history

From: Biprateep Dey
[v1] Sun, 29 May 2022 03:52:44 GMT (3897kb,D)
[v2] Wed, 26 Oct 2022 15:17:22 GMT (4858kb,D)
[v3] Fri, 7 Jul 2023 18:34:02 GMT (7585kb,D)
[v4] Mon, 17 Jul 2023 16:58:54 GMT (7584kb,D)
