Title: Combining Experimental and Observational Data for Identification and Estimation of Long-Term Causal Effects

Abstract: We consider the task of identifying and estimating the causal effect of a treatment variable on a long-term outcome variable using data from an observational domain and an experimental domain. The observational domain is subject to unobserved confounding. Furthermore, subjects in the experiment are followed only for a short period of time; hence, long-term effects of treatment are unobserved, whereas short-term effects are observed. Therefore, data from neither domain alone suffices for causal inference about the effect of the treatment on the long-term outcome, and the two data sources must instead be pooled in a principled way. Athey et al. (2020) proposed a method for systematically combining such data to identify the downstream causal effect of interest. Their approach is based on the assumptions of internal and external validity of the experimental data, and an additional novel assumption called latent unconfoundedness. In this paper, we first review their proposed approach, and then we propose three alternative approaches to data fusion for the purpose of identifying and estimating the average treatment effect as well as the effect of treatment on the treated. Our first approach is based on assuming equi-confounding bias for the short-term and long-term outcomes. Our second approach is based on a relaxed version of the equi-confounding bias assumption, where we assume the existence of an observed confounder such that the short-term and long-term potential outcome variables have the same partial additive association with that confounder. Our third approach is based on the proximal causal inference framework, in which we assume the existence of an extra variable in the system that is a proxy of the latent confounder of the treatment-outcome relation. We propose influence function-based estimation strategies for each of our data fusion frameworks and study the robustness properties of the proposed estimators.
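To make the first approach concrete, the following is a minimal simulation sketch of how an additive equi-confounding assumption could let the two domains be combined to recover the effect of treatment on the treated. Everything in it is an illustrative assumption rather than the paper's method: the data-generating process, the absence of covariates, the variable names (A for treatment, M for the short-term outcome, Y for the long-term outcome, U for the latent confounder), and the simple plug-in estimator, which is not the influence function-based estimator the abstract refers to. The sketch assumes the confounding bias for the untreated potential outcome is the same for M and Y in the observational domain, and that the experiment identifies the mean of M(0) for the observational population.

```python
import math
import random


def simulate(n, tau_m, tau_y, seed=0):
    """Draw a hypothetical observational sample (A, M, Y) and an
    experimental sample (A, M). U confounds A in the observational
    domain only and enters M and Y with the same additive coefficient,
    which is what makes the equi-confounding assumption hold here."""
    rng = random.Random(seed)
    obs, exp = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        # Treatment uptake depends on the latent confounder U.
        a = 1 if rng.random() < 1.0 / (1.0 + math.exp(-u)) else 0
        m = tau_m * a + u + rng.gauss(0.0, 1.0)
        y = tau_y * a + u + rng.gauss(0.0, 1.0)
        obs.append((a, m, y))
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        # Treatment is randomized; the long-term outcome is never recorded.
        a = 1 if rng.random() < 0.5 else 0
        m = tau_m * a + u + rng.gauss(0.0, 1.0)
        exp.append((a, m))
    return obs, exp


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_ett(obs):
    """Treated-minus-untreated contrast on Y; biased by confounding."""
    return mean(y for a, _, y in obs if a == 1) - mean(y for a, _, y in obs if a == 0)


def ett_equi_confounding(obs, exp):
    """Plug-in estimate of E[Y(1) - Y(0) | A = 1] in the observational domain."""
    p = mean(a for a, _, _ in obs)  # P(A = 1) in the observational domain
    # E[M(0)] from the experimental control arm (randomization plus
    # the assumed transportability to the observational population).
    m0_all = mean(m for a, m in exp if a == 0)
    m0_untreated = mean(m for a, m, _ in obs if a == 0)
    # Law of total expectation solved for E[M(0) | A = 1].
    m0_treated = (m0_all - (1.0 - p) * m0_untreated) / p
    # Confounding bias measured on the short-term outcome.
    bias = m0_treated - m0_untreated
    y_treated = mean(y for a, _, y in obs if a == 1)
    y_untreated = mean(y for a, _, y in obs if a == 0)
    # Equi-confounding: reuse the short-term bias for the long-term outcome.
    return y_treated - (y_untreated + bias)


if __name__ == "__main__":
    obs, exp = simulate(n=100_000, tau_m=1.0, tau_y=2.0, seed=1)
    print("naive estimate:", round(naive_ett(obs), 3))
    print("equi-confounding estimate:", round(ett_equi_confounding(obs, exp), 3))
```

In this toy setup the true long-term effect is the constant tau_y, so the corrected estimate should land close to it while the naive contrast is shifted upward by the confounding bias. The correction works only because U enters M and Y identically by construction; if the two outcomes were confounded to different degrees, the estimate would be biased.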
Subjects: Methodology (stat.ME); Econometrics (econ.EM); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as: arXiv:2201.10743 [stat.ME]
  (or arXiv:2201.10743v3 [stat.ME] for this version)

Submission history

From: AmirEmad Ghassami
[v1] Wed, 26 Jan 2022 04:21:14 GMT (249kb)
[v2] Sun, 27 Mar 2022 05:31:09 GMT (496kb)
[v3] Fri, 29 Apr 2022 04:37:00 GMT (679kb)
