Title: Variable Selection for Causal Inference via Outcome-Adaptive Random Forest

Authors: Daniel Jacob
Abstract: Estimating a causal effect from observational data can be biased if we do not control for self-selection. This selection is based on confounding variables that affect both the treatment assignment and the outcome. Propensity score methods aim to correct for confounding. However, not all covariates are confounders. We propose the outcome-adaptive random forest (OARF), which includes only desirable variables when estimating the propensity score in order to decrease bias and variance. Our approach works in high-dimensional datasets and when the outcome and propensity score models are non-linear and potentially complicated. The OARF excludes covariates that are not associated with the outcome, even in the presence of a large number of spurious variables. Simulation results suggest that the OARF produces unbiased estimates, has a smaller variance, and is superior in variable selection compared to other approaches. The results from two empirical examples, the effect of right heart catheterization on mortality and the effect of maternal smoking during pregnancy on birth weight, show treatment effects comparable to previous findings but with tighter confidence intervals and more plausible selected variables.
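
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's OARF procedure. It assumes scikit-learn forests, a hard threshold on impurity-based feature importances from an outcome forest, and an inverse-probability-weighted (IPW) estimate of the average treatment effect. The toy data-generating process, the function names (`simulate`, `select_outcome_covariates`, `ipw_ate`), the threshold rule, and all tuning values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor


def simulate(n=2000, seed=0):
    """Toy data (assumed): X0 confounder, X1 outcome-only, X2 treatment-only, X3..X5 noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 6))
    p_treat = 1.0 / (1.0 + np.exp(-(X[:, 0] + X[:, 2])))
    d = rng.binomial(1, p_treat)
    # True treatment effect is 2.0 by construction.
    y = 2.0 * d + 2.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(size=n)
    return X, d, y


def select_outcome_covariates(X, d, y, seed=0):
    """Keep covariates whose importance in an outcome forest exceeds the covariate mean."""
    features = np.column_stack([d, X])  # treatment goes in column 0
    forest = RandomForestRegressor(
        n_estimators=200, min_samples_leaf=20, random_state=seed
    )
    forest.fit(features, y)
    importance = forest.feature_importances_[1:]  # drop the treatment's own importance
    return [j for j in range(X.shape[1]) if importance[j] > importance.mean()]


def ipw_ate(X, d, y, columns, seed=0):
    """Normalized IPW estimate using out-of-bag propensity scores from a forest."""
    forest = RandomForestClassifier(
        n_estimators=200, min_samples_leaf=20, oob_score=True, random_state=seed
    )
    forest.fit(X[:, columns], d)
    # Out-of-bag probabilities avoid scoring each unit with trees that saw it.
    e = np.clip(forest.oob_decision_function_[:, 1], 0.05, 0.95)
    w1 = d / e
    w0 = (1 - d) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


X, d, y = simulate()
selected = select_outcome_covariates(X, d, y)
ate = ipw_ate(X, d, y, selected)
naive = y[d == 1].mean() - y[d == 0].mean()
print("selected covariates:", selected)
print("IPW estimate:", ate, "naive difference in means:", naive)
```

In this toy setup the treatment-only covariate X2 predicts treatment but not the outcome, so a selection step driven by the outcome model has a reason to leave it out of the propensity score, which is the behavior the abstract describes for the OARF.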
Subjects: Econometrics (econ.EM)
Cite as: arXiv:2109.04154 [econ.EM]
  (or arXiv:2109.04154v1 [econ.EM] for this version)

Submission history

From: Daniel Jacob
[v1] Thu, 9 Sep 2021 10:29:26 GMT (641kb,D)