Title: Regression with missing Ys: An improved strategy for analyzing multiply imputed data

Abstract: When fitting a generalized linear model -- such as a linear regression, a logistic regression, or a hierarchical linear model -- analysts often wonder how to handle missing values of the dependent variable Y. If missing values have been filled in using multiple imputation (MI), the usual advice is to use the imputed Y values in analysis. We show, however, that using imputed Ys can add needless noise to the estimates. Better estimates can usually be obtained using a modified strategy that we call multiple imputation, then deletion (MID). Under MID, all cases are used for imputation, but, following imputation, cases with imputed Y values are excluded from the analysis. When there is something wrong with the imputed Y values, MID protects the estimates from the problematic imputations. And when the imputed Y values are acceptable, MID usually offers somewhat more efficient estimates than an ordinary MI strategy.
Subjects: Methodology (stat.ME)
Journal reference: Sociological Methodology (2007) volume 37, pp. 83-117
DOI: 10.1111/j.1467-9531.2007.00180.x
Cite as: arXiv:1605.01095 [stat.ME]
  (or arXiv:1605.01095v1 [stat.ME] for this version)
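
The MID procedure described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's code: the function names `ols_coefs` and `pooled_coefs` are hypothetical, the imputed datasets are assumed to come from separate MI software, and the toy imputations below are deliberately poor so that the effect of deleting them is visible. Only point estimates are pooled (by averaging across imputations); pooling of standard errors is omitted.

```python
import numpy as np


def ols_coefs(X, y):
    """Least-squares coefficients for y ~ 1 + X (intercept first)."""
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def pooled_coefs(imputed_sets, y_was_missing, delete_imputed_y):
    """Average OLS coefficients across imputed datasets.

    imputed_sets: list of (X, y) pairs, one per imputation, with no gaps left.
    y_was_missing: boolean array, True where Y was originally missing.
    delete_imputed_y: False gives ordinary MI; True gives MID.
    """
    if delete_imputed_y:
        keep = ~y_was_missing  # MID: drop cases whose Y was imputed
    else:
        keep = np.ones_like(y_was_missing, dtype=bool)  # MI: keep every case
    estimates = [ols_coefs(X[keep], y[keep]) for X, y in imputed_sets]
    return np.mean(estimates, axis=0)


# Toy data (an assumption for illustration): observed cases lie exactly on
# y = 1 + 2x, and Y is missing for the three largest x values.
x = np.arange(10.0)
y_true = 1.0 + 2.0 * x
y_was_missing = x >= 7

# Three deliberately bad "imputations": missing Ys filled with constants 0, 1, 2.
imputed_sets = []
for m in range(3):
    y_imp = y_true.copy()
    y_imp[y_was_missing] = float(m)
    imputed_sets.append((x, y_imp))

mi = pooled_coefs(imputed_sets, y_was_missing, delete_imputed_y=False)
mid = pooled_coefs(imputed_sets, y_was_missing, delete_imputed_y=True)
print("MI :", mi)
print("MID:", mid)
```

Because the retained cases lie exactly on the line, the MID estimates match intercept 1 and slope 2 up to rounding, while the ordinary MI estimates are pulled away by the bad imputed Ys. In this toy setup X is complete, so the imputations play no role in the MID fit; when X also has missing values, the imputation step still contributes filled-in X values for the retained cases.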

Submission history

From: Paul von Hippel
[v1] Tue, 3 May 2016 21:20:50 GMT (185kb)