Title: Optimal Data Collection for Randomized Control Trials

Abstract: In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data, such as a census or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on a modification of an orthogonal greedy algorithm that is conceptually simple, is easy to implement in the presence of a large number of potential covariates, and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, measured either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator.
Comments: 54 pages, 1 figure
Subjects: Methodology (stat.ME); Econometrics (econ.EM)
MSC classes: 62P20
Cite as: arXiv:1603.03675 [stat.ME]
  (or arXiv:1603.03675v4 [stat.ME] for this version)
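
The trade-off described in the abstract (more individuals versus more covariates under a fixed budget) can be illustrated with a minimal sketch. This is not the authors' modified algorithm; it is a plain orthogonal greedy selection under assumptions made up for illustration: the estimator's mean squared error is taken to be proportional to the residual variance divided by the sample size, and the per-individual cost is a fixed base cost plus the sum of the costs of the collected covariates. The function name `greedy_design` and the parameters `cov_cost`, `base_cost`, and `budget` are hypothetical.

```python
import numpy as np


def greedy_design(X, y, cov_cost, base_cost, budget):
    """Illustrative sketch only (not the paper's procedure).

    X, y      : pre-experimental covariates (n_pre x p) and outcome (n_pre,)
    cov_cost  : hypothetical per-individual cost of collecting each covariate
    base_cost : hypothetical per-individual cost of collecting the outcome alone
    budget    : total budget

    Assumes MSE is proportional to residual variance / sample size.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_cost = np.asarray(cov_cost, dtype=float)
    p = X.shape[1]

    # Centre the data so that no intercept column is needed.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0  # constant columns get a score of zero

    n0 = int(budget // base_cost)
    if n0 < 1:
        raise ValueError("budget does not cover a single individual")

    # Baseline design: collect no covariates, spend everything on sample size.
    best = {"covariates": [], "n": n0, "objective": np.mean(yc ** 2) / n0}

    selected = []
    available = np.ones(p, dtype=bool)
    residual = yc.copy()
    for _ in range(p):
        # Greedy step: pick the unused covariate most correlated with the residual.
        scores = np.abs(Xc.T @ residual) / norms
        scores = np.where(available, scores, -np.inf)
        j = int(np.argmax(scores))
        selected.append(j)
        available[j] = False

        # Orthogonal step: refit least squares on all selected covariates.
        coef, *_ = np.linalg.lstsq(Xc[:, selected], yc, rcond=None)
        residual = yc - Xc[:, selected] @ coef

        # Sample size affordable once these covariates are collected.
        per_person = base_cost + cov_cost[selected].sum()
        n = int(budget // per_person)
        if n < 1:
            break  # costs only grow, so no larger set is affordable either

        objective = np.mean(residual ** 2) / n
        if objective < best["objective"]:
            best = {"covariates": list(selected), "n": n, "objective": objective}
    return best
```

The sketch walks along the greedy path and keeps the design with the smallest proxy objective, so no tuning parameter is needed to decide when to stop; the choice of proxy and cost function is the assumption that would need to be replaced in any real application.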

Submission history

From: Sokbae Lee
[v1] Fri, 11 Mar 2016 16:06:03 GMT (62kb,D)
[v2] Tue, 29 Mar 2016 16:16:11 GMT (58kb,D)
[v3] Wed, 20 Apr 2016 14:37:33 GMT (59kb,D)
[v4] Mon, 22 Aug 2016 08:11:03 GMT (64kb,D)
