Better estimates from binned income data: Interpolated CDFs and mean-matching

von Hippel, Paul T.; Hunter, David J.; Drown, McKalie

(Submitted on 27 Sep 2017 (v1), last revised 17 Oct 2017 (this version, v3))

Abstract: Researchers often estimate income statistics from summaries that report the number of incomes in bins such as $0-10,000, $10,001-20,000, ..., $200,000+. Some analysts assign incomes to bin midpoints, but this treats income as discrete. Other analysts fit a continuous parametric distribution, but the distribution may not fit well.
We fit nonparametric continuous distributions that reproduce the bin counts perfectly by interpolating the cumulative distribution function (CDF). We also show how both midpoints and interpolated CDFs can be constrained to reproduce the mean of income when it is known.
We compare the methods' accuracy in estimating the Gini coefficients of all 3,221 US counties. Fitting parametric distributions is very slow. Fitting interpolated CDFs is much faster and slightly more accurate. Both interpolated CDFs and midpoints give dramatically better estimates if constrained to match a known mean.
We have implemented interpolated CDFs in the binsmooth package for R. We have implemented the midpoint method in the rpme command for Stata. Both implementations can be constrained to match a known mean.
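
To make the two ideas in the abstract concrete, here is a minimal Python sketch of mean-matched midpoints and of a linearly interpolated CDF (uniform density within each bin). It is an illustration under stated assumptions, not the binsmooth or rpme implementation: the function names and the toy bin counts are hypothetical, the top bin is assumed to be non-empty, and the top bin is handled in the simplest way we could think of (its representative value is chosen so the overall mean equals the known mean, and in the continuous case it is spread uniformly around that value). The paper itself may treat the top bin differently.

```python
import numpy as np

def mean_matched_midpoints(lower_edges, counts, known_mean):
    """Representative income for each bin (hypothetical helper).

    lower_edges[i] is the lower edge of bin i; the last bin is open-ended.
    Closed bins get their midpoint. The open top bin gets whatever value
    makes the count-weighted mean equal known_mean (an assumption of this
    sketch). The top bin must contain at least one income.
    """
    lower_edges = np.asarray(lower_edges, dtype=float)
    counts = np.asarray(counts, dtype=float)
    mids = (lower_edges[:-1] + lower_edges[1:]) / 2.0
    total_income = known_mean * counts.sum()
    top = (total_income - (counts[:-1] * mids).sum()) / counts[-1]
    if top <= lower_edges[-1]:
        raise ValueError("known_mean is too low to be consistent with the bins")
    return np.append(mids, top)

def gini_binned(lower_edges, counts, known_mean, continuous=False):
    """Gini coefficient from binned counts, constrained to a known mean.

    continuous=False: every income sits at its bin's representative value.
    continuous=True:  the CDF is interpolated linearly between bin edges,
                      i.e. incomes are uniform within each bin.
    """
    lower_edges = np.asarray(lower_edges, dtype=float)
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    centers = mean_matched_midpoints(lower_edges, counts, known_mean)

    # Expected |X - Y| for two incomes drawn from different bins
    # (same-bin pairs contribute zero here).
    gaps = np.abs(centers[:, None] - centers[None, :])
    mean_abs_diff = (p[:, None] * p[None, :] * gaps).sum()

    if continuous:
        # Top bin is uniform on [lower edge, upper], centered on its matched value.
        upper = np.append(lower_edges[1:], 2.0 * centers[-1] - lower_edges[-1])
        widths = upper - lower_edges
        # Two independent uniform draws from a bin of width w differ by w/3 on average.
        mean_abs_diff += (p ** 2 * widths / 3.0).sum()

    return mean_abs_diff / (2.0 * known_mean)

# Toy data (made up): bins $0-10,000, $10,000-20,000, $20,000+; known mean $15,000.
edges = [0, 10_000, 20_000]
counts = [50, 30, 20]
gini_mid = gini_binned(edges, counts, 15_000)
gini_cdf = gini_binned(edges, counts, 15_000, continuous=True)
```

On this toy input the continuous estimate is larger than the midpoint estimate, because the interpolated CDF adds within-bin inequality that the midpoint method ignores by treating income as discrete.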

Comments:	20 pages (including Appendix), 3 tables, 2 figures (+2 in Appendix)
Subjects:	Methodology (stat.ME)
Cite as:	arXiv:1709.09705 [stat.ME]
	(or arXiv:1709.09705v3 [stat.ME] for this version)

Submission history

From: Paul von Hippel
[v1] Wed, 27 Sep 2017 19:13:50 GMT (353kb)
[v2] Fri, 29 Sep 2017 16:55:29 GMT (353kb)
[v3] Tue, 17 Oct 2017 02:52:52 GMT (354kb)
