Title: How Much Data Do You Need? An Operational, Pre-Asymptotic Metric for Fat-tailedness

Abstract: This note presents an operational measure of fat-tailedness for univariate probability distributions, taking values in $[0,1]$, where 0 is maximally thin-tailed (Gaussian) and 1 is maximally fat-tailed. Among other uses, 1) it helps establish a comparative sample size $n$ needed for statistical significance, 2) it allows practical comparisons across classes of fat-tailed distributions, and 3) it helps explain some inconsistent attributes of the lognormal, depending on the parametrization of its scale parameter. The literature on asymptotic behavior is rich, but there is a large void for finite values of $n$, those needed for operational purposes. Conventional measures of fat-tailedness, namely 1) the tail index for the power law class and 2) kurtosis for finite-moment distributions, fail to apply to some distributions and do not allow comparisons across classes and parametrizations, that is, between power laws outside the Levy-Stable basin, between power laws and distributions in other classes, or between power laws for different numbers of summands. How can one compare a sum of 100 Student T distributed random variables with 3 degrees of freedom to one in a Levy-Stable or a Lognormal class? How can one compare a sum of 100 Student T with 3 degrees of freedom to a single Student T with 2 degrees of freedom? We propose an operational and heuristic measure that allows us to compare $n$-summed independent variables under all distributions with finite first moment. The method is based on the rate of convergence of the Law of Large Numbers for finite sums, $n$-summands specifically. We get either explicit expressions or simulation results and bounds for the lognormal, exponential, Pareto, and Student T distributions in their various calibrations, in addition to the general Pearson classes.
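The abstract does not state the formula behind the measure, so the following is only a hedged Monte Carlo sketch of the idea of reading fat-tailedness off the finite-$n$ convergence rate of sums. It assumes the measure has the form $\kappa(n_0, n) = 2 - (\log n - \log n_0)/\log(M(n)/M(n_0))$, where $M(n)$ is the mean absolute deviation of a sum of $n$ i.i.d. draws; that form, and the names `mean_abs_deviation_of_sum`, `kappa`, `gaussian`, and `student_t3`, are illustrative assumptions rather than the note's own code. Under this assumed form, a Gaussian, whose sums scale like $\sqrt{n}$, gives a value near 0, and slower convergence gives a larger value.

```python
import math
import random


def mean_abs_deviation_of_sum(sampler, n, trials, rng):
    """Monte Carlo estimate of M(n) = E|S_n - E[S_n]|, S_n a sum of n i.i.d. draws."""
    sums = [sum(sampler(rng) for _ in range(n)) for _ in range(trials)]
    center = sum(sums) / trials  # sample mean stands in for E[S_n]
    return sum(abs(s - center) for s in sums) / trials


def kappa(sampler, n0, n, trials=10000, seed=0):
    """Assumed form: 2 - (log n - log n0) / log(M(n) / M(n0))."""
    rng = random.Random(seed)
    m0 = mean_abs_deviation_of_sum(sampler, n0, trials, rng)
    m1 = mean_abs_deviation_of_sum(sampler, n, trials, rng)
    return 2.0 - (math.log(n) - math.log(n0)) / math.log(m1 / m0)


def gaussian(rng):
    """Standard normal draw (thin-tailed benchmark)."""
    return rng.gauss(0.0, 1.0)


def student_t3(rng):
    """Student T draw with 3 degrees of freedom: Z / sqrt(chi2_3 / 3)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(1.5, 2.0)  # chi-square(3) = Gamma(shape 1.5, scale 2)
    return z / math.sqrt(chi2 / 3.0)


if __name__ == "__main__":
    # Compare a thin-tailed and a fat-tailed case over the same (n0, n) pair.
    print("Gaussian    :", kappa(gaussian, 1, 30))
    print("Student T(3):", kappa(student_t3, 1, 30))
```

Both estimates carry simulation noise, and for fat-tailed inputs the estimate of $M(n)$ converges slowly, so the number of trials matters more than it does in the Gaussian case.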
Subjects: Methodology (stat.ME); Statistical Finance (q-fin.ST)
Journal reference: International Journal of Forecasting, 35-2, 677-686, 2019
DOI: 10.1016/j.ijforecast.2018.10.003
Cite as: arXiv:1802.05495 [stat.ME]
  (or arXiv:1802.05495v3 [stat.ME] for this version)

Submission history

From: Nassim Nicholas Taleb
[v1] Thu, 15 Feb 2018 11:57:08 GMT (206kb,D)
[v2] Fri, 18 May 2018 15:24:55 GMT (206kb,D)
[v3] Mon, 26 Nov 2018 15:21:52 GMT (371kb,D)