We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

stat.ME

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Statistics > Methodology

Title: Measuring Dependence between Events

Abstract: Measuring dependence between two events, or equivalently between two binary random variables, amounts to expressing the dependence structure inherent in a $2\times 2$ contingency table in a real number between $-1$ and $1$. Countless such dependence measures exist, but there is little theoretical guidance on how they compare and on their advantages and shortcomings. Thus, practitioners might be overwhelmed by the problem of choosing a suitable measure. We provide a set of natural desirable properties that a proper dependence measure should fulfill. We show that Yule's Q and the little-known Cole coefficient are proper, while the most widely-used measures, the phi coefficient and all contingency coefficients, are improper. They have a severe attainability problem, that is, even under perfect dependence they can be very far away from $-1$ and $1$, and often differ substantially from the proper measures in that they understate strength of dependence. The structural reason is that these are measures for equality of events rather than of dependence. We derive the (in some instances non-standard) limiting distributions of the measures and illustrate how asymptotically valid confidence intervals can be constructed. In a case study on drug consumption we demonstrate how misleading conclusions may arise from the use of improper dependence measures.
Subjects: Methodology (stat.ME)
Cite as: arXiv:2403.17580 [stat.ME]
  (or arXiv:2403.17580v1 [stat.ME] for this version)

Submission history

From: Marc-Oliver Pohle [view email]
[v1] Tue, 26 Mar 2024 10:41:48 GMT (1489kb,D)

Link back to: arXiv, form interface, contact.