Statistics > Methodology
Title: Generalized R-squared for Detecting Non-independence
(Submitted on 10 Apr 2016 (v1), revised 11 Oct 2016 (this version, v2), latest version 18 Nov 2016 (v3))
Abstract: Detecting non-independence between two random variables is a fundamental problem in statistics and machine learning. Although the celebrated Pearson correlation is effective for capturing linear dependency, it can be entirely powerless for detecting nonlinear and/or heteroscedastic patterns. We introduce a new measure, G-squared, as a generalization of the classic R-squared statistic, both to test whether two univariate random variables are mutually independent and to measure the strength of their relationship. G-squared can be interpreted intuitively as the piecewise R-squared between the two variables. It is thus almost identical to R-squared for linear relationships with constant error variance, and is particularly effective in handling nonlinearity and heteroscedastic errors. We propose two statistics to estimate the population G-squared and show that both are consistent. Through extensive simulation studies, we demonstrate that the G-squared statistics are among the most powerful performers compared with several state-of-the-art methods.
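The "piecewise R-squared" intuition can be illustrated with a small sketch. The abstract does not specify the two proposed estimators, so the code below is not the authors' G-squared statistic. It is a simplified stand-in that assumes a fixed number of equal-size segments along x, fits an ordinary least-squares line in each, and pools the residuals. The function names `linear_rss` and `piecewise_r_squared` are hypothetical.

```python
def linear_rss(xs, ys):
    """Residual sum of squares of the least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0:
        # x is constant in this segment, so the best fit is the mean of y.
        return syy
    return syy - sxy ** 2 / sxx


def piecewise_r_squared(xs, ys, n_pieces):
    """1 - (pooled within-segment RSS) / (total sum of squares of y).

    Assumption: segments are equal-size blocks of the data sorted by x.
    With n_pieces=1 this is the ordinary R-squared of a single line.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    mean_y = sum(y for _, y in pairs) / n
    tss = sum((y - mean_y) ** 2 for _, y in pairs)
    rss = 0.0
    for k in range(n_pieces):
        chunk = pairs[k * n // n_pieces:(k + 1) * n // n_pieces]
        rss += linear_rss([x for x, _ in chunk], [y for _, y in chunk])
    return 1.0 - rss / tss


# A noiseless parabola on a symmetric grid: strongly dependent, yet uncorrelated.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

r2_one = piecewise_r_squared(xs, ys, 1)  # ordinary R-squared
r2_two = piecewise_r_squared(xs, ys, 2)  # one line per half of the x range
print(r2_one, r2_two)
```

On this example the one-piece value is essentially zero, matching the abstract's point that Pearson correlation can miss nonlinear patterns, while the two-piece value is above 0.9. A fixed segmentation is only one possible choice, made here for illustration.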
Submission history
From: Xufei Wang
[v1] Sun, 10 Apr 2016 21:03:53 GMT (1275kb,D)
[v2] Tue, 11 Oct 2016 03:15:08 GMT (686kb,D)
[v3] Fri, 18 Nov 2016 03:07:49 GMT (373kb,D)