Statistics > Methodology
Title: Impartial Predictive Modeling and the Use of Proxy Variables
(Submitted on 1 Aug 2016 (v1), last revised 7 Jan 2022 (this version, v4))
Abstract: Fairness-aware data mining (FADM) aims to prevent algorithms from discriminating against protected groups. The literature has come to an impasse as to what constitutes explainable variability as opposed to discrimination. This distinction hinges on a rigorous understanding of the role of proxy variables, i.e., those variables which are associated with both the protected feature and the outcome of interest. We demonstrate that fairness is achieved by ensuring impartiality with respect to sensitive characteristics and provide a framework for impartiality by accounting for different perspectives on the data-generating process. In particular, fairness can only be precisely defined in a full-data scenario in which all covariates are observed. We then analyze how these models may be conservatively estimated via regression in partial-data settings. Decomposing the regression estimates provides insights into previously unexplored distinctions between explainable variability and discrimination that illuminate the use of proxy variables in fairness-aware data mining.
Submission history
From: Kory Johnson
[v1] Mon, 1 Aug 2016 19:06:49 GMT (4535kb,D)
[v2] Thu, 6 Oct 2016 13:05:26 GMT (4365kb,D)
[v3] Sun, 11 Oct 2020 16:09:25 GMT (4217kb,D)
[v4] Fri, 7 Jan 2022 21:15:29 GMT (31kb)