Fair Classification with Adversarial Perturbations

Celis, L. Elisa; Mehrotra, Anay; Vishnoi, Nisheeth K.

(Submitted on 10 Jun 2021 (v1), last revised 23 Nov 2021 (this version, v2))

Abstract: We study fair classification in the presence of an omniscient adversary that, given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the training samples and arbitrarily perturb their protected attributes. The motivation comes from settings in which protected attributes can be incorrect due to strategic misreporting, malicious actors, or errors in imputation; and prior approaches that make stochastic or independence assumptions on errors may not satisfy their guarantees in this adversarial setting. Our main contribution is an optimization framework to learn fair classifiers in this adversarial setting that comes with provable guarantees on accuracy and fairness. Our framework works with multiple and non-binary protected attributes, is designed for the large class of linear-fractional fairness metrics, and can also handle perturbations besides protected attributes. We prove near-tightness of our framework's guarantees for natural hypothesis classes: no algorithm can have significantly better accuracy and any algorithm with better fairness must have lower accuracy. Empirically, we evaluate the classifiers produced by our framework for statistical rate on real-world and synthetic datasets for a family of adversaries.
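
To make the threat model concrete, the following is a minimal sketch, not the paper's framework. It assumes that statistical rate is measured as the ratio of the smallest to the largest per-group positive-prediction rate, and it uses a deliberately simple, hypothetical adversary (`perturb_protected`) that relabels the protected attribute of at most an $\eta$-fraction of samples. The function names and the toy data are illustrative assumptions; the adversary in the abstract is omniscient and may choose any perturbation within the same budget.

```python
from collections import defaultdict


def statistical_rate(preds, groups):
    """Ratio of the smallest to the largest per-group positive-prediction rate.

    Assumed definition for this sketch; 1.0 means all groups are selected
    at the same rate.
    """
    pos = defaultdict(int)
    tot = defaultdict(int)
    for p, g in zip(preds, groups):
        tot[g] += 1
        pos[g] += p
    rates = [pos[g] / tot[g] for g in tot]
    highest = max(rates)
    return min(rates) / highest if highest > 0 else 1.0


def perturb_protected(preds, groups, eta):
    """Hypothetical adversary: relabel at most floor(eta * n) protected attributes.

    It moves positively predicted members of group 1 into group 0, which
    makes the classifier look fairer on the perturbed data than it is.
    """
    budget = int(eta * len(groups))
    perturbed = list(groups)
    for i, (p, g) in enumerate(zip(preds, groups)):
        if budget == 0:
            break
        if p == 1 and g == 1:
            perturbed[i] = 0
            budget -= 1
    return perturbed


# Toy data: group 0 is selected at rate 0.2, group 1 at rate 0.8.
groups = [0] * 10 + [1] * 10
preds = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2

eta = 0.25
perturbed_groups = perturb_protected(preds, groups, eta)

true_rate = statistical_rate(preds, groups)            # about 0.25
seen_rate = statistical_rate(preds, perturbed_groups)  # about 0.78
print(f"true statistical rate:     {true_rate:.2f}")
print(f"rate on perturbed samples: {seen_rate:.2f}")
```

In this toy run the same predictions have a statistical rate of about 0.25 on the true attributes but about 0.78 on the perturbed ones, so a fairness constraint checked only against the perturbed attributes can be satisfied by a classifier that is far from fair on the true data. This is the gap that guarantees stated for the adversarial setting are meant to close.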

Comments:	Full version of a paper accepted for presentation in NeurIPS 2021
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
Cite as:	arXiv:2106.05964 [cs.LG]
	(or arXiv:2106.05964v2 [cs.LG] for this version)

Submission history

From: Anay Mehrotra [view email]
[v1] Thu, 10 Jun 2021 17:56:59 GMT (719kb,D)
[v2] Tue, 23 Nov 2021 03:55:37 GMT (649kb,D)
