Binary classification with corrupted labels

Lee, Yonghoon; Barber, Rina Foygel

Mathematics > Statistics Theory

(Submitted on 16 Jun 2021)

Abstract: In a binary classification problem where the goal is to fit an accurate predictor, the presence of corrupted labels in the training data set may create an additional challenge. However, in settings where likelihood maximization is poorly behaved (for example, if positive and negative labels are perfectly separable), a small fraction of corrupted labels can improve performance by ensuring robustness. In this work, we establish that in such settings, corruption acts as a form of regularization, and we compute precise upper bounds on estimation error in the presence of corruptions. Our results suggest that the presence of corrupted data points is beneficial only up to a small fraction of the total sample, scaling with the square root of the sample size.
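The separability issue mentioned in the abstract can be illustrated with a small toy sketch. It is not the paper's estimator, model, or bounds. It assumes a one-dimensional logistic regression without intercept, fitted by plain gradient ascent; the data, the helper name `fit_logistic`, the step size, and the iteration counts are all hypothetical choices made for illustration. On perfectly separable labels the fitted coefficient keeps growing with more iterations (no finite maximizer exists), while flipping a single label yields a finite, stable estimate.

```python
import numpy as np

def fit_logistic(x, y, steps, lr=0.5):
    """Gradient ascent on the 1-D logistic log-likelihood (no intercept).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    beta = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-beta * x))   # predicted P(y = 1 | x)
        beta += lr * np.mean((y - p) * x)     # average score (gradient) step
    return beta

# Toy data (assumption): labels are perfectly separated by the sign of x.
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
y_clean = (x > 0).astype(float)

# Corrupt one label: the point at x = -2.0 is relabeled as positive.
y_corrupt = y_clean.copy()
y_corrupt[0] = 1.0

# Separable case: the estimate drifts upward as iterations increase.
clean_short = fit_logistic(x, y_clean, steps=1000)
clean_long = fit_logistic(x, y_clean, steps=10000)

# Corrupted case: the estimate settles at a finite value.
corrupt_short = fit_logistic(x, y_corrupt, steps=1000)
corrupt_long = fit_logistic(x, y_corrupt, steps=10000)

print("clean labels:    ", clean_short, clean_long)
print("one flipped label:", corrupt_short, corrupt_long)
```

In this sketch the flipped label penalizes large coefficients in the likelihood, which is the sense in which a corruption behaves like a regularizer. The example says nothing about how many corruptions are helpful, which is the question the paper's bounds address.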

Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as: arXiv:2106.09136 [math.ST]
(or arXiv:2106.09136v1 [math.ST] for this version)

Submission history

From: Yonghoon Lee
[v1] Wed, 16 Jun 2021 21:23:48 GMT (25kb,D)
