We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

math

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Machine Learning

Title: Detecting Label Noise via Leave-One-Out Cross-Validation

Abstract: We present a simple algorithm for identifying and correcting real-valued noisy labels from a mixture of clean and corrupted sample points using Gaussian process regression. A heteroscedastic noise model is employed, in which additive Gaussian noise terms with independent variances are associated with each and all of the observed labels. Optimizing the noise model using maximum likelihood estimation leads to the containment of the GPR model's predictive error by the posterior standard deviation in leave-one-out cross-validation. A multiplicative update scheme is proposed for solving the maximum likelihood estimation problem under non-negative constraints. While we provide proof of convergence for certain special cases, the multiplicative scheme has empirically demonstrated monotonic convergence behavior in virtually all our numerical experiments. We show that the presented method can pinpoint corrupted sample points and lead to better regression models when trained on synthetic and real-world scientific data sets.
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as: arXiv:2103.11352 [cs.LG]
  (or arXiv:2103.11352v2 [cs.LG] for this version)

Submission history

From: Yu-Hang Tang [view email]
[v1] Sun, 21 Mar 2021 10:02:50 GMT (554kb,D)
[v2] Sun, 28 Mar 2021 19:00:30 GMT (556kb,D)

Link back to: arXiv, form interface, contact.