Computer Science > Machine Learning
Title: Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization
(Submitted on 29 Jun 2020 (v1), last revised 18 Mar 2021 (this version, v2))
Abstract: Real-world large-scale datasets are heteroskedastic and imbalanced -- labels have varying levels of uncertainty and label distributions are long-tailed. Heteroskedasticity and imbalance challenge deep learning algorithms due to the difficulty of distinguishing among mislabeled, ambiguous, and rare examples. Addressing heteroskedasticity and imbalance simultaneously is under-explored. We propose a data-dependent regularization technique for heteroskedastic datasets that regularizes different regions of the input space differently. Inspired by the theoretical derivation of the optimal regularization strength in a one-dimensional nonparametric classification setting, our approach adaptively regularizes the data points in higher-uncertainty, lower-density regions more heavily. We test our method on several benchmark tasks, including a real-world heteroskedastic and imbalanced dataset, WebVision. Our experiments corroborate our theory and demonstrate a significant improvement over other methods in noise-robust deep learning.
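The abstract's central idea, regularizing examples in higher-uncertainty, lower-density regions more heavily, can be illustrated with a minimal sketch. This is not the paper's formula: the function names, the power-law form, and the exponents `alpha` and `beta` are hypothetical choices made only for illustration, and the per-example uncertainty and density estimates are assumed to be supplied from elsewhere.

```python
import numpy as np

def adaptive_reg_weights(uncertainty, density, base=1.0, alpha=1.0, beta=1.0, eps=1e-8):
    """Per-example regularization strengths (illustrative form, not from the paper).

    The weight grows with the estimated label uncertainty and shrinks with the
    estimated data density, so uncertain or rare examples are regularized more.
    `alpha` and `beta` are hypothetical exponents chosen for this sketch.
    """
    u = np.asarray(uncertainty, dtype=float)
    q = np.asarray(density, dtype=float)
    return base * (u + eps) ** alpha / (q + eps) ** beta

def regularized_loss(per_example_loss, per_example_penalty, weights):
    """Mean of loss plus a per-example weighted penalty term."""
    loss = np.asarray(per_example_loss, dtype=float)
    penalty = np.asarray(per_example_penalty, dtype=float)
    return float(np.mean(loss + np.asarray(weights, dtype=float) * penalty))
```

With equal densities, the example with the larger uncertainty estimate receives the larger weight; with equal uncertainties, the example from the lower-density region does. The penalty itself (for instance, a smoothness term on the model around each input) is left abstract here.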
Submission history
From: Kaidi Cao
[v1] Mon, 29 Jun 2020 01:09:50 GMT (37kb,D)
[v2] Thu, 18 Mar 2021 07:49:18 GMT (332kb,D)