Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss

HaoChen, Jeff Z.; Wei, Colin; Gaidon, Adrien; Ma, Tengyu



(Submitted on 8 Jun 2021 (v1), last revised 24 Jun 2022 (this version, v7))

Abstract: Recent works in self-supervised learning have advanced the state-of-the-art by relying on the contrastive learning paradigm, which learns representations by pushing positive pairs, or similar examples from the same class, closer together while keeping negative pairs far apart. Despite the empirical successes, theoretical foundations are limited -- prior analyses assume conditional independence of the positive pairs given the same class label, but recent empirical applications use heavily correlated positive pairs (i.e., data augmentations of the same image). Our work analyzes contrastive learning without assuming conditional independence of positive pairs using a novel concept of the augmentation graph on data. Edges in this graph connect augmentations of the same data, and ground-truth classes naturally form connected sub-graphs. We propose a loss that performs spectral decomposition on the population augmentation graph and can be succinctly written as a contrastive learning objective on neural net representations. Minimizing this objective leads to features with provable accuracy guarantees under linear probe evaluation. By standard generalization bounds, these accuracy guarantees also hold when minimizing the training contrastive loss. Empirically, the features learned by our objective can match or outperform several strong baselines on benchmark vision datasets. In all, this work provides the first provable analysis for contrastive learning where guarantees for linear probe evaluation can apply to realistic empirical settings.
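As a rough illustration of the "contrastive learning objective on neural net representations" mentioned in the abstract, the sketch below computes a minibatch estimate of the spectral contrastive loss in the form given in the paper body, L(f) = -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2]. The function name, the NumPy setting, and the choice to treat the off-diagonal pairs of a batch as the negative pairs are assumptions made for this example; they are not taken from this listing or from the authors' released code.

```python
import numpy as np

def spectral_contrastive_loss(z1, z2):
    """Minibatch estimate of the spectral contrastive loss (illustrative sketch).

    z1, z2: arrays of shape (n, d). Row i of z1 and row i of z2 are features
    of two augmentations of the same example (a positive pair). Rows with
    different indices are treated as negative pairs, which is an assumption
    of this sketch.
    """
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    if z1.ndim != 2 or z1.shape != z2.shape or z1.shape[0] < 2:
        raise ValueError("expected two (n, d) arrays with n >= 2")
    n = z1.shape[0]
    sim = z1 @ z2.T                      # sim[i, j] = z1[i] . z2[j]
    pos = np.trace(sim) / n              # mean inner product over positive pairs
    off_diag = ~np.eye(n, dtype=bool)    # mask selecting the i != j entries
    neg = np.mean(sim[off_diag] ** 2)    # mean squared inner product over negatives
    return -2.0 * pos + neg

# Usage: orthonormal features for two distinct examples
features = np.eye(2)
loss = spectral_contrastive_loss(features, features)
```

In this estimate, the first term rewards aligned features for augmentations of the same example, while the squared term penalizes any nonzero inner product between features of different examples.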

Comments:	Accepted as an oral to NeurIPS 2021
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2106.04156 [cs.LG]
	(or arXiv:2106.04156v7 [cs.LG] for this version)

Submission history

From: Jeff Z. HaoChen
[v1] Tue, 8 Jun 2021 07:41:02 GMT (6392kb,D)
[v2] Thu, 17 Jun 2021 01:25:06 GMT (5688kb,D)
[v3] Sun, 25 Jul 2021 07:39:31 GMT (5694kb,D)
[v4] Wed, 28 Jul 2021 18:49:08 GMT (5694kb,D)
[v5] Fri, 6 Aug 2021 05:49:11 GMT (5694kb,D)
[v6] Wed, 6 Apr 2022 08:39:52 GMT (5115kb,D)
[v7] Fri, 24 Jun 2022 01:36:12 GMT (5579kb,D)
