Title: Predicting What You Already Know Helps: Provable Self-Supervised Learning

Abstract: Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks), which require no labeled data, in order to learn useful semantic representations. These pretext tasks are created solely from the input features, for example predicting a missing image patch, recovering the color channels of an image from context, or predicting missing words in text; yet predicting this known information helps in learning representations that are effective for downstream prediction tasks. We posit a mechanism that exploits the statistical connections between certain reconstruction-based pretext tasks that are guaranteed to learn a good representation. Formally, we quantify how approximate independence between the components of the pretext task (conditional on the label and latent variables) allows us to learn representations that solve the downstream task by training only a linear layer on top of the learned representation. We prove that this linear layer yields small approximation error even for a complex ground-truth function class, and that it drastically reduces labeled sample complexity. Next, we show that a simple modification of our method leads to nonlinear CCA, analogous to the popular SimSiam algorithm, and we establish similar guarantees for nonlinear CCA.
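The mechanism in the abstract can be illustrated with a small synthetic sketch. It is not the authors' code or experimental setup: the data model, the sizes (k, d1, d2), and the function names sample, fit_pretext and linear_probe are all assumptions made for illustration. The input is split into a categorical part X1 and a real-valued part X2 that are drawn independently once the label Y is fixed. The pretext task regresses X2 on X1 using unlabeled pairs, which gives the representation psi(x1), an estimate of E[X2 | X1 = x1]. A linear layer is then fit on psi with only 30 labels and compared with a linear layer fit on the raw one-hot X1.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d1, d2 = 3, 300, 5   # classes, X1 categories, X2 dimension (illustrative sizes)

# Assumed data model: category c of X1 is 10x more likely under class c % k.
owner = np.arange(d1) % k
p_x1_given_y = np.where(owner[None, :] == np.arange(k)[:, None], 1.0, 0.1)
p_x1_given_y /= p_x1_given_y.sum(axis=1, keepdims=True)
mu2 = 3.0 * rng.standard_normal((k, d2))   # class means of X2

def sample(n):
    """Draw (x1, x2, y); x1 and x2 are independent once y is fixed."""
    y = rng.integers(k, size=n)
    x1 = np.array([rng.choice(d1, p=p_x1_given_y[c]) for c in y])
    x2 = mu2[y] + rng.standard_normal((n, d2))
    return x1, x2, y

def fit_pretext(x1, x2):
    """Pretext task: least-squares regression of x2 on one-hot x1.
    For one-hot inputs the solution is the per-category mean of x2."""
    sums = np.zeros((d1, d2))
    np.add.at(sums, x1, x2)
    counts = np.bincount(x1, minlength=d1)
    return sums / np.maximum(counts, 1)[:, None]   # row c estimates E[X2 | X1 = c]

def linear_probe(feat_train, y_train, feat_test):
    """Fit a linear layer (with bias) onto one-hot labels and predict by argmax."""
    add_bias = lambda f: np.hstack([f, np.ones((len(f), 1))])
    w, *_ = np.linalg.lstsq(add_bias(feat_train), np.eye(k)[y_train], rcond=None)
    return (add_bias(feat_test) @ w).argmax(axis=1)

x1_u, x2_u, _ = sample(20000)   # unlabeled pairs for the pretext task
psi = fit_pretext(x1_u, x2_u)   # learned representation, one row per category
x1_l, _, y_l = sample(30)       # small labeled set for the downstream task
x1_t, _, y_t = sample(5000)     # held-out test set

acc_ssl = (linear_probe(psi[x1_l], y_l, psi[x1_t]) == y_t).mean()
onehot = np.eye(d1)
acc_raw = (linear_probe(onehot[x1_l], y_l, onehot[x1_t]) == y_t).mean()
print(f"linear probe on psi(x1): {acc_ssl:.3f}")
print(f"linear probe on raw x1:  {acc_raw:.3f}")
```

In this toy model, conditional independence gives E[X2 | X1] = sum over y of P(y | X1) * mu2[y], so psi is a linear image of the label posterior and a linear layer on psi can recover it whenever mu2 has full row rank. The raw one-hot input needs a weight for each of the 300 categories, most of which never appear among 30 labels, which is why the probe on psi is expected to do much better here.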
Comments: NeurIPS 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2008.01064 [cs.LG]
  (or arXiv:2008.01064v2 [cs.LG] for this version)

Submission history

From: Qi Lei
[v1] Mon, 3 Aug 2020 17:56:13 GMT (1089kb,D)
[v2] Sun, 14 Nov 2021 04:26:31 GMT (1815kb,D)