Title: Contrastive Representation Learning for 3D Protein Structures

Abstract: Learning from 3D protein structures has gained wide interest in protein modeling and structural bioinformatics. Unfortunately, the number of available structures is orders of magnitude smaller than the training set sizes commonly used in computer vision and machine learning. This number shrinks even further when only annotated protein structures can be considered, which makes training existing models difficult and prone to over-fitting. To address this challenge, we introduce a new representation learning framework for 3D protein structures. Our framework uses unsupervised contrastive learning to learn meaningful representations of protein structures, making use of proteins from the Protein Data Bank. We show how these representations can be used to solve a wide variety of tasks, such as protein function prediction, protein fold classification, structural similarity prediction, and protein-ligand binding affinity prediction. Moreover, we show how fine-tuned networks, pre-trained with our algorithm, lead to significantly improved task performance, achieving new state-of-the-art results in many tasks.
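
The abstract does not specify the contrastive loss, the encoder, or how the two views of a protein are constructed. As a rough illustration of the general idea only, the sketch below implements a generic InfoNCE-style objective on toy embeddings: two views of the same protein are pulled together, and views of different proteins are pushed apart. The function names (`cosine`, `info_nce_loss`), the temperature value, and the 2D embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(views_a, views_b, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the paper's exact loss).

    views_a[i] and views_b[i] are embeddings of two views of the same
    protein i (the positive pair); views_b[j] with j != i act as negatives.
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view as the correct class.
        total += log_sum_exp - logits[i]
    return total / n


if __name__ == "__main__":
    # Toy embeddings for two hypothetical proteins.
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]     # positives line up with anchors
    mismatched = [[0.0, 1.0], [1.0, 0.0]]  # positives are swapped
    print(info_nce_loss(anchors, matched))
    print(info_nce_loss(anchors, mismatched))
```

In this toy setting the loss is close to zero when each anchor's matching view is its nearest neighbor, and large when the views are swapped, which is the signal that would drive an encoder toward useful representations during pre-training.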
Subjects: Biomolecules (q-bio.BM); Machine Learning (cs.LG)
Cite as: arXiv:2205.15675 [q-bio.BM]
  (or arXiv:2205.15675v1 [q-bio.BM] for this version)

Submission history

From: Pedro Hermosilla Casajus
[v1] Tue, 31 May 2022 10:33:06 GMT (589kb,D)
