Title: Recent advances in the self-referencing embedded strings (SELFIES) library

Abstract: String-based molecular representations play a crucial role in cheminformatics applications, and with the growing success of deep learning in chemistry, have been readily adopted into machine learning pipelines. However, traditional string-based representations such as SMILES are often prone to syntactic and semantic errors when produced by generative models. To address these problems, a novel representation, SELF-referencIng Embedded Strings (SELFIES), was proposed that is inherently 100% robust, alongside an accompanying open-source implementation. Since then, we have generalized SELFIES to support a wider range of molecules and semantic constraints and streamlined its underlying grammar. We have implemented this updated representation in subsequent versions of the selfies library, where we have also made major advances with respect to design, efficiency, and supported features. Hence, we present the current status of the selfies library (version 2.1.1) in this manuscript.
Comments: 11 pages, 2 figures
Subjects: Chemical Physics (physics.chem-ph); Machine Learning (cs.LG)
Journal reference: Digital Discovery 2, 897 (2023)
DOI: 10.1039/D3DD00044C
Cite as: arXiv:2302.03620 [physics.chem-ph]
  (or arXiv:2302.03620v1 [physics.chem-ph] for this version)

Submission history

From: Mario Krenn
[v1] Tue, 7 Feb 2023 17:24:08 GMT (184kb,D)
