
Title: How Much Chemistry Does a Deep Neural Network Need to Know to Make Accurate Predictions?

Abstract: In the last few years, we have seen the rise of deep learning applications in a broad range of chemistry research problems. Recently, we reported on the development of Chemception, a deep convolutional neural network (CNN) architecture for general-purpose small molecule property prediction. In this work, we investigate the effects of systematically removing and adding basic chemical information to the image channels of the 2D images used to train Chemception. By augmenting the images with only three additional pieces of basic chemical information, we demonstrate that Chemception now outperforms both contemporary deep learning models trained on more sophisticated chemical representations (molecular fingerprints) for the prediction of toxicity, activity, and solvation free energy, and physics-based free energy simulation methods. Thus, our work demonstrates that a firm grasp of first-principles chemical knowledge is not a prerequisite for deep learning models to accurately predict chemical properties. Lastly, by altering the chemical information content of the images and examining the resulting performance of Chemception, we identify two different learning patterns, one for predicting toxicity/activity and another for predicting solvation free energy. These patterns suggest that Chemception is learning about its tasks in a manner that is consistent with established knowledge.
Comments: Submitted to a peer-reviewed chemistry journal
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:1710.02238 [stat.ML]
  (or arXiv:1710.02238v1 [stat.ML] for this version)

Submission history

From: Garrett Goh
[v1] Thu, 5 Oct 2017 23:53:59 GMT (1020kb)
[v2] Sun, 18 Mar 2018 14:03:12 GMT (251kb,D)
