Improved architectures and training algorithms for deep operator networks

Wang, Sifan; Wang, Hanwen; Perdikaris, Paris

(Submitted on 4 Oct 2021 (v1), last revised 11 Oct 2021 (this version, v2))

Abstract: Operator learning techniques have recently emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces. Trained under appropriate constraints, they can also be effective in learning the solution operator of partial differential equations (PDEs) in an entirely self-supervised manner. In this work we analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel (NTK) theory, and reveal a bias that favors the approximation of functions with larger magnitudes. To correct this bias we propose to adaptively re-weight the importance of each training example, and demonstrate how this procedure can effectively balance the magnitude of back-propagated gradients during training via gradient descent. We also propose a novel network architecture that is more resilient to vanishing gradient pathologies. Taken together, our developments provide new insights into the training of DeepONets and consistently improve their predictive accuracy by a factor of 10-50x, demonstrated in the challenging setting of learning PDE solution operators in the absence of paired input-output observations. All code and data accompanying this manuscript are publicly available at this https URL.
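The re-weighting idea in the abstract can be illustrated with a deliberately small toy, which is not the authors' algorithm and not a DeepONet. The sketch below assumes a linear model f(x) = w . x, for which the diagonal NTK entry of example k is simply the squared norm of x_k, and it assumes a hypothetical weighting rule lambda_k = (max_j K_jj / K_kk) ** alpha. The function names, the rule, and the data are all illustrative assumptions.

```python
# Toy illustration only: a linear model f(x) = w . x, not a DeepONet.
# The weighting rule below is an assumption made for this sketch.

def ntk_diagonal(xs):
    """Diagonal NTK entries K_kk = ||grad_w f(x_k)||^2 = ||x_k||^2."""
    return [sum(v * v for v in x) for x in xs]

def adaptive_weights(diag, alpha=1.0):
    """Hypothetical rule: lambda_k = (max_j K_jj / K_kk) ** alpha."""
    k_max = max(diag)
    return [(k_max / k) ** alpha for k in diag]

def weighted_gradient_terms(w, xs, ys, weights):
    """Per-example gradients of lambda_k * (f(x_k) - y_k)^2 / 2 w.r.t. w."""
    terms = []
    for x, y, lam in zip(xs, ys, weights):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        terms.append([lam * residual * xi for xi in x])
    return terms

def norm(v):
    """Euclidean norm of a list of floats."""
    return sum(c * c for c in v) ** 0.5

# One large-magnitude example and one small-magnitude example.
xs = [[10.0, 0.0], [0.0, 0.1]]
ys = [10.0, 0.1]          # targets consistent with w_true = [1, 1]
w = [0.0, 0.0]            # initial parameters

diag = ntk_diagonal(xs)
uniform = weighted_gradient_terms(w, xs, ys, [1.0, 1.0])
balanced = weighted_gradient_terms(w, xs, ys, adaptive_weights(diag))

uniform_norms = [norm(g) for g in uniform]
balanced_norms = [norm(g) for g in balanced]
print("uniform :", uniform_norms)
print("balanced:", balanced_norms)
```

With uniform weights, the large-magnitude example dominates the gradient by several orders of magnitude. With the assumed weights, the two per-example gradient norms come out approximately equal in this toy case. How the paper actually defines and updates its weights is described in the manuscript itself.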

Comments:	40 pages, 27 figures, 11 tables
Subjects:	Machine Learning (cs.LG); Numerical Analysis (math.NA); Computational Physics (physics.comp-ph); Machine Learning (stat.ML)
Cite as:	arXiv:2110.01654 [cs.LG]
	(or arXiv:2110.01654v2 [cs.LG] for this version)

Submission history

From: Sifan Wang [view email]
[v1] Mon, 4 Oct 2021 18:34:41 GMT (6043kb,D)
[v2] Mon, 11 Oct 2021 21:40:57 GMT (6042kb,D)
