A toolkit for data-driven discovery of governing equations in high-noise regimes

Delahunt, Charles B.; Kutz, J. Nathan

(Submitted on 8 Nov 2021 (this version), latest version 29 Dec 2021 (v2))

Abstract: We consider the data-driven discovery of governing equations from time-series data in the limit of high noise. The algorithms developed constitute an extensive toolkit of methods for circumventing the deleterious effects of noise in the context of the sparse identification of nonlinear dynamics (SINDy) framework. We offer two primary contributions, both focused on noisy data acquired from a system x' = f(x). First, we propose, for use in high-noise settings, an extensive toolkit of critically enabling extensions for the SINDy regression method, to progressively cull functionals from an over-complete library and yield a set of sparse equations that regress to the derivative x'. These innovations can extract sparse governing equations and coefficients from high-noise time-series data (e.g. 300% added noise). For example, the method discovers the correct sparse libraries in the Lorenz system, with median coefficient estimate errors of 1% - 3% (for 50% noise), 6% - 8% (for 100% noise), and 23% - 25% (for 300% noise). The enabling modules in the toolkit are combined into a single method, but the individual modules can be tactically applied in other equation discovery methods (SINDy or not) to improve results on high-noise data. Second, we propose a technique, applicable to any model discovery method based on x' = f(x), to assess the accuracy of a discovered model in the context of non-unique solutions due to noisy data. Currently, this non-uniqueness can obscure a discovered model's accuracy and thus a discovery method's effectiveness. We describe a technique that uses linear dependencies among functionals to transform a discovered model into an equivalent form that is closest to the true model, enabling more accurate assessment of a discovered model's accuracy.
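To make the idea of culling functionals from an over-complete library concrete, the following is a minimal sketch of the basic SINDy regression (sequentially thresholded least squares) on the Lorenz system. It is not the toolkit proposed in the paper and includes none of its high-noise extensions. The function names (`lorenz`, `simulate`, `library`, `stlsq`), the degree-2 polynomial library, the threshold of 0.5, and the conventional Lorenz parameters (10, 28, 8/3) are all assumptions made for illustration. The sketch uses noise-free states and exact derivatives so that the regression is well posed.

```python
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side f(x) of the Lorenz system (conventional parameters, assumed)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def simulate(f, x0, dt, n_steps):
    """Integrate x' = f(x) with classical fourth-order Runge-Kutta."""
    traj = np.empty((n_steps, len(x0)))
    traj[0] = x0
    for k in range(n_steps - 1):
        s = traj[k]
        k1 = f(s)
        k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2)
        k4 = f(s + dt * k3)
        traj[k + 1] = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj

# Hypothetical over-complete library: all monomials in (x, y, z) up to degree 2.
NAMES = ["1", "x", "y", "z", "x^2", "xy", "xz", "y^2", "yz", "z^2"]

def library(X):
    """Evaluate each library functional on every sample (rows of X)."""
    x, y, z = X.T
    return np.column_stack(
        [np.ones_like(x), x, y, z, x * x, x * y, x * z, y * y, y * z, z * z]
    )

def stlsq(Theta, dXdt, threshold=0.5, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for j in range(dXdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # Refit equation j using only the functionals that survived.
                Xi[keep, j] = np.linalg.lstsq(
                    Theta[:, keep], dXdt[:, j], rcond=None
                )[0]
    return Xi

# Noise-free trajectory and exact derivatives (an idealized setting).
X = simulate(lorenz, np.array([-8.0, 7.0, 27.0]), dt=0.002, n_steps=5000)
dXdt = lorenz(X.T).T
Xi = stlsq(library(X), dXdt)

for j, var in enumerate("xyz"):
    terms = [f"{Xi[i, j]:+.3f} {NAMES[i]}" for i in range(len(NAMES)) if Xi[i, j] != 0.0]
    print(f"{var}' =", " ".join(terms))
```

To experiment with noise, one could add Gaussian noise to `X` and estimate the derivatives with `np.gradient(X, 0.002, axis=0)`. Finite differences amplify measurement noise, so this plain regression should be expected to degrade as the noise level grows.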

Comments:	Body 21 pages. Total length with Appendix 32 pages. 17 Figures, 8 Tables
Subjects:	Machine Learning (cs.LG)
MSC classes:	68T05
ACM classes:	I.2.6; J.2
Cite as:	arXiv:2111.04870 [cs.LG]
	(or arXiv:2111.04870v1 [cs.LG] for this version)

Submission history

From: Charles Delahunt
[v1] Mon, 8 Nov 2021 23:32:11 GMT (3466kb,D)
[v2] Wed, 29 Dec 2021 22:16:23 GMT (3466kb,D)
