Computer Science > Machine Learning
Title: Double descent in the condition number
(Submitted on 12 Dec 2019 (v1), last revised 28 Apr 2020 (this version, v3))
Abstract: In solving a system of $n$ linear equations in $d$ variables, $Ax=b$, the condition number of the $n \times d$ matrix $A$ measures how much errors in the data $b$ affect the solution $x$. Estimates of this type are important in many inverse problems. An example is machine learning, where the key task is to estimate an underlying function from a set of measurements at random points in a high-dimensional space, and where low sensitivity to errors in the data is a requirement for good predictive performance. Here we discuss a simple observation, which is known but surprisingly little quoted (see Theorem 4.2 in \cite{Brgisser:2013:CGN:2526261}): when the columns of $A$ are random vectors, the condition number of $A$ is highest if $d=n$, that is, when the inverse of $A$ exists. An overdetermined system ($n>d$) as well as an underdetermined system ($n<d$), for which the pseudoinverse must be used instead of the inverse, typically have significantly better, that is, lower, condition numbers. Thus the condition number of $A$ plotted as a function of $d$ shows a double descent behavior with a peak at $d=n$.
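The peak at $d=n$ can be checked numerically. The following is a minimal sketch, not the authors' experiment: it assumes i.i.d. standard Gaussian entries for $A$ (the abstract only says the columns are random vectors), fixes $n=50$ arbitrarily, and takes the condition number to be the ratio of the largest to the smallest singular value, which is the definition consistent with using the pseudoinverse for rectangular $A$. The helper name `median_condition_number` is hypothetical.

```python
import numpy as np


def median_condition_number(n, d, trials=20, seed=0):
    """Median 2-norm condition number of random n x d Gaussian matrices.

    Assumption: entries are i.i.d. standard normal. The condition number is
    sigma_max / sigma_min over the min(n, d) singular values, i.e. the
    product of the norms of A and of its pseudoinverse.
    """
    rng = np.random.default_rng(seed)
    conds = []
    for _ in range(trials):
        A = rng.standard_normal((n, d))
        # Singular values are returned in descending order.
        s = np.linalg.svd(A, compute_uv=False)
        conds.append(s[0] / s[-1])
    return float(np.median(conds))


n = 50  # number of equations (arbitrary choice for illustration)
dims = [10, 25, 40, 50, 60, 100, 200]  # number of variables d to sweep
results = {d: median_condition_number(n, d) for d in dims}

for d, kappa in results.items():
    print(f"n={n:3d}  d={d:3d}  median condition number = {kappa:10.2f}")
```

Under these assumptions the printed values should rise as $d$ approaches $n$ from either side and be largest at $d=n$. The median over several trials is used because the condition number of a square random matrix is heavy-tailed, so a mean would be dominated by a few extreme draws.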
Submission history
From: Andrzej Banburski
[v1] Thu, 12 Dec 2019 20:16:11 GMT (608kb,D)
[v2] Mon, 10 Feb 2020 14:46:25 GMT (592kb,D)
[v3] Tue, 28 Apr 2020 04:29:37 GMT (679kb,D)