Learning Curves for Analysis of Deep Networks

Hoiem, Derek; Gupta, Tanmay; Li, Zhizhong; Shlapentokh-Rothman, Michal M.

(Submitted on 21 Oct 2020 (v1), last revised 5 Apr 2021 (this version, v2))

Abstract: Learning curves model a classifier's test error as a function of the number of training samples. Prior works show that learning curves can be used to select model parameters and extrapolate performance. We investigate how to use learning curves to evaluate design choices, such as pretraining, architecture, and data augmentation. We propose a method to robustly estimate learning curves, abstract their parameters into error and data-reliance, and evaluate the effectiveness of different parameterizations. Our experiments exemplify use of learning curves for analysis and yield several interesting observations.
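As a rough illustration of what modeling test error as a function of training-set size can look like, the sketch below fits a simple curve to (size, error) pairs. The abstract does not state a functional form, so the power-law model err(n) = alpha + eta * n^gamma, the fixed exponent gamma = -0.5, the function names, and the synthetic data are all assumptions made for this example, not the authors' method.

```python
# Sketch only: fit err(n) = alpha + eta * n**gamma with gamma held fixed.
# The functional form and gamma = -0.5 are assumptions for illustration.
# With gamma fixed, the model is linear in x = n**gamma, so ordinary
# least squares has a closed-form solution.

def fit_learning_curve(sizes, errors, gamma=-0.5):
    """Return (alpha, eta) minimizing squared error of alpha + eta * n**gamma."""
    if len(sizes) != len(errors) or len(sizes) < 2:
        raise ValueError("need at least two (size, error) pairs")
    xs = [n ** gamma for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(errors) / len(errors)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("training-set sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, errors))
    eta = sxy / sxx                    # slope with respect to n**gamma
    alpha = mean_y - eta * mean_x      # error approached as n grows
    return alpha, eta


def predict_error(n, alpha, eta, gamma=-0.5):
    """Extrapolate the fitted curve to a training-set size n."""
    return alpha + eta * n ** gamma


# Synthetic (made-up) measurements generated from alpha = 0.10, eta = 2.0.
sizes = [100, 400, 1600, 6400]
errors = [0.10 + 2.0 * n ** -0.5 for n in sizes]

alpha, eta = fit_learning_curve(sizes, errors)
extrapolated = predict_error(25600, alpha, eta)
```

Holding gamma fixed keeps the fit linear and stable with few measurements; fitting gamma as well would need a nonlinear optimizer and more data points.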

Comments:	Improved text and figure organization, additional experiments on optimization
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:2010.11029 [cs.LG]
	(or arXiv:2010.11029v2 [cs.LG] for this version)

Submission history

From: Derek Hoiem
[v1] Wed, 21 Oct 2020 14:20:05 GMT (6043kb,D)
[v2] Mon, 5 Apr 2021 17:01:02 GMT (10589kb,D)
