Exact Gaussian Processes on a Million Data Points

Wang, Ke Alexander; Pleiss, Geoff; Gardner, Jacob R.; Tyree, Stephen; Weinberger, Kilian Q.; Wilson, Andrew Gordon

(Submitted on 19 Mar 2019 (v1), last revised 10 Dec 2019 (this version, v2))

Abstract: Gaussian processes (GPs) are flexible non-parametric models, with a capacity that grows with the available data. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. By partitioning and distributing kernel matrix multiplies, we demonstrate that an exact GP can be trained on over a million points, a task previously thought to be impossible with current computing hardware, in less than 2 hours. Moreover, our approach is generally applicable, without constraints to grid data or specific kernel classes. Enabled by this scalability, we perform the first-ever comparison of exact GPs against scalable GP approximations on datasets with $10^4 \!-\! 10^6$ data points, showing dramatic performance improvements.
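The core idea in the abstract, solving the GP linear system with conjugate gradients while touching the kernel matrix only through partitioned matrix-vector products, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the RBF kernel, the noise level, the block size, and all function names below are assumptions chosen for illustration, and the loop over row blocks merely stands in for the multi-GPU distribution the abstract describes.

```python
import numpy as np


def rbf_kernel_block(X_rows, X_all, lengthscale=1.0):
    """RBF kernel between a block of rows and all points (kernel choice is an assumption)."""
    sq_dists = (
        np.sum(X_rows ** 2, axis=1)[:, None]
        + np.sum(X_all ** 2, axis=1)[None, :]
        - 2.0 * X_rows @ X_all.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale ** 2)


def partitioned_kernel_matvec(X, v, noise=0.5, block_size=64, lengthscale=1.0):
    """Compute (K + noise * I) @ v one row block at a time, never storing the full K."""
    out = np.empty_like(v)
    n = X.shape[0]
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Only a (block_size x n) slice of K exists in memory at any moment.
        K_block = rbf_kernel_block(X[start:stop], X, lengthscale)
        out[start:stop] = K_block @ v
    return out + noise * v


def conjugate_gradients(matvec, b, tol=1e-8, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A @ v."""
    x = np.zeros_like(b)
    r = b.copy()  # residual b - A @ x for the initial guess x = 0
    p = r.copy()
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


# Usage: solve (K + noise * I) x = y on synthetic data (sizes are illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = rng.normal(size=300)
x_cg = conjugate_gradients(lambda v: partitioned_kernel_matvec(X, v), y)
```

Because each row block's product is independent of the others, the blocks could in principle be dispatched to separate devices and the partial results concatenated. The sketch runs them sequentially to stay self-contained.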

Comments:	Published at NeurIPS 2019
Subjects:	Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (stat.ML)
Cite as:	arXiv:1903.08114 [cs.LG]
	(or arXiv:1903.08114v2 [cs.LG] for this version)

Submission history

From: Ke Alexander Wang
[v1] Tue, 19 Mar 2019 17:10:28 GMT (431kb,D)
[v2] Tue, 10 Dec 2019 18:44:52 GMT (731kb,D)
