Title: Fast and stable multivariate kernel density estimation by fast sum updating

Abstract: Kernel density estimation and kernel regression are powerful but computationally expensive techniques: a direct evaluation of kernel density estimates at $M$ evaluation points given $N$ input sample points requires $\mathcal{O}(MN)$ operations, which is prohibitive for large-scale problems. For this reason, approximate methods such as binning with the Fast Fourier Transform or the Fast Gauss Transform have been proposed to speed up kernel density estimation. Among these fast methods, the Fast Sum Updating approach is an attractive alternative, as it is an exact method and its speed is independent of the input sample and the bandwidth. Unfortunately, this method, based on data sorting, has for the most part been limited to the univariate case. In this paper, we revisit the fast sum updating approach and extend it in several ways. Our main contribution is to extend it to the general multivariate case for general input data and a rectilinear evaluation grid. Other contributions include its extension to a wider class of kernels, including the triangular, cosine and Silverman kernels, its combination with parsimonious additive multivariate kernels, and its combination with a fast approximate k-nearest-neighbors bandwidth for multivariate datasets. Our numerical tests of multivariate regression and density estimation confirm the speed, accuracy and stability of the method. We hope this paper will renew interest in the fast sum updating approach and help solve large-scale practical density estimation and regression problems.
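To make the contrast between direct evaluation and sum updating concrete, here is a minimal univariate sketch for the Epanechnikov kernel $K(u) = \tfrac{3}{4}(1-u^2)\,\mathbb{1}\{|u| \le 1\}$. It is an illustration under stated assumptions, not the paper's implementation: the function names `kde_direct` and `kde_fast_sum_updating`, the two-pointer sweep, and the fixed bandwidth `h` are all hypothetical choices. The sketch relies on the expansion $\sum_i \left(1 - (x_i - z)^2/h^2\right) = S_0 (1 - z^2/h^2) + 2 z S_1 / h^2 - S_2 / h^2$, where $S_k = \sum_i x_i^k$ is taken over the sample points inside the window $[z-h, z+h]$, so that only three running sums need to be updated as the window slides over the sorted data.

```python
import numpy as np


def kde_direct(x, z, h):
    """Direct O(M*N) Epanechnikov kernel density estimate at points z."""
    u = (x[None, :] - z[:, None]) / h
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return k.sum(axis=1) / (len(x) * h)


def kde_fast_sum_updating(x, z, h):
    """Same estimate via sorted data and running power sums.

    Illustrative sketch only: after sorting, one sweep with two pointers
    maintains S0, S1, S2 (count, sum of x, sum of x^2) over the points
    lying in the window [z - h, z + h].
    """
    x = np.sort(x)
    order = np.argsort(z)          # visit evaluation points in increasing order
    n = len(x)
    out = np.empty(len(z))
    lo = hi = 0                    # the window holds x[lo:hi]
    s0 = s1 = s2 = 0.0
    for j in order:
        zj = z[j]
        # Add points that entered the window on the right.
        while hi < n and x[hi] <= zj + h:
            s0 += 1.0
            s1 += x[hi]
            s2 += x[hi] ** 2
            hi += 1
        # Remove points that left the window on the left.
        while lo < hi and x[lo] < zj - h:
            s0 -= 1.0
            s1 -= x[lo]
            s2 -= x[lo] ** 2
            lo += 1
        total = s0 * (1.0 - zj**2 / h**2) + 2.0 * zj * s1 / h**2 - s2 / h**2
        out[j] = 0.75 * total / (n * h)
    return out


rng = np.random.default_rng(0)
sample = rng.normal(size=500)
grid = np.linspace(-3.0, 3.0, 41)
direct = kde_direct(sample, grid, 0.4)
fast = kde_fast_sum_updating(sample, grid, 0.4)
print(np.max(np.abs(direct - fast)))  # difference is at rounding-error level
```

After the two sorts, the sweep touches each sample point at most twice (once entering, once leaving the window), so its cost does not depend on the bandwidth. This naive version adds to and subtracts from the running sums in plain floating point, so rounding errors can accumulate over long sweeps; it makes no attempt at the stability the abstract highlights.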
Comments: 38 pages, 29 figures
Subjects: Computation (stat.CO)
MSC classes: 62G07, 62G08, 65C60
ACM classes: G.3; F.2.1; G.1.0
Journal reference: Journal of Computational and Graphical Statistics 28(3) 596-608 (2019)
DOI: 10.1080/10618600.2018.1549052
Cite as: arXiv:1712.00993 [stat.CO]
  (or arXiv:1712.00993v2 [stat.CO] for this version)

Submission history

From: Nicolas Langrené
[v1] Mon, 4 Dec 2017 10:31:44 GMT (885kb,D)
[v2] Mon, 22 Oct 2018 01:51:13 GMT (915kb,D)
