Title: Data-driven confidence bands for distributed nonparametric regression

Abstract: Gaussian Process Regression and Kernel Ridge Regression are popular nonparametric regression approaches. Unfortunately, their high computational complexity renders them inapplicable to modern massive datasets. To address this, a number of approximations have been suggested, some of which allow for a distributed implementation. One of them is the divide and conquer approach, which splits the data into a number of partitions, obtains local estimates on each, and finally averages them. In this paper we suggest a novel, computationally efficient, fully data-driven algorithm that quantifies the uncertainty of this method and yields frequentist $L_2$-confidence bands. We rigorously demonstrate the validity of the algorithm. Another contribution of the paper is a minimax-optimal high-probability bound for the averaged estimator, complementing and generalizing the known risk bounds.
Comments: COLT 2020 (to appear)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as: arXiv:1912.06689 [stat.ML]
  (or arXiv:1912.06689v2 [stat.ML] for this version)
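
The divide and conquer approach described in the abstract (split the data into partitions, fit a local estimate on each, average the results) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not taken from the paper: the Gaussian kernel, the bandwidth, the ridge parameter, the synthetic data, and the function names `rbf_kernel`, `krr_fit_predict`, and `divide_and_conquer_krr`. The sketch shows only the averaged Kernel Ridge Regression estimator; it does not implement the paper's confidence-band algorithm.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=0.2):
    # Gaussian (RBF) kernel matrix between 1-D input arrays a and b.
    # The kernel choice and bandwidth are illustrative assumptions.
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / bandwidth) ** 2)


def krr_fit_predict(x_train, y_train, x_test, ridge=1e-3):
    # Kernel ridge regression: solve (K + n * ridge * I) alpha = y,
    # then evaluate the fitted function at x_test.
    n = x_train.shape[0]
    gram = rbf_kernel(x_train, x_train)
    alpha = np.linalg.solve(gram + n * ridge * np.eye(n), y_train)
    return rbf_kernel(x_test, x_train) @ alpha


def divide_and_conquer_krr(x, y, x_test, n_partitions, ridge=1e-3, seed=0):
    # Randomly split the sample into partitions, fit KRR on each partition,
    # and average the local predictions.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(x.shape[0])
    parts = np.array_split(perm, n_partitions)
    local = [krr_fit_predict(x[idx], y[idx], x_test, ridge) for idx in parts]
    return np.mean(local, axis=0)


# Synthetic example (hypothetical data): noisy observations of a sine curve.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=400)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(400)
x_test = np.linspace(0.05, 0.95, 50)
y_hat = divide_and_conquer_krr(x, y, x_test, n_partitions=4)
```

Each local fit solves a linear system of size n/m rather than n (with m partitions), which is the source of the computational savings, and the local fits are independent of one another, so they can run on separate machines.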

Submission history

From: Valeriy Avanesov
[v1] Fri, 13 Dec 2019 20:13:55 GMT (82kb)
[v2] Mon, 8 Jun 2020 18:17:00 GMT (2932kb,D)
