Divide-and-conquer methods for big data analysis

Chen, Xueying; Cheng, Jerry Q.; Xie, Min-ge

(Submitted on 22 Feb 2021)

Abstract: In the context of big data analysis, the divide-and-conquer methodology refers to a multiple-step process: first splitting a data set into several smaller ones, then analyzing each set separately, and finally combining the results from each analysis. This approach is effective for handling large data sets that cannot be analyzed in their entirety on a single computer because of limits on either memory or computational time. The combined results provide statistical inference similar to that obtained from analyzing the entire data set. This article reviews some recent developments of divide-and-conquer methods in a variety of settings, including combining based on parametric, semiparametric and nonparametric models and online sequential updating methods, among others. Theoretical developments on the efficiency of the divide-and-conquer methods are also discussed.
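
The three steps named in the abstract (split, analyze separately, combine) can be illustrated with a minimal sketch. It is not taken from the paper: the function names `split`, `analyze_block`, and `combine` are hypothetical, and the estimand (a mean with its standard error) is chosen only because block summaries combine exactly in that case. For most models the combined estimate is only approximately equal to the full-data one.

```python
import math
import random
import statistics

def split(data, n_blocks):
    """Step 1: split the data into contiguous blocks of near-equal size."""
    size = math.ceil(len(data) / n_blocks)
    return [data[i:i + size] for i in range(0, len(data), size)]

def analyze_block(block):
    """Step 2: reduce one block to (count, sum, sum of squares)."""
    return len(block), sum(block), sum(x * x for x in block)

def combine(summaries):
    """Step 3: merge block summaries into a mean and its standard error."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # sample variance
    return mean, math.sqrt(variance / n)

# Simulated data (an assumption for illustration only).
random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(10_000)]

# Divide-and-conquer estimate: each block could be processed on a separate machine.
mean_dc, se_dc = combine([analyze_block(b) for b in split(data, 8)])

# Full-data estimate for comparison.
mean_full = statistics.fmean(data)
se_full = statistics.stdev(data) / math.sqrt(len(data))
```

Only the three numbers per block need to be held in memory or sent between machines, which is what makes the approach usable when the full data set does not fit on one computer.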

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2102.10771 [stat.ML]
	(or arXiv:2102.10771v1 [stat.ML] for this version)

Submission history

From: Xueying Chen
[v1] Mon, 22 Feb 2021 04:40:55 GMT (231kb)
